This is the default R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
When you click the Knit button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
For our class we will always, Knit to HTML!!!
For more assistance with RMarkdown, see Chapter 21 in R for Data Science and the RMarkdown cheat sheet at https://www.rstudio.com/wp-content/uploads/2016/03/rmarkdown-cheatsheet-2.0.pdf, which link is also found on the course website.
msleep #Prints the data but takes up a lot of space
## # A tibble: 83 × 11
## name genus vore order conse…¹ sleep…² sleep…³ sleep…⁴ awake brainwt
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Cheetah Acin… carni Carn… lc 12.1 NA NA 11.9 NA
## 2 Owl monkey Aotus omni Prim… <NA> 17 1.8 NA 7 0.0155
## 3 Mountain be… Aplo… herbi Rode… nt 14.4 2.4 NA 9.6 NA
## 4 Greater sho… Blar… omni Sori… lc 14.9 2.3 0.133 9.1 0.00029
## 5 Cow Bos herbi Arti… domest… 4 0.7 0.667 20 0.423
## 6 Three-toed … Brad… herbi Pilo… <NA> 14.4 2.2 0.767 9.6 NA
## 7 Northern fu… Call… carni Carn… vu 8.7 1.4 0.383 15.3 NA
## 8 Vesper mouse Calo… <NA> Rode… <NA> 7 NA NA 17 NA
## 9 Dog Canis carni Carn… domest… 10.1 2.9 0.333 13.9 0.07
## 10 Roe deer Capr… herbi Arti… lc 3 NA NA 21 0.0982
## # … with 73 more rows, 1 more variable: bodywt <dbl>, and abbreviated variable
## # names ¹conservation, ²sleep_total, ³sleep_rem, ⁴sleep_cycle
head(msleep,5) #Prints the first 5 rows
## # A tibble: 5 × 11
## name genus vore order conse…¹ sleep…² sleep…³ sleep…⁴ awake brainwt bodywt
## <chr> <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Chee… Acin… carni Carn… lc 12.1 NA NA 11.9 NA 50
## 2 Owl … Aotus omni Prim… <NA> 17 1.8 NA 7 0.0155 0.48
## 3 Moun… Aplo… herbi Rode… nt 14.4 2.4 NA 9.6 NA 1.35
## 4 Grea… Blar… omni Sori… lc 14.9 2.3 0.133 9.1 0.00029 0.019
## 5 Cow Bos herbi Arti… domest… 4 0.7 0.667 20 0.423 600
## # … with abbreviated variable names ¹conservation, ²sleep_total, ³sleep_rem,
## # ⁴sleep_cycle
str(msleep) #Lists all variables and the type of variable
## tibble [83 × 11] (S3: tbl_df/tbl/data.frame)
## $ name : chr [1:83] "Cheetah" "Owl monkey" "Mountain beaver" "Greater short-tailed shrew" ...
## $ genus : chr [1:83] "Acinonyx" "Aotus" "Aplodontia" "Blarina" ...
## $ vore : chr [1:83] "carni" "omni" "herbi" "omni" ...
## $ order : chr [1:83] "Carnivora" "Primates" "Rodentia" "Soricomorpha" ...
## $ conservation: chr [1:83] "lc" NA "nt" "lc" ...
## $ sleep_total : num [1:83] 12.1 17 14.4 14.9 4 14.4 8.7 7 10.1 3 ...
## $ sleep_rem : num [1:83] NA 1.8 2.4 2.3 0.7 2.2 1.4 NA 2.9 NA ...
## $ sleep_cycle : num [1:83] NA NA NA 0.133 0.667 ...
## $ awake : num [1:83] 11.9 7 9.6 9.1 20 9.6 15.3 17 13.9 21 ...
## $ brainwt : num [1:83] NA 0.0155 NA 0.00029 0.423 NA NA NA 0.07 0.0982 ...
## $ bodywt : num [1:83] 50 0.48 1.35 0.019 600 ...
summary(msleep) #Provides summary statistics for all variables in dataset
## name genus vore order
## Length:83 Length:83 Length:83 Length:83
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## conservation sleep_total sleep_rem sleep_cycle
## Length:83 Min. : 1.90 Min. :0.100 Min. :0.1167
## Class :character 1st Qu.: 7.85 1st Qu.:0.900 1st Qu.:0.1833
## Mode :character Median :10.10 Median :1.500 Median :0.3333
## Mean :10.43 Mean :1.875 Mean :0.4396
## 3rd Qu.:13.75 3rd Qu.:2.400 3rd Qu.:0.5792
## Max. :19.90 Max. :6.600 Max. :1.5000
## NA's :22 NA's :51
## awake brainwt bodywt
## Min. : 4.10 Min. :0.00014 Min. : 0.005
## 1st Qu.:10.25 1st Qu.:0.00290 1st Qu.: 0.174
## Median :13.90 Median :0.01240 Median : 1.670
## Mean :13.57 Mean :0.28158 Mean : 166.136
## 3rd Qu.:16.15 3rd Qu.:0.12550 3rd Qu.: 41.750
## Max. :22.10 Max. :5.71200 Max. :6654.000
## NA's :27
summary(msleep$awake) #Provides summary statistics for the awake variable in dataset msleep
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.10 10.25 13.90 13.57 16.15 22.10
dim(msleep) #Outputs a Vector Giving the Number of Rows and Columns
## [1] 83 11
unique(msleep$vore) #Lists all the unique values for a categorical variable Animals are Classified as Carnivore Omnivore, Herbivore, or Insectivore: "NA" references a missing response
## [1] "carni" "omni" "herbi" NA "insecti"
which(is.na(msleep$vore)) #Returns the Observation index where missing values exist
## [1] 8 55 57 58 63 69 73
msleep2=msleep[-which(is.na(msleep$vore)),] #Removes the 7 Observations that are missing a vore-specification
In this dataset, there are 83 observations and 11 variables.
ggplot(data=msleep2) +
geom_bar(aes(x=vore))
ggplot(data=msleep2) +
geom_bar(aes(x=vore),color="dimgrey",fill="deepskyblue1",size=2) +
xlab("Type of Vore") + ylab("Frequency") +
theme_classic()
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
ggplot(data=msleep2) +
geom_histogram(mapping=aes(x=sleep_total),bins=15,fill="deepskyblue1") +
geom_histogram(mapping=aes(x=sleep_rem),bins=15,fill="white",alpha=0.5) +
labs(x="Sleep Total",y="Frequency",title="Overlayed Histograms") + theme_dark()
## Warning: Removed 20 rows containing non-finite values (`stat_bin()`).
#Warning due to NA
ggplot(data=msleep2) +
geom_boxplot(aes(x=vore,y=awake),fill=c("red","blue","green","purple")) +
xlab("Type of Vore") + ylab("Time Awake (Hrs)") +
theme_light()+ggtitle("Stratified Boxplots") +
scale_x_discrete(labels=c("Carnivore","Herbivore","Insectivore","Omnivore"))
ggplot(data=msleep2) +
geom_boxplot(aes(x=vore,y=awake,color=conservation)) +
xlab("Type of Vore") + ylab("Time Awake (Hrs)") +
theme_light()+ggtitle("Stratified Stratified Boxplots") +
scale_x_discrete(limits=c("carni","herbi"),labels=c("Carnivore","Herbivore")) +
guides(color=guide_legend(title="Conservation \nStatus")) +
theme_classic()
## Warning: Removed 25 rows containing missing values (`stat_boxplot()`).
ggplot(data=msleep2) +
geom_boxplot(aes(x=vore,y=awake)) +
facet_wrap(conservation~.) +
xlab("Type of Vore") + ylab("Time Awake (Hrs)") +
theme_light()+ggtitle("Separated Stratified Boxplots") +
scale_x_discrete(limits=c("carni","herbi"),labels=c("Carnivore","Herbivore")) +
theme_test()
## Warning: Removed 25 rows containing missing values (`stat_boxplot()`).
ggplot(data=msleep2,aes(x=vore,y=conservation)) +
geom_tile(aes(fill=sleep_total)) +
scale_fill_gradient(low="deepskyblue1",high="white")+
theme_classic() +
scale_x_discrete(label=c("Carnivore","Herbivore","Insectivore","Omnivore")) +
theme(axis.text.x=element_text(angle=45,vjust=0.5))+
xlab("")+ylab("") +
ggtitle("Total Sleep for Combinations of Conservation Status and Diet")
The next example can be found at https://ggplot2.tidyverse.org/reference/scale_brewer.html.
These examples are based on the classic Old Faithful data set. The data
set provides the joint probability distribution of waiting time between
eruptions and the duration of the eruptions. The original data set
faithful
contains sample data from monitoring the famous
geyser Old Faithful. The data set faithfuld
from
ggplot2 provides emperical joint density estimates for
relationship between these two variables.
#First Notice from Original Old Faithful Data Sets
ggplot(faithful) +
geom_point(aes(x=waiting,y=eruptions),col="black")
#Now we Construct a Heatmap Showing the
v <- ggplot(faithfuld) +
geom_tile(aes(waiting, eruptions, fill = density))
v
v2=v + scale_fill_distiller()
v2
v3=v+scale_fill_distiller(palette = "Spectral")
v3
v4=v3 + xlab("Time Between Eruptions (mins)") + ylab("Duration of Eruptions (mins)") +
ggtitle("Old Faithful") + labs(subtitle=expression(paste("Joint Density Function: ",italic("f(Waiting Time,Duration)"))))
v4