Exercises: 1-5 (Pgs. 6-7); 1-2, 5 (Pg. 12); 1-5 (Pgs. 20-21); Open Response
Submission: Submit an electronic document via Sakai. It must be an HTML file generated in RStudio. All assigned problems are taken from the textbook R for Data Science. You do not need R code to answer every question. If you answer without using R code, delete the code chunk. If the question requires R code, make sure you display the R code. If the question requires a figure, make sure you display the figure. Many of the questions can be answered in a written response but still require R code and/or figures to support the explanation.
ggplot(data=mpg)
I see absolutely nothing. There is just a blank space for a graph, because the call supplies the data but no aesthetic mappings and no geom layer to draw. Why am I even doing this nonsense?
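To see what a layer changes, here is a minimal sketch that adds a point layer to the same empty plot. The choice of displ and hwy for the axes is my own assumption for illustration; the exercise only asks about the bare ggplot(data = mpg) call.

```r
library(ggplot2)

# The bare call from the exercise: data only, so no layers are drawn.
p_empty <- ggplot(data = mpg)

# Adding a geom with aesthetic mappings gives the plot something to draw.
# displ and hwy are assumed here purely for illustration.
p_points <- p_empty +
  geom_point(mapping = aes(x = displ, y = hwy))

length(p_empty$layers)   # number of layers in the bare plot
length(p_points$layers)  # number of layers after adding geom_point()
p_points
```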
dim(mpg)
## [1] 234 11
nrow(mpg)
## [1] 234
ncol(mpg)
## [1] 11
There are 234 rows and 11 columns in the dataset mpg.
?mpg
unique(mpg$drv)
## [1] "f" "4" "r"
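The variable drv is a categorical variable (stored as a character vector) that describes the type of drive train. It takes the following values: f (front-wheel drive), r (rear-wheel drive), and 4 (four-wheel drive).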
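As a quick check of how drv is stored and how often each value occurs, something like the following should work. This is my own sketch and not part of the textbook exercise.

```r
library(ggplot2)

class(mpg$drv)   # storage type of the column
table(mpg$drv)   # number of cars for each drive train value
```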
ggplot(data=mpg,aes(x=hwy,y=cyl)) +
geom_point() +
xlab("Highway Miles Per Gallon") +
ylab("Number of Cylinders")
ggplot(data=mpg,aes(x=class,y=drv)) +
geom_point() +
xlab("Type of Car") +
ylab("Type of Drive")
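Scatter plots are not meant to visualize the relationship between two categorical/qualitative variables. Many cars share the same combination of class and drv, so their points land on top of each other and the plot hides how many observations each combination contains.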
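One possible alternative, which I am assuming here and which the exercise does not ask for, is geom_count(). It sizes each point by the number of observations at that combination. A table of counts shows the same information.

```r
library(ggplot2)

# geom_count() sizes each point by the number of observations at that
# (class, drv) combination, so overlapping points are no longer hidden.
p_count <- ggplot(data = mpg, aes(x = class, y = drv)) +
  geom_count() +
  xlab("Type of Car") +
  ylab("Type of Drive")
p_count

# The same information as a table of counts.
table(mpg$drv, mpg$class)
```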
I don’t know if they will look different. Let me check.
ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_point() +
geom_smooth()
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
ggplot() +
geom_point(data = mpg, mapping = aes(x = displ, y = hwy)) +
geom_smooth(data = mpg, mapping = aes(x = displ, y = hwy))
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
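They do not look different. Both versions give every layer the same data and the same aesthetic mappings; the first just sets them once in ggplot() instead of repeating them in each geom. I am incredibly surprised.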
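Beyond comparing the two graphs by eye, a rough way to confirm this is to compare the data ggplot2 computes for each layer. The sketch below uses layer_data() from ggplot2; the object names p_global and p_local are my own.

```r
library(ggplot2)

# Data and mappings set once in ggplot() and inherited by both layers.
p_global <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth()

# Data and mappings repeated inside each geom.
p_local <- ggplot() +
  geom_point(data = mpg, mapping = aes(x = displ, y = hwy)) +
  geom_smooth(data = mpg, mapping = aes(x = displ, y = hwy))

# layer_data() returns the data frame that ggplot2 computes for a layer.
# Layer 1 is geom_point(), layer 2 is geom_smooth().
same_points <- all.equal(layer_data(p_global, 1), layer_data(p_local, 1))
same_smooth <- all.equal(layer_data(p_global, 2), layer_data(p_local, 2))
same_points
same_smooth
```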
For this exercise, use the diamonds dataset in the tidyverse. Use ?diamonds to get more information about the dataset.
Use geom_boxplot() and facet_wrap() to illustrate the empirical distributions of the sample.
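A minimal sketch of such a plot is below. The prompt does not say which variables to use, so price, cut, and color are my own assumptions: boxplots of price for each cut, with one panel per color.

```r
library(ggplot2)

# Assumed variables: price on the y-axis, cut on the x-axis,
# and one panel per diamond color.
p_box <- ggplot(data = diamonds, aes(x = cut, y = price)) +
  geom_boxplot() +
  facet_wrap(~ color) +
  xlab("Cut") +
  ylab("Price (US dollars)")
p_box
```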