The purpose of this mini project is for you to practice making
animated visuals and map plots. The data comes from the
gapminder package, which must be installed. You will see this
data used later in the textbook and in class. Below, you will find a
preview of the data. It contains the life expectancy, population, and GDP
per capita for many countries, measured every 5 years from 1952 to 2007.
The gapminder dataset is good for practice since it
is used in many textbooks and in many of the online resources you will
find below. The data has quite a reputation behind it (see https://www.gapminder.org/).
If you see “#DO NOT CHANGE”, these lines of code are designed for you
to run and examine, but do not change them in any way. Also, make sure
you install the gganimate library, which is necessary
for making animated plots. I also had to install the
gifski, av, and magick libraries
because my animated plots would not render without them.
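One way to check for and install the packages mentioned above is sketched below. The names `needed_pkgs` and `missing_pkgs` are only illustrative, and the install step is guarded by `interactive()` on the assumption that you run it once in the console rather than every time you knit.

```r
# Packages named in this assignment (gifski, av, and magick help render animations)
needed_pkgs <- c("gapminder", "gganimate", "gifski", "av", "magick")

# Keep only the packages that are not installed yet
missing_pkgs <- setdiff(needed_pkgs, rownames(installed.packages()))

# Install from the console only, so knitting never triggers a download
if (length(missing_pkgs) > 0 && interactive()) {
  install.packages(missing_pkgs)
}
```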
data(gapminder) #DO NOT CHANGE
head(gapminder) #DO NOT CHANGE
## # A tibble: 6 × 6
## country continent year lifeExp pop gdpPercap
## <fct> <fct> <int> <dbl> <int> <dbl>
## 1 Afghanistan Asia 1952 28.8 8425333 779.
## 2 Afghanistan Asia 1957 30.3 9240934 821.
## 3 Afghanistan Asia 1962 32.0 10267083 853.
## 4 Afghanistan Asia 1967 34.0 11537966 836.
## 5 Afghanistan Asia 1972 36.1 13079460 740.
## 6 Afghanistan Asia 1977 38.4 14880372 786.
In each section, I will ask you to make a visual with the data. Below you will find links to reading material that I used in the creation of this assignment. You may still need to search for other resources. You can work with each other on this assignment to help each other out.
Helpful Reading Material:
In this section, I want you to make a histogram that shows the
distribution of one of the three variables
(lifeExp, pop, or gdpPercap) across the
countries. This histogram should be animated based on the year
variable. As the year changes, the visual should change. You will see
similar examples in the reading material with scatter plots. You need to
use the function transition_states() rather than
transition_time() since the latter will show a plot for
every year from 1952 to 2007 and we only have data every 5 years. Also,
play around with the transition_length and
state_length arguments to slow down the speed of
transitions.
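The sketch below shows how `transition_states()` and its timing arguments fit together, using a toy scatter plot rather than the histogram you are asked to build. The data frame `toy` and its values are made up for illustration. `transition_length` and `state_length` are relative amounts of time, so raising `state_length` makes each year pause longer compared with the move between years.

```r
library(ggplot2)
library(gganimate)

# Toy data (hypothetical): 3 measurement years, 20 points per year
set.seed(1)
toy <- data.frame(
  year = rep(c(1952, 1957, 1962), each = 20),
  x    = rnorm(60),
  y    = rnorm(60)
)

anim_plot <- ggplot(toy, aes(x = x, y = y)) +
  geom_point() +
  # One state per distinct year; the two arguments set the relative time spent
  # moving between states versus pausing on each state
  transition_states(year, transition_length = 2, state_length = 3) +
  # {closest_state} is replaced by the state nearest to the current frame
  labs(title = "Toy data in {closest_state}")

# Render when ready (needs a renderer such as gifski); with the same number
# of frames, a lower fps plays the animation more slowly
# animate(anim_plot, nframes = 60, fps = 5)
```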
I want your plot to have a title like “Distribution of VVVV in XXXX”. The XXXX in the title should update based on the year variable (e.g. 1952, 1957,…) and the VVVV should reflect the variable you choose (Life Expectancy, Population, or GDP per Capita). I also want you to center this title over the plot and make the text bold. By default, the title of a plot is on the left side of the graphic and is not bold.
Remove the x-label and rename the y-label to be “Frequency”.
Also, make the outline color and the fill color of the rectangles in the histogram two different colors so your audience can see the histogram broken down into rectangles.
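The styling requests above can be tried on a static plot before any animation is added. This is a minimal sketch that uses the built-in `faithful` data as a stand-in for the gapminder variable you choose. The colors, bin count, and title text are arbitrary choices, not requirements.

```r
library(ggplot2)

# Built-in `faithful` data stands in for the gapminder variable you choose
hist_plot <- ggplot(faithful, aes(x = waiting)) +
  # `color` draws the bar outlines, `fill` colors the inside of each bar
  geom_histogram(bins = 15, color = "black", fill = "steelblue") +
  # x = NULL removes the x-axis label entirely
  labs(title = "Distribution of Waiting Time", x = NULL, y = "Frequency") +
  # hjust = 0.5 centers the title; face = "bold" makes it bold
  theme(plot.title = element_text(hjust = 0.5, face = "bold"))
```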
#
In this section, I want you to pick one of the three variables (lifeExp, pop, or gdpPercap) and create a line plot that shows the change in this variable over time. The variable year should be on the x-axis. Label your x-axis “Year” and remove the label for the y-axis.
This line plot should be animated to loop through 5 different
countries of your choice. Since the variable country is a
factor variable, the transition states based off the country
variable will default to looping through the 142 different levels.
Therefore, you will first need to create a new dataset that filters the
data for the 5 countries of interest and then converts the country
variable from a factor variable to a basic character variable. Use the
as.character() function to do this. Then, when you create
the animated plot, the plot should just animate based on the 5 countries
you selected.
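The reason for the conversion can be seen on a small example. This sketch uses a made-up data frame `toy` with a four-level factor. Filtering rows keeps all of the original factor levels, while converting to character leaves only the values that are actually present.

```r
library(dplyr)

# Toy data (hypothetical): a factor with four levels
toy <- data.frame(
  country = factor(c("A", "B", "C", "D", "A", "B", "C", "D")),
  value   = 1:8
)

kept <- toy %>% filter(country %in% c("A", "C"))
nlevels(kept$country)      # still 4: filtering rows does not drop unused levels

kept_chr <- kept %>% mutate(country = as.character(country))
unique(kept_chr$country)   # only the values present in the rows: "A" "C"
```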
Set transition_length=5 and
state_length=5. Also, I want your title to be “Plot of XXXX
over Time for YYYY” where XXXX is the variable you picked written out
(“Life Expectancy”, “Population”, or “GDP per Capita”) and YYYY is the
name of the country that is being looped over.
#
In this section, I want you to create a world map plot showing the change in gdpPercap from 1952 to 2007 for all countries in the gapminder tibble. You need to start using your data cleaning skills to create a tibble that has the two variables country and gdpPercapChange where country records the name of the country and gdpPercapChange is the GDP Per Capita in 2007 minus the GDP Per Capita in 1952. Rename the variable country to region to help with a future merge.
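One possible shape for this cleaning step is sketched below on made-up numbers. The data frame `toy` borrows the gapminder column names, but its values are hypothetical. Other approaches, such as reshaping the data to wide format first, would work as well.

```r
library(dplyr)

# Toy data (hypothetical values), same column names as gapminder
toy <- data.frame(
  country   = factor(rep(c("A", "B"), each = 3)),
  year      = rep(c(1952, 1977, 2007), times = 2),
  gdpPercap = c(100, 180, 250, 300, 310, 280)
)

change_tbl <- toy %>%
  filter(year %in% c(1952, 2007)) %>%
  group_by(country) %>%
  # 2007 value minus 1952 value within each country
  summarise(gdpPercapChange = gdpPercap[year == 2007] - gdpPercap[year == 1952]) %>%
  rename(region = country)   # matches the column name used by map_data("world")
```

Here country A changes by 150 and country B by -20, so a negative value means GDP per capita fell over the period.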
Then, you will need to consult the appropriate reading material and follow the steps to create a world map plot of gdpPercapChange similar to the HDI plot seen there. Initially, ignore cleaning the data and just try to get a world map visual of gdpPercapChange with the title “Change in GDP Per Capita Between 1952 and 2007” in place of “Global Human Development Index (HDI)”.
After you get the initial visual, you will notice that the value for
the United States is not plotting even though the United States is
clearly in the gapminder dataset. This is because the
output from map_data("world") has the United States listed
as “USA”. If you use the setdiff function to compare the regions in your
cleaned dataset to the regions in the dataset created after the merge,
you will find out that there are 12 regions in total that cannot be
plotted for this same reason. I want you to redo the plot after fixing
the names of all of these regions except for “Hong Kong, China” and
“Trinidad and Tobago” since these two would require a little more work
and thought. I recommend using the fct_recode() function on
the cleaned gapminder data since the original variable country,
which was renamed to region, is a factor variable in R. I
recommend searching Google for alternative names (North Korea vs South
Korea) and using the output from map_data("world") to identify the new
names of the 10 regions you are fixing. This will take patience and
time. Start with figuring out how to fix the plot for the United States
and then proceed with the other 9 countries.
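The comparison and the rename can be sketched on toy vectors. The objects `data_regions` and `map_regions` below are hypothetical stand-ins for the region names in your cleaned data and in the map data. Only the United States to “USA” mismatch comes from the text above.

```r
library(forcats)

# Toy stand-ins (hypothetical): region names in the cleaned data vs. the map data
data_regions <- factor(c("Brazil", "Japan", "United States"))
map_regions  <- c("Brazil", "Japan", "USA")

# Names in the data that have no match in the map
setdiff(levels(data_regions), map_regions)   # "United States"

# fct_recode() takes new_name = "old_name"
fixed_regions <- fct_recode(data_regions, USA = "United States")

setdiff(levels(fixed_regions), map_regions)  # character(0)
```

Rerunning `setdiff()` after each fix is a quick way to confirm how many names are still unmatched.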
Your output after you are done with all of this should be a map plot that looks like the HDI plot in the reading material, apart from the variable shown and the changes requested below. Don’t be surprised that Russia is not in the visual considering it is not listed in the original gapminder dataset.
Finally, there are two changes I want you to make to the code in the reading material. I want you to use the color black to outline the polygons in each of the regions since the colors blend together in many areas, and I want you to use the “Spectral” palette to create a better gradient for the legend.
I only want you to output one graphic in this code. It should have everything I asked for in the prompt.
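The two styling changes can be tried on toy polygons before touching the real map. In this sketch, the data frame `squares` is made up, and `scale_fill_distiller()` is one way to apply a ColorBrewer palette to a continuous fill. The code in the reading material may use a different scale function, so treat this as an assumption.

```r
library(ggplot2)

# Toy polygons (hypothetical): two unit squares standing in for map regions,
# using the long / lat / group columns that map_data("world") also provides
squares <- data.frame(
  long   = c(0, 1, 1, 0,  1, 2, 2, 1),
  lat    = c(0, 0, 1, 1,  0, 0, 1, 1),
  group  = rep(c(1, 2), each = 4),
  change = rep(c(-500, 12000), each = 4)
)

map_plot <- ggplot(squares, aes(x = long, y = lat, group = group)) +
  # color = "black" outlines each polygon; fill is mapped to the variable
  geom_polygon(aes(fill = change), color = "black") +
  # "Spectral" is a ColorBrewer palette; the distiller scale stretches it
  # over a continuous variable
  scale_fill_distiller(palette = "Spectral") +
  labs(title = "Toy map", fill = "Change")
```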
#
| Task | Points |
|---|---|
| Histogram Animated by Time: Histogram | 1 Point |
| Histogram Animated by Time: Updated Every 5 Years | 1 Point |
| Histogram Animated by Time: Changed Color of Lines and Filling | 1 Point |
| Histogram Animated by Time: y-label “Frequency” | 1 Point |
| Histogram Animated by Time: x-label NONE | 1 Point |
| Histogram Animated by Time: Title Updates with Year | 1 Point |
| Histogram Animated by Time: Title is Centered | 1 Point |
| Histogram Animated by Time: Title is Bold | 1 Point |
| Line Plot Animated by Country: Line Plot | 1 Point |
| Line Plot Animated by Country: Axis Labels Correct | 1 Point |
| Line Plot Animated by Country: 5 Countries | 1 Point |
| Line Plot Animated by Country: Transition Timing Correct | 1 Point |
| Line Plot Animated by Country: Title is Correct | 1 Point |
| World Map Plot: You Created a Map Plot with Correct Variable | 3 Points |
| World Map Plot: Fixed Data for All 10 Countries | 5 Points |
| World Map Plot: Map Plot Corrected for 10 Countries | 1 Point |
| World Map Plot: Title is Correct | 1 Point |
| World Map Plot: Black Lines Around Countries | 1 Point |
| World Map Plot: Spectral Palette Used | 1 Point |
| World Map Plot: Only 1 Plot in Output | 1 Point |