Instructions:

The purpose of this mini project is for you to practice making animated visuals and map plots. We will use the gapminder package, which must be installed. You will see this data used later in the textbook and in class. Below, you will find a preview of the data: life expectancy, population, and GDP per capita for many countries, measured every 5 years from 1952 to 2007. The gapminder dataset is good for practice since it is used in many textbooks and in many of the online resources you will find below. The data has quite a reputation behind it. (See https://www.gapminder.org/)

If you see “#DO NOT CHANGE”, those lines of code are there for you to run and examine; do not change them in any way. Also, make sure you install the gganimate library, which is necessary for making animated plots. I also had to install the gifski, av, and magick libraries because my animated plots would not render without them.
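One way to check which of the packages named above are already installed is sketched here. The names pkgs and installed are only for illustration, and the install line is left commented out so that nothing is installed until you choose to run it.

```r
# Packages mentioned in the instructions above.
pkgs <- c("gapminder", "gganimate", "gifski", "av", "magick")

# TRUE for each package that is already installed, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Uncomment to install whatever is missing:
# install.packages(pkgs[!installed])
```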

data(gapminder) #DO NOT CHANGE
head(gapminder) #DO NOT CHANGE
## # A tibble: 6 × 6
##   country     continent  year lifeExp      pop gdpPercap
##   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
## 1 Afghanistan Asia       1952    28.8  8425333      779.
## 2 Afghanistan Asia       1957    30.3  9240934      821.
## 3 Afghanistan Asia       1962    32.0 10267083      853.
## 4 Afghanistan Asia       1967    34.0 11537966      836.
## 5 Afghanistan Asia       1972    36.1 13079460      740.
## 6 Afghanistan Asia       1977    38.4 14880372      786.

In each section, I will ask you to make a visual with the data. Below you will find links to reading material that I used in creating this assignment. You may still need to search for other resources. You can work with each other on this assignment to help each other out.

Helpful Reading Material:

Histogram Animated by Time

In this section, I want you to make a histogram that shows the distribution of one of the three variables (lifeExp, pop, or gdpPercap) across the countries. This histogram should be animated based on the year variable. As the year changes, the visual should change. You will see similar examples in the reading material with scatter plots. You need to use the function transition_states() rather than transition_time(), since the latter will show a plot for every year from 1952 to 2007 and we only have data every 5 years. Also, play around with the transition_length and state_length arguments to slow down the speed of transitions.
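As a rough sketch of the mechanics described above (not a finished answer), a histogram becomes an animation when transition_states() is added to an ordinary ggplot. The choice of lifeExp, the 20 bins, and the timing values are placeholder assumptions to experiment with.

```r
library(ggplot2)
library(gganimate)
library(gapminder)

# Ordinary histogram layer; each distinct year becomes one state of the animation.
anim <- ggplot(gapminder, aes(x = lifeExp)) +
  geom_histogram(bins = 20) +
  # transition_length and state_length are relative durations:
  # time spent moving between years vs. time spent pausing on each year.
  transition_states(year, transition_length = 1, state_length = 3)

# Render the animation (left commented out because rendering takes a while):
# animate(anim)
```

Raising state_length relative to transition_length makes the animation linger longer on each year.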

I want your plot to have a title like “Distribution of VVVV in XXXX”. The XXXX in the title should update based on the year variable (e.g. 1952, 1957, …) and the VVVV should reflect the variable you choose (i.e. Life Expectancy, Population, or GDP per Capita). I also want you to center this title over the picture and make the text bold. By default, the title of a plot is on the left side of the graphic and is not bold.

Remove the x-label and rename the y-label to be “Frequency”.

Also, make the outline color and the fill color of the rectangles in the histogram two different colors so your audience can see the histogram broken down into rectangles.
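The styling requests above map onto a few standard ggplot2 settings, shown here on a static histogram of a single year. The colors and the 1952 subset are arbitrary choices for illustration. In an animation built with transition_states(), gganimate lets the title contain {closest_state}, which is replaced by the current state's value as the frames advance.

```r
library(ggplot2)
library(gapminder)

# One year only, so this plot is static.
gap_1952 <- subset(gapminder, year == 1952)

p <- ggplot(gap_1952, aes(x = lifeExp)) +
  # color = outline of each bar, fill = inside of each bar
  geom_histogram(bins = 20, color = "black", fill = "steelblue") +
  labs(title = "Distribution of Life Expectancy in 1952",
       x = NULL,           # drops the x-axis label
       y = "Frequency") +
  # hjust = 0.5 centers the title; face = "bold" makes it bold
  theme(plot.title = element_text(hjust = 0.5, face = "bold"))
p
```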

#

Line Plot Animated by Country

In this section, I want you to pick one of the three variables (lifeExp, pop, or gdpPercap) and create a line plot that shows the change in this variable over time. The variable year should be on the x-axis. Label your x-axis “Year” and remove the label for the y-axis.

This line plot should be animated to loop through 5 different countries of your choice. Since the variable country is a factor variable, transition states based on the country variable will default to looping through all 142 levels. Therefore, you will first need to create a new dataset that filters the data for the 5 countries of interest and then converts the country variable from a factor variable to a basic character variable. Use the as.character() function to do this. Then, when you create the animated plot, it should animate over just the 5 countries you selected.
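The data preparation described above could look something like the following sketch. The five countries and the names picked and gap_five are arbitrary examples, and dplyr is assumed to be installed.

```r
library(dplyr)
library(gapminder)

# Five example countries; substitute your own choices.
picked <- c("Brazil", "Canada", "France", "Japan", "Kenya")

gap_five <- gapminder %>%
  filter(country %in% picked) %>%
  mutate(country = as.character(country))  # factor -> plain character

nlevels(gapminder$country)   # the original factor carries every country level
unique(gap_five$country)     # only the five selected names remain
```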

Set transition_length=5 and state_length=5. Also, I want your title to be “Plot of XXXX over Time for YYYY” where XXXX is the variable you picked written out (“Life Expectancy”, “Population”, or “GDP per Capita”) and YYYY is the name of the country that is being looped over.
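A minimal sketch of how these pieces can fit together is shown below, assuming lifeExp as the variable and five arbitrarily chosen countries. The {closest_state} placeholder in the title is filled in by gganimate with the current value of the transition variable.

```r
library(dplyr)
library(ggplot2)
library(gganimate)
library(gapminder)

# Example subset: five arbitrary countries, with country stored as character.
gap_five <- gapminder %>%
  filter(country %in% c("Brazil", "Canada", "France", "Japan", "Kenya")) %>%
  mutate(country = as.character(country))

line_anim <- ggplot(gap_five, aes(x = year, y = lifeExp)) +
  geom_line() +
  labs(title = "Plot of Life Expectancy over Time for {closest_state}",
       x = "Year",
       y = NULL) +   # removes the y-axis label
  transition_states(country, transition_length = 5, state_length = 5)

# Render the animation (left commented out because rendering takes a while):
# animate(line_anim)
```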

#

World Map Visual

In this section, I want you to create a world map plot showing the change in gdpPercap from 1952 to 2007 for all countries in the gapminder tibble. You need to start using your data cleaning skills to create a tibble that has the two variables country and gdpPercapChange where country records the name of the country and gdpPercapChange is the GDP Per Capita in 2007 minus the GDP Per Capita in 1952. Rename the variable country to region to help with a future merge.
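One possible way to build such a tibble is sketched below using dplyr; other approaches (for example, reshaping to wide format first) work just as well. The name gdp_change is only illustrative.

```r
library(dplyr)
library(gapminder)

gdp_change <- gapminder %>%
  group_by(country) %>%
  # Within each country, subtract the 1952 value from the 2007 value.
  summarise(gdpPercapChange = gdpPercap[year == 2007] - gdpPercap[year == 1952]) %>%
  rename(region = country)   # matches the column name used by the map data

head(gdp_change)
```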

Then, you will need to consult the appropriate reading material and follow the steps to create a world map plot of gdpPercapChange similar to the HDI plot seen. Initially, ignore cleaning the data and just try to get a world map visual of gdpPercapChange with a title “Change in GDP Per Capita Between 1952 and 2007” in replacement of “Global Human Development Index (HDI)”.

After you get the initial visual, you will notice that the value for the United States is not plotting even though the United States is clearly in the gapminder dataset. This is because the output from map_data("world") has the United States listed as “USA”. If you use the setdiff function to compare the regions in your cleaned dataset to the regions in the dataset created after the merge, you will find that there are 12 regions in total that cannot be plotted for this same reason. I want you to redo the plot after fixing the names of all of these regions except for “Hong Kong, China” and “Trinidad and Tobago”, since these two would require a little more work and thought. I recommend using the fct_recode() function on the cleaned gapminder data, since the original variable country, which was renamed to region, is a factor variable in R. I recommend searching Google for alternative names (for example, North Korea and South Korea) and using the output from map_data("world") to identify the new names of the 10 regions you are fixing. This will take patience and time. Start by figuring out how to fix the plot for the United States and then proceed with the other 9 countries.
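The name-matching step can be explored with a short sketch like the one below. It compares the gapminder country names directly against the map regions and recodes only the United States. The names unmatched and region are illustrative, and map_data() assumes the maps package is installed.

```r
library(ggplot2)   # provides map_data(); requires the maps package
library(forcats)
library(gapminder)

world <- map_data("world")

# Country names in gapminder with no exact match among the map regions.
unmatched <- setdiff(levels(gapminder$country), unique(world$region))
unmatched

# fct_recode() takes new_name = "old_name" pairs.
region <- fct_recode(gapminder$country, "USA" = "United States")
"USA" %in% levels(region)
```

Additional pairs can be passed to the same fct_recode() call, one per region that needs renaming.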

Your output after you are done with all of this should be a map plot that looks identical to the HDI plot in the reading material. Don’t be surprised that Russia is not in the visual considering it is not listed in the original gapminder dataset.

Finally, there are two changes I want you to make to the code in the reading material. I want you to use the color black to outline the polygons in each of the regions, since the colors blend together in many areas, and I want you to use the “Spectral” palette to create a better gradient for the legend.
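The two changes can be sketched as follows. This is not the code from the reading material, only an assumed equivalent built with geom_polygon(); the country names are left unrecoded, so unmatched regions such as the United States will appear without a fill value. map_data() assumes the maps package is installed.

```r
library(dplyr)
library(ggplot2)   # map_data() requires the maps package
library(gapminder)

# Change in GDP per capita, keyed by a character region column for the join.
gdp_change <- gapminder %>%
  group_by(country) %>%
  summarise(gdpPercapChange = gdpPercap[year == 2007] - gdpPercap[year == 1952]) %>%
  mutate(region = as.character(country))

# Keep every map polygon; regions without a match get NA for gdpPercapChange.
map_df <- left_join(map_data("world"), gdp_change, by = "region")

world_plot <- ggplot(map_df, aes(x = long, y = lat, group = group,
                                 fill = gdpPercapChange)) +
  geom_polygon(color = "black") +               # black outlines around regions
  scale_fill_distiller(palette = "Spectral") +  # ColorBrewer "Spectral" gradient
  labs(title = "Change in GDP Per Capita Between 1952 and 2007")
world_plot
```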

I only want you to output one graphic in this code. It should have everything I asked for in the prompt.

#

Rubric

| Task | Points |
|---|---|
| Histogram Animated by Time: Histogram | 1 |
| Histogram Animated by Time: Updated Every 5 Years | 1 |
| Histogram Animated by Time: Changed Color of Lines and Filling | 1 |
| Histogram Animated by Time: y-label “Frequency” | 1 |
| Histogram Animated by Time: x-label NONE | 1 |
| Histogram Animated by Time: Title Updates with Year | 1 |
| Histogram Animated by Time: Title is Centered | 1 |
| Histogram Animated by Time: Title is Bold | 1 |
| Line Plot Animated by Country: Line Plot | 1 |
| Line Plot Animated by Country: Axis Labels Correct | 1 |
| Line Plot Animated by Country: 5 Countries | 1 |
| Line Plot Animated by Country: Transition Timing Correct | 1 |
| Line Plot Animated by Country: Title is Correct | 1 |
| World Map Plot: You Created a Map Plot with Correct Variable | 3 |
| World Map Plot: Fixed Data for All 10 Countries | 5 |
| World Map Plot: Map Plot Corrected for 10 Countries | 1 |
| World Map Plot: Title is Correct | 1 |
| World Map Plot: Black Lines Around Countries | 1 |
| World Map Plot: Spectral Palette Used | 1 |
| World Map Plot: Only 1 Plot in Output | 1 |