Assignment 7
PS 3780 Data Literacy & Visualization, Summer 2022
Due Date: Thursday, July 7, 2022 at 11:59 p.m.
Please save your visualizations and answers to these questions as one .pdf
file (use the "save as" function in most word processors). Be sure to include your name,
your teammate's name if there is anyone, and the assignment number. Submit the file
to Carmen by the due date. Remember we are looking for professional visualizations so
please include a meaningful title as well as axis labels and a legend.
Part 1: API and World Bank
1. Apply the World Bank API to extract female life expectancy data. Compare the
trends of the United States, the European Union, the entire World (this group is
included in the data) as well as a fourth country or collection. Write a paragraph
to describe the plot you created and explain which fourth option you included and
why. Make sure to include axis labels, a title, and a legend for your plot. (2 pts)
Some hints:
Use the WDI() command from the WDI package to query the World Bank API,
and set indicator = "SP.DYN.LE00.FE.IN" in the parentheses. You can also
truncate the data by setting "start = " and "end = ".
If using plot( ) , create appropriate subsets and add lines to an initial plot.
Set the ylim to ensure that all cases are visible.
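The plot( ) hint above can be sketched as follows. This is only a minimal illustration on invented numbers, not World Bank data: the data frame toy, its column life_exp, and the two group names are hypothetical stand-ins for whatever WDI() returns for you.

```r
# Hypothetical toy data standing in for a WDI() result (values are invented).
toy <- data.frame(
  country  = rep(c("Group A", "Group B"), each = 4),
  year     = rep(2000:2003, times = 2),
  life_exp = c(70, 71, 72, 73, 78, 78.5, 79, 80),
  stringsAsFactors = FALSE
)

# One subset per line you want to draw.
a <- toy[toy$country == "Group A", ]
b <- toy[toy$country == "Group B", ]

# Compute the y-limits from ALL the data so that no line is cut off.
y_limits <- range(toy$life_exp, na.rm = TRUE)

# Draw the first group, then add the second one with lines().
plot(a$year, a$life_exp, type = "l", col = "blue", lwd = 2,
     ylim = y_limits,
     main = "Toy example: two groups over time",
     xlab = "Year", ylab = "Value (invented units)")
lines(b$year, b$life_exp, col = "red", lwd = 2, lty = 2)
legend("topleft", legend = c("Group A", "Group B"),
       col = c("blue", "red"), lty = c(1, 2), lwd = 2)
```

Taking the limits from the full dataset rather than from the first subset is what keeps later lines from falling outside the plotting region.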
2. Repeat the prior question for a World Bank indicator of your choice - please choose
one for which there is enough data! Include the same countries / groups as in the
previous plot. In addition to describing the plot, make sure you define the indicator
you used, state the unit in which it is measured, and explain why you chose it. Make sure to include
axis labels, a title, and a legend for your plot. (3 pts)
Some hints:
Use WDIsearch() to look for a particular indicator available from the World
Bank API.
Most of the plotting code from the first part should work as long as you change
the appropriate dataset name, variable name, and ylim values.
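One possible way to judge whether an indicator has "enough data" is to count the non-missing values for each country or group before plotting. The sketch below uses an invented data frame; the names toy and value are hypothetical, and in your own work the value column would carry the indicator's code as its name.

```r
# Hypothetical toy data: NA marks a year with no reported value.
toy <- data.frame(
  country = rep(c("Group A", "Group B"), each = 5),
  year    = rep(2015:2019, times = 2),
  value   = c(1.2, NA, 1.5, 1.7, NA, NA, NA, NA, 2.1, NA),
  stringsAsFactors = FALSE
)

# Count the non-missing observations for each country.
n_observed <- tapply(!is.na(toy$value), toy$country, sum)
print(n_observed)

# Share of years with data, per country.
n_years <- tapply(toy$value, toy$country, length)
share_observed <- n_observed / n_years
print(round(share_observed, 2))
```

A group with only one or two observed years will not produce a meaningful trend line, so a check like this may save you from choosing an indicator that cannot be plotted.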
Part 2: Endangered Species
Download the endangered_by_state.csv file from Carmen. Create and discuss the
two visualizations described below. Include copies of the graphs and the code used to
create them.
Map
Create a choropleth map indicating the number of endangered species of *All* types
within each state while changing the default color scheme. Which states have the most
endangered species? Is there any geographic clustering? (3 pts)
Some hints:
What does "All" mean? Check the Organism.Type variable.
How do you want to define colors? Use RColorBrewer.
Load the map data using map_data("state").
Use the scale_fill_distiller( ) option and specify palette = to choose your
color scheme.
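Before mapping, it may help to prepare the data along the lines of the sketch below. It uses an invented data frame whose column names (Name, Organism.Type, Count) mirror the ones used in the class code; the rows and counts are hypothetical, and it assumes that the "All" rows hold the totals across organism types. The map_data("state") table stores state names in lowercase in its region column, so the names used as map_id have to be lowercase too.

```r
# Hypothetical toy data mimicking the columns in the class code
# (Name, Organism.Type, Count); the counts are invented.
toy <- data.frame(
  Name          = c("Ohio", "Ohio", "Texas", "Texas"),
  Organism.Type = c("All", "Bird", "All", "Bird"),
  Count         = c(30, 5, 100, 20),
  stringsAsFactors = FALSE
)

# Keep only the rows assumed to hold the total across organism types.
all_types <- toy[toy$Organism.Type == "All", ]

# Lowercase the state names so they can match map_data("state")$region.
# If the names are already lowercase, this step changes nothing.
all_types$map_name <- tolower(all_types$Name)
print(all_types)
```

If states show up blank on your map, a mismatch between the names in your data and the names in the map table is a likely cause.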
Boxplot
Create a boxplot indicating the count of endangered species of specific types across the
states. In one graph, show boxplots for the organism types of Bird, Mammal, Plant,
and Reptile. Fill the boxes with different colors. What differences and similarities stand
out between the organism types? (2 pts)
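The subsetting step for several organism types can be sketched with %in%, as shown below on invented data. The data frame toy and its counts are hypothetical, and the sketch draws the boxes with base R's boxplot( ) rather than ggplot( ); the same subset could be passed to ggplot( ) instead.

```r
# Hypothetical toy data; "Fish" is included only to show that it gets dropped.
toy <- data.frame(
  Organism.Type = rep(c("Bird", "Mammal", "Plant", "Reptile", "Fish"), each = 3),
  Count         = c(2, 4, 6, 1, 3, 5, 10, 20, 30, 0, 1, 2, 7, 8, 9),
  stringsAsFactors = FALSE
)

# Keep only the organism types of interest.
wanted <- c("Bird", "Mammal", "Plant", "Reptile")
four_types <- toy[toy$Organism.Type %in% wanted, ]

# One box per organism type, each filled with its own color.
boxplot(Count ~ Organism.Type, data = four_types,
        col = c("skyblue", "tan", "seagreen", "orange"),
        main = "Toy example: counts by organism type",
        xlab = "Organism type", ylab = "Count (invented)")
```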
ec 7-1 - WDI API, Linegraph, Boxplot, Map
############ Better colors
#install.packages("RColorBrewer")
library(RColorBrewer)
display.brewer.all()
display.brewer.pal(5, "Blues")
brewer.pal.info
myclrs <- colorRampPalette(c("red", "yellow", "green"))(5)
scales::show_col(myclrs)
############# Making sure all packages are loaded
#install.packages("ggplot2")
library(ggplot2)
#install.packages("WDI")
library(WDI)
############# The WDI API
#This command prints the data, how do we save it to use later?
#WDI(country = "all", indicator = "NY.GNS.ICTR.GN.ZS", start= 1980,
#    end= 2020)
sample <- WDI(country = "all", indicator = "ST.INT.RCPT.XP.ZS" ,
start= 1995, end= 2019)
#Notes about the function:
## To indicate a particular country, you need to use its ISO-2 country
## code, or download all the data and subset after
## There are tons of different indicators available - we can use the
## search function
## Be aware that for many indicators there will be a bunch of missing
## data, so choose your start and end years with consideration
#What does that indicator mean and how can we find what we want?
WDIsearch("tourism") #Wouldn't we all like to be tourists right now
WDIsearch("gross savings") #There we see the example used. What if we
                           #want something else?
############# Line Plots
# For ggplot() function:
##### Make one dataset with all countries you want to plot
##### ggplot() will separate the countries when you tell it to
##### ggplot() will also make sure the limits are appropriate
subset <-sample[sample$country %in% c("Australia", "Portugal",
"Thailand", "Jamaica", "Egypt, Arab Rep."),]
p <- ggplot(data = subset)
p + geom_line(aes(x=year, y=ST.INT.RCPT.XP.ZS, color = country, lty =
country), lwd=3) +
labs(title = "Tourism Reliance", x="Year", y= "Tourism (% GDP)",
color = "Country", lty="Country")
p + geom_line(aes(x=year, y=ST.INT.RCPT.XP.ZS, color = country, lty =
country), lwd=3) +
scale_color_brewer(palette = "Dark2") +
labs(title = "Tourism Reliance", x="Year", y= "Tourism (% GDP)",
color = "Country", lty="Country")
p + geom_line(aes(x=year, y=ST.INT.RCPT.XP.ZS, color = country, lty =
country), lwd=3) +
scale_color_manual(values = myclrs) +
labs(title = "Tourism Reliance", x="Year", y= "Tourism (% GDP)",
color = "Country", lty="Country")
############ Endangered Species
setwd("C:/Users/dadada135/Desktop/PS XXXXXXXXXXData Literacy/datasets")
# set working directory so R knows where to look for files
# CUSTOMIZE this line for your own computer
end <- read.csv("endangered_by_state.csv")
############# Exploring the dataset
summary(end)
table(end$Organism.Type)
birds <- end[end$Organism.Type %in% c("Bird"),]
############# Boxplot
b <- ggplot(data=birds)
b + geom_boxplot(aes(x = Organism.Type, y = Count,
fill=Organism.Type))
############# Choropleth Map
#install.packages("maps")
library(maps)
map_data("state")
m <- ggplot(data = birds)
#### Initialize the data
#### Call geom_map and tell it which map and which variable to plot
#### Make sure the map fits
#### Set the color palette
#### Turn off axes
m +
geom_map(color="black", aes(map_id = Name, fill = Count), map =
map_data("state")) +
expand_limits(x = map_data("state")$long, y = map_data("state")$lat) +
scale_fill_distiller(palette = "PuRd", direction =1) +
labs(x="", y= "") +
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank(),
      axis.text.y = element_blank(), axis.ticks.y = element_blank())
m +
geom_map(color="black", aes(map_id = Name, fill = Count), map =
map_data("state")) +
expand_limits(x = map_data("state")$long, y = map_data("state")$lat) +
scale_fill_gradient(low = "green", high = "blue") +
labs(x="", y= "") +
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank(),
      axis.text.y = element_blank(), axis.ticks.y = element_blank())