Final Project
PS 3780 Data Literacy & Visualization, Autumn 2021
Due Date: Monday, December 13, 2021 at 11:59 p.m.
Final Project Description
This is an individual assignment. Your final project for this class involves answering an
interesting question or testing an interesting theory using data visualization. In the final
paper, you will do the following:
1. State the question or theory explicitly and explain why you find it interesting in
the introduction.
2. State one to three hypotheses that you derive from your theory and will test
empirically.
3. Explain why the data you examine will help you to test your hypothesis/-es. De-
scribe the data and where you obtained them, what (if anything) you did to reformat
or transform them, how you analyzed them, and what they told you.
4. Create and include at least two (2) unique visualizations (maximum of 4). For
maximum credit, at least two (2) visualizations should be made using R. All
visualizations should be made using programs or websites that we learned in this class.
(A list is included below; this does not include Excel).
5. What do you know that you did not know before? Does the answer raise further
questions that might be worth investigating? If so, describe them briefly.
I anticipate the text portion of papers to be 3 - 4 pages long (before adding visualizations),
double-spaced, in type no smaller than 11-point with 1-inch margins, and with in-text
visualizations (no larger than 1/4-page each), though succinct writers may take less space
and those with more complex problems or answers may take more. Papers should be
professional in quality: page numbers, formatting, and paper organization all count, with
citations either in text or in footnotes with a works cited page at the end. (Works cited
pages do not count toward the total page count.) Unless you collected the data yourself,
be sure to cite your data sources! Due to the University's strict timeline for final grades,
no extensions can be offered except in case of genuine emergency. We look forward to
receiving your best effort by 11:59 p.m. on December 13. You will submit the final
paper to Carmen as well as the .csv file(s) of your data and any R code that
you used to generate the visualizations within your paper.
Tips for Moving Forward
Imagine the Would-be World
Given the hypotheses that you have proposed, imagine the state of the world that would
exist if the hypotheses were true as well as the state of the world where the opposite
(your alternative hypotheses, so to speak) were true. What evidence would be seen in each
case? Having such expectations will not only help you find appropriate data to test
your hypotheses but also give you hints about whether your original hypotheses are
favored by the empirical evidence from visualization.
Collect Data
Use your hypotheses from above to begin searching for data. For this part of the project,
focus in particular on specifying how you will measure the different variables specified
by your hypothesis/-es. For example, if your argument is that democracies do not fight
one another, you will need to figure out how you will measure both democracy and
international conflict. Once you have accomplished this, you can begin searching for and
collecting data on these variables. Toward this end, it may help you to do the following:
1. Write a paragraph explaining what the relevant variables for your question are,
being as specific as possible (including the relevant time frame for your question,
the relevant states/countries, etc.).
2. Find and download data measuring all of the variables needed to answer your
research question. Save this data as a .csv file or files.
3. Explain why the data you found will help you answer the question. Here you should
describe the data in detail and defend your decision to use it by explaining why it
is relevant to the question and why you trust it to be credible information. Make
sure you answer these questions: Where do the data come from? What do they tell
us generally? What is and is not measured? How is it measured?
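The save-as-.csv step above can be sketched in base R. This is only an illustration under stated assumptions: the data frame, its column names, and the temporary file path are hypothetical stand-ins for whatever dataset you actually download.

```r
# Hypothetical toy data standing in for a downloaded dataset.
countries <- data.frame(
  country = c("A", "B", "C"),
  year = c(2000L, 2000L, 2000L),
  democracy_score = c(8.5, 3.0, 6.2),
  stringsAsFactors = FALSE
)

# Write to a temporary .csv file; row.names = FALSE avoids an extra index column.
csv_path <- tempfile(fileext = ".csv")
write.csv(countries, csv_path, row.names = FALSE)

# Read the file back and inspect its structure to confirm the round trip.
reloaded <- read.csv(csv_path, stringsAsFactors = FALSE)
str(reloaded)
```

Checking the structure right after loading is a cheap way to catch columns that were read with an unexpected type.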
Analyze the Data & Create Visualizations
Now that you have your data, you can begin cleaning and analyzing it. To learn more
about your data, I suggest using R to do any or all of the following:
1. Reformat or transform the data if necessary.
2. Do basic descriptive statistics in R, including: mean(), median(), summary(),
length(), and table() as appropriate for your specific dataset and variables
of interest.
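As a rough illustration of the descriptive functions named above, the sketch below applies each of them to a small made-up data frame. The variable names and values are hypothetical; substitute your own columns.

```r
# Hypothetical toy data; replace with your own data frame.
survey <- data.frame(
  age = c(34, 51, 47, 62, 29, 47),
  smoker = c(0, 1, 0, 1, 1, 0)
)

mean(survey$age)      # arithmetic mean of a numeric variable
median(survey$age)    # middle value, less sensitive to outliers
summary(survey$age)   # minimum, quartiles, mean, and maximum
length(survey$age)    # number of observations
table(survey$smoker)  # counts for each category of a discrete variable
```

mean() and median() suit numeric variables, while table() is the more informative choice for categorical or 0/1 variables.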
Approved tools for creating visualizations:
• R
• World Bank Databank
• Google Ngram
• DataWrapper
• Gapminder
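Since R is the first tool on the list, here is a minimal sketch of a bar chart drawn with base R graphics and saved to a file for inclusion in a paper. The counts, labels, and output path are made up for illustration, and ggplot2 would be an equally valid choice.

```r
# Hypothetical counts of countries by regime type.
regime_counts <- c(Democracy = 12, Hybrid = 7, Autocracy = 5)

# Open a PDF graphics device so the figure is saved to a file.
out_path <- tempfile(fileext = ".pdf")
pdf(out_path, width = 6, height = 4)

# barplot() draws the bars and returns the x positions of the bar midpoints.
bar_mids <- barplot(
  regime_counts,
  main = "Countries by regime type (hypothetical data)",
  xlab = "Regime type",
  ylab = "Number of countries",
  col = "steelblue"
)

# Close the device so the file is written to disk.
dev.off()
```

Titles and axis labels are set explicitly because a figure in a paper should be readable without the surrounding text.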
Write Up Your Results
Work all of the above into a final narrative that includes your question/theory, the reasons
for which you find it interesting, your hypothesis/-es, your data, your analysis, your
visualization(s), and the results. Be succinct. Too often, college students learn to pad
papers in order to reach high page limits. The suggested page length is meant to help
you un-learn that habit and get right to the point.
Answered 1 day after Dec 13, 2021

Solution

Mohd answered on Dec 14 2021
Untitled
12/14/2021
Loading Packages
library(readr)
library(magrittr)
library(dplyr)
library(ggplot2)
library(rmarkdown)
library(skimr)
Framingham Heart Study dataset: The dataset is publicly available on the Kaggle website, and it comes from an ongoing cardiovascular study of residents of the town of Framingham, Massachusetts. The dataset provides the patients’ information. It includes over 4,000 records and 15 attributes. I slightly modified the data. Data set download location:
Research questions: 1. Is there an association between sex and current smoking? 2.
framinghamheart <- read_csv("data/framinghamheart.csv")
sum(is.na(framinghamheart))
## [1] 626
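The total above does not show which columns carry the missing values. A minimal sketch of breaking the count down, using a hypothetical toy data frame rather than the Framingham file:

```r
# Hypothetical toy data with a few missing values.
toy <- data.frame(
  age = c(39, NA, 48, 61),
  glucose = c(77, 76, NA, NA),
  male = c(1, 0, 1, 0)
)

total_missing <- sum(is.na(toy))              # missing cells in the whole table
missing_by_col <- colSums(is.na(toy))         # missing cells per column
incomplete_rows <- sum(!complete.cases(toy))  # rows with at least one NA

total_missing
missing_by_col
incomplete_rows
```

The per-row count matters because na.omit() drops every row that has at least one NA, which can remove more data than the per-column counts suggest.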
Cleaning the data
framinghamheart$male<-replace(framinghamheart$male, framinghamheart$male>1,NA)
framinghamheart$age<-replace(framinghamheart$age,framinghamheart$age>200,NA)
framinghamheart$currentSmoker<-replace(framinghamheart$currentSmoker,framinghamheart$currentSmoker>1,NA)
framinghamheart$BPMeds<-replace(framinghamheart$BPMeds,framinghamheart$BPMeds>1,NA)
framinghamheart$heartRate<-replace(framinghamheart$heartRate,framinghamheart$heartRate==999,NA)
framinghamheart$sysBP<-replace(framinghamheart$sysBP,framinghamheart$sysBP==999,NA)
framinghamheart$diaBP<-replace(framinghamheart$diaBP,framinghamheart$diaBP==999,NA)
framinghamheart$prevalentStroke<-replace(framinghamheart$prevalentStroke,framinghamheart$prevalentStroke==999,NA)
framinghamheart$totChol<-replace(framinghamheart$totChol,framinghamheart$totChol==999,NA)
framinghamheart$glucose<-replace(framinghamheart$glucose,framinghamheart$glucose==999,NA)
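The repeated replace() calls above could also be written once and applied to several columns. The sketch below assumes, as the code above appears to, that 999 is a code for a missing value; the helper name `sentinel_to_na` and the toy data are hypothetical.

```r
# Hypothetical helper: set values equal to a sentinel code to NA.
# which() skips existing NAs, so the index never contains NA.
sentinel_to_na <- function(x, sentinel = 999) {
  replace(x, which(x == sentinel), NA)
}

# Toy data standing in for the real file.
toy <- data.frame(
  heartRate = c(80, 999, 72),
  sysBP = c(120, 131, 999),
  age = c(39, 46, 48)
)

# Apply the helper only to the columns that use the sentinel code.
sentinel_cols <- c("heartRate", "sysBP")
toy[sentinel_cols] <- lapply(toy[sentinel_cols], sentinel_to_na)
toy
```

Listing the affected columns explicitly keeps the recoding visible and avoids touching a variable in which 999 could be a legitimate value.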
Removing rows with missing values
framinghamheart_df<-na.omit(framinghamheart)
summary(framinghamheart_df)
## male age education currentSmoker
## Min. :0.0000 Min. :32.00 Min. :1.000 Min. :0.0000
## 1st Qu.:0.0000 1st Qu.:42.00 1st Qu.:1.000 1st Qu.:0.0000
## Median :0.0000 Median :49.00 Median :2.000 Median :0.0000
## Mean :0.4572 Mean :49.71 Mean :1.996 Mean :0.4909 ...
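One common way to approach the first research question (an association between sex and current smoking) is a cross-tabulation followed by a chi-squared test of independence. The sketch below is not the author's analysis: it uses made-up data and borrows only the column names `male` and `currentSmoker` from the summary above.

```r
# Hypothetical data: 50 men (30 smokers) and 50 women (20 smokers).
toy <- data.frame(
  male = rep(c(1, 0), each = 50),
  currentSmoker = c(rep(c(1, 0), times = c(30, 20)),
                    rep(c(1, 0), times = c(20, 30)))
)

# Cross-tabulate sex against smoking status.
tab <- table(sex = toy$male, smoker = toy$currentSmoker)
tab

# Row proportions: the share of smokers within each sex.
props <- prop.table(tab, margin = 1)
props

# Chi-squared test of independence (Yates correction applies to 2x2 tables by default).
test <- chisq.test(tab)
test$p.value
```

Row proportions are usually easier to interpret than raw counts when the two groups differ in size.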