Assignment 1
PS 3780 Data Literacy & Visualization, Summer 2022
Due Date: Thursday, May 19, 2022 at 11:59 p.m.
This assignment is designed to test your ability to gather, aggregate, evaluate and describe
data. Additionally, this assignment tests your ability to differentiate between different
forms of data visualization. Please save your answer to these questions as one .pdf
file (use the "save as" function in most word processors). Be sure to include your name,
your teammate's name if there is anyone, the assignment number, and three separate .csv
files of your data. Submit all files to Carmen by the due date.
Part I: Web Scraping and Data Cleaning
Learn About Our OSU President. Go to https://www.opensecrets.org/donor-lookup,
where you can look up federally disclosed political contributions. (4 pt).
1. Search for our new-ish school president, Kristina Johnson. You should find
approximately 240 results.
2. Use ParseHub (see video lecture 2f) to scrape data from the second and third columns
- Contributor and Occupation. Download and save this dataset once ParseHub has
completed its run. {Hint: the key is setting up and choosing the correct type of
Next button.}
3. Next use OpenRefine (see video lecture 2g) to clean the data so that it makes
more sense: split the Contributor column in two - Name and Address. {Hint: The
Contributor column has Name and Address separated by a line break. Use the
regular expression (i.e. check the box), `\n' (without the quotes) as the separator
here.}
4. Use Text Facet, as done on the video lecture and in recitation, on the Name column
to combine names that should be the same. Justify your decision to combine or not
combine names in two sentences. What are the top 5 most common names?
5. Use Text Facet on the Occupation column to combine jobs that should be the
same. Justify your decision to combine or not combine occupations in two sentences.
What are the top 5 most common occupations?
6. Save (export) this cleaned dataset as a csv file named opensecrets.csv. This dataset
should have 3 columns - Name, Address, and Contribution. Submit this file with
your assignment.
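The split in step 3 and the export in step 6 can also be checked outside OpenRefine. The following is a minimal sketch in R, not the required workflow. It assumes the ParseHub export has columns named Contributor and Occupation, uses made-up sample rows, and keeps Occupation as the third column because that is what step 2 scrapes; the object names (scraped, cleaned, out_path) are hypothetical.

```r
# Hypothetical sample standing in for the ParseHub export; real rows will differ.
scraped <- data.frame(
  Contributor = c("DOE, JANE\nColumbus, OH 43210", "ROE, RICHARD\nDublin, OH 43017"),
  Occupation  = c("ENGINEER", "TEACHER"),
  stringsAsFactors = FALSE
)

# Split on line breaks: the first piece is the name, the remainder is the address.
parts <- strsplit(scraped$Contributor, "\n", fixed = TRUE)
cleaned <- data.frame(
  Name       = trimws(vapply(parts, function(p) p[1], character(1))),
  Address    = trimws(vapply(parts, function(p) paste(p[-1], collapse = " "), character(1))),
  Occupation = scraped$Occupation,
  stringsAsFactors = FALSE
)

# Write the three-column result under the required file name (temporary folder here).
out_path <- file.path(tempdir(), "opensecrets.csv")
write.csv(cleaned, out_path, row.names = FALSE)
```

Reading the file back with read.csv(out_path) is a quick way to confirm that exactly three columns were written.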
Part II: World Bank
You have been contacted by an organization that wants to understand the impact of
subnational and international conflict on the career prospects of women. They want you
to compare educational outcomes for women in high-, middle-, and low-income countries
to those in countries that are experiencing conflict. Go to the World Bank website
(https://data.worldbank.org/) and look for the data about the percentage of women
who complete primary education in the four types of situations. Create an appropriate
visualization that helps your comparison, save the screenshot of the graph in the PDF,
and write 1-2 paragraphs to discuss your findings and conclusion.
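One way to draw such a comparison is a simple bar chart with one bar per country group. The sketch below uses ggplot2 and assumes you have already looked up one completion percentage per group. The numbers in it are placeholders, not World Bank figures, and the data frame and column names (completion, group, pct) are hypothetical.

```r
library(ggplot2)

# Placeholder values only: replace them with the figures taken from the World Bank.
completion <- data.frame(
  group = c("High income", "Middle income", "Low income", "Conflict-affected"),
  pct   = c(40, 30, 20, 10)
)

# Keep the bars in the order listed above rather than alphabetical order.
completion$group <- factor(completion$group, levels = completion$group)

# One bar per country group, with the completion percentage on the y-axis.
p <- ggplot(completion, aes(x = group, y = pct)) +
  geom_col() +
  labs(x = "Country group", y = "Women completing primary education (%)")
```

Printing p in an interactive session displays the chart. A bar chart suits this task because it compares a single value across four categories.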
Answered 1 day after May 16, 2022

Solution

Mohd answered on May 18 2022
89 Votes
5/18/2022
library(readr)
library(magrittr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
library(rmarkdown)
scraped_Data <- read_csv("New folder (3)/scraped Data.csv")
## Rows: 240 Columns: 3
## -- Column specification --------------------------------------------------------
## Delimiter: ","
## chr (3): Name, Address, Occupation
##
## i Use `spec()` to retrieve the full column specification for this data.
## i Specify the column types or set `show_col_types = FALSE` to quiet this message.
head(scraped_Data,10)
## # A tibble: 10 x 3
##    Name              Address            Occupation
##    <chr>             <chr>              <chr>
##  1 JOHNSON, KRISTINA Princeton, NJ 8540 LAWYER/LEGAL
##  2 JOHNSON, KRISTINA Princeton, NJ 8540 LAWYER/LEGAL
##  3 JOHNSON, KRISTINA HELENA, MT 59602   EMPLOYEE
##  4 JOHNSON, KRISTINA Princeton, NJ 8540 LAWYER/LEGAL
##  5 JOHNSON, KRISTINA...
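The visible part of the solution stops at the preview of the data. The "top 5" questions from Part I could be answered along the following lines. This is only a sketch: it assumes the Name and Occupation columns shown above, and the rows in sample_data are hypothetical stand-ins, not the real results.

```r
library(dplyr)

# Hypothetical rows with the same column names as scraped_Data; not the real results.
sample_data <- tibble(
  Name       = c("DOE, JANE", "DOE, JANE", "ROE, RICHARD"),
  Occupation = c("ENGINEER", "ENGINEER", "TEACHER")
)

# Count each occupation, most frequent first, and keep at most the first five rows.
top_occupations <- sample_data %>%
  count(Occupation, sort = TRUE) %>%
  head(5)
```

Replacing Occupation with Name in the count() call gives the most common names in the same way.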