The purpose of this project is to determine the "statistical significance" of a difference observed in a sample. Statistical significance helps you draw a conclusion from a data analysis by letting you assess how likely it is that similar data would be generated "by chance".
In this project, you will test the null hypothesis that Black job applicants with a criminal record have the same callback rate as those without records vs. the alternative hypothesis that having a criminal record decreases the likelihood of a callback. Additionally, you will compute a confidence interval for the difference in callback rates.
**EXCEL FILE HAS BEEN ATTACHED**
Use the "methods" table in the Content Library/Class Resources to help decide which statistical methods (descriptive statistics, test and/or confidence interval) to use.
You may share code and discuss the project with other students in the class but your written summary must be unique to you. Please review the sample project V to see example calculations and writeup.
Excel users: please follow the links in the "methods" table to see tutorials on how to perform the calculations. Here is another resource that might be useful: Microsoft Word - two proportion z-test 2016.docx (utexas.edu) and here are sample calculations for white applicants (your calculations should be for Black applicants).
To turn in:
1. (10 points) Print out a copy of your code notebook here or upload an image of your Excel spreadsheet or Jamovi after performing the following steps:
a. Open the dataset in your choice of Excel or R. Note the need for any informative labels for categorical variables.
b. Create a subset of the dataset that contains only Black job applicants.
c. Compute appropriate descriptive statistics of the variables callback and criminal record, noting the sample size and if there is missing data, the number of missing values.
d. Perform the appropriate test to answer the research question "Do Black job applicants with criminal records have lower call back rates for job interviews?".
e. Compute an appropriate 95% confidence interval to answer the research question "What is the difference in call back rates for Black job applicants with and without criminal records?".
2. (15 points) Write a brief summary of your conclusions. This summary should be in the form of a one-page report. Please include:
a. Appropriate descriptive statistics
b. Your p-value and the name of the test (t-test? Proportion test? Binomial test?)
c. A "jargon free" interpretation of the p-value
d. A 95% confidence interval for the difference in callback rates for job applicants with or without criminal records. Also give the name of the formula used (t-interval? z-interval?)
e. An interpretation of the confidence interval
f. Your answers to the research questions based on the p-value, the confidence interval and the internal validity and external validity of the study. For external validity, be sure to comment on the population to which you're willing to generalize your conclusion (which job applicants? Which types of companies? Which types of records?). For internal validity, comment on whether or not the study design allows for a causal effect to be estimated (could there be confounding variables? If so, what are they?).
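As a rough illustration of steps 1d and 1e, the sketch below runs a one-sided two-proportion z-test and builds a z-interval for the difference in two proportions. It is written in Python with only the standard library, purely for illustration; the assignment itself names Excel, R, and Jamovi. It assumes a two-proportion z-test is the method you select from the "methods" table, which may point you elsewhere. The function names and the counts passed to them are hypothetical and are not taken from criminalrecord.csv.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x0, n0):
    """One-sided two-proportion z-test of H0: p1 = p0 vs. Ha: p1 < p0.

    x1, n1: callbacks and applications in the criminal-record group.
    x0, n0: callbacks and applications in the no-record group.
    """
    p1, p0 = x1 / n1, x0 / n0
    pooled = (x1 + x0) / (n1 + n0)  # common callback rate assumed under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (p1 - p0) / se
    p_value = NormalDist().cdf(z)  # lower tail, because Ha is p1 < p0
    return z, p_value


def diff_confidence_interval(x1, n1, x0, n0, level=0.95):
    """z-interval for p1 - p0 using the unpooled standard error."""
    p1, p0 = x1 / n1, x0 / n0
    se = sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    diff = p1 - p0
    return diff - z_star * se, diff + z_star * se


# Made-up counts for illustration only; replace them with the counts
# you compute for Black applicants in step 1c.
z, p_value = two_proportion_z_test(x1=10, n1=100, x0=20, n0=100)
low, high = diff_confidence_interval(x1=10, n1=100, x0=20, n0=100)
print(f"z = {z:.3f}, one-sided p-value = {p_value:.4f}")
print(f"95% CI for the difference: ({low:.3f}, {high:.3f})")
```

The test uses the pooled standard error because the null hypothesis assumes one common callback rate, while the interval uses the unpooled standard error because it estimates the two rates separately.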
As you write the one page summary with your conclusions, you may find the following details useful:
This exercise is based on:
Pager, Devah. (2003). “The Mark of a Criminal Record.” American Journal of Sociology 108(5):937-975.
To isolate the causal effect of a criminal record for black and white applicants, Pager ran an audit experiment. In this type of experiment, researchers present two similar people that differ only according to one trait thought to be the source of discrimination. This approach was used in the resume experiment described in Chapter 2 of QSS, where researchers randomly assigned stereotypically African-American-sounding names and stereotypically white-sounding names to otherwise identical resumes to measure discrimination in the labor market.
To examine the role of a criminal record, Pager hired a pair of white men and a pair of black men and instructed them to apply for existing entry-level jobs in the city of Milwaukee. The men in each pair were matched on a number of dimensions, including physical appearance and self-presentation. As much as possible, the only difference between the two was that Pager randomly varied which individual in the pair would indicate to potential employers that he had a criminal record. Further, each week, the pair alternated which applicant would present himself as an ex-felon. To determine how incarceration and race influence employment chances, she compared callback rates among applicants with and without a criminal background and calculated how those callback rates varied by race.
In the data you will use (criminalrecord.csv) nearly all these cases are present, but 4 cases have been redacted. As a result, your findings may differ slightly from those in the paper. The names and descriptions of variables are shown below.
You may not need to use all of these variables for this activity. We’ve kept these unnecessary variables in the dataset because it is common to receive a dataset with much more information than you need.
1. jobid: Job ID number
2. callback: 1 if tester received a callback, 0 if the tester did not receive a callback.
3. black: 1 if the tester is black, 0 if the tester is white.
4. crimrec: 1 if the tester has a criminal record, 0 if the tester does not.
5. interact: 1 if tester interacted with employer during the job application, 0 if tester did not interact with employer.
6. city: 1 if job is located in the city center, 0 if job is located in the suburbs.
7. distance: Job’s average distance to downtown.
8. custserv: 1 if job is in the customer service sector, 0 if it is not.
9. manualskill: 1 if job requires manual skills, 0 if it does not.
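Steps 1a through 1c (opening the data, keeping only Black applicants, and tabulating callback by crimrec) could be sketched as follows. This is a minimal Python example using only the standard library, shown for illustration; the assignment itself names Excel, R, and Jamovi. The helper names are hypothetical, the tiny in-memory table is made up and is not the real data, and treating blank or "NA" cells as missing is an assumption about how the file encodes missing values.

```python
import csv
import io


def parse_flag(text):
    """Convert a 0/1 cell to int; blank or 'NA' cells count as missing (None)."""
    if text is None or text.strip() in ("", "NA"):
        return None
    return int(float(text))


def load_rows(csv_file):
    """Read the variables needed here from an open CSV file."""
    keep = ("black", "crimrec", "callback")
    return [{name: parse_flag(raw.get(name)) for name in keep}
            for raw in csv.DictReader(csv_file)]


def callback_summary(rows):
    """Tabulate callbacks by crimrec for Black testers and count missing rows."""
    summary = {0: {"n": 0, "callbacks": 0}, 1: {"n": 0, "callbacks": 0}}
    missing = 0
    for row in rows:
        if row["black"] != 1:  # keep only Black applicants
            continue
        if row["crimrec"] is None or row["callback"] is None:
            missing += 1
            continue
        group = summary[row["crimrec"]]
        group["n"] += 1
        group["callbacks"] += row["callback"]
    return summary, missing


# Made-up rows for illustration only; these are NOT from criminalrecord.csv.
sample = io.StringIO(
    "jobid,callback,black,crimrec\n"
    "1,1,1,0\n"
    "2,0,1,1\n"
    "3,,1,1\n"
    "4,1,0,0\n"
    "5,0,1,0\n"
)
rows = load_rows(sample)
summary, missing = callback_summary(rows)
labels = {0: "no criminal record", 1: "criminal record"}
for crimrec, group in summary.items():
    rate = group["callbacks"] / group["n"] if group["n"] else float("nan")
    print(f"{labels[crimrec]}: n = {group['n']}, callback rate = {rate:.2f}")
print(f"rows with missing values: {missing}")
```

To run this on the real data, pass an open file instead of the in-memory sample, for example `with open("criminalrecord.csv", newline="") as f: rows = load_rows(f)`. The `labels` dictionary plays the role of the informative labels for categorical variables mentioned in step 1a.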