Assignment Details:
· This is an individual Assignment where you need to analyze your own data and write a formal report of the analysis results.
· Start by selecting a relatively large data set to analyze. There should be at least 10 variables in the data set and more than 50 observations. Have at least two “objectives” that you’d like to answer via data analysis. State these objectives clearly in the beginning of your report.
· Use the analysis techniques we are learning this semester to analyze the data and answer your objectives. This data may come from your work/internship or from an internet source that is of interest to you. If you’re struggling to find data, check out https://www.kaggle.com/competitions for freely available data sets.
· Use the dataset (JMP) and your final project report.
· The report should be 5-10 pages in length – not too picky about length as long as you’ve done a thorough analysis and described your project. There should be nothing missing!
· Include relevant output/graphs in body of paper, but extra output in appendix. (The report should not be ALL output and figures. You need to write formal paragraphs explaining your thought process and analysis results. The reader should not need to know anything about JMP output to understand the conclusions.)
· Add appendix and source of data.
“Analysis of District Level Standardized Test Performance in Pennsylvania Public Schools”
Introduction and Objectives
A large part of the discourse surrounding the 2014 Pennsylvania gubernatorial election campaign focused on the educational spending policy of past and future administrations. Inherent in that line of discussion is the idea that the volume of money spent on education is an influence on the performance of the state’s educational institutions. Further, it implies that spending is the most important influence on educational performance. However, a measured analysis of which factors drive good school performance was absent from the forefront of this discussion.
A deep field of research exists on what decisions and school characteristics do or do not make for better academic performance, including some analysis of Pennsylvania-specific data. Past studies have concluded that a student’s socioeconomic status has a strong influence on their academic performance (Dahl & Lochner, XXXXXXXXXXOthers suggest that financial incentives for teachers do not improve student performance (Fryer, 2011) or that smaller class size for lower grade levels can improve test scores (Betts, Zau & Rice, XXXXXXXXXXThis project will examine selected characteristics of Pennsylvania school districts to determine their effect on standardized test scores.
Data
All data for this study is from publicly available sources. PSSA scores, average daily membership, student ethnicity data, teacher compensation, attendance rate, district staffing levels, market value and district expense amounts for the school year 2012 were taken from the Pennsylvania Department of Education website. Adult education level, marriage rate and median family income by school district are from the National Center for Education Statistics (NCES). Where available, the most recent available school year’s measurements were used. PSSA scores are for Math, Reading and Science for 11th and 4th grades from the XXXXXXXXXXschool year. District staffing and expense information were from the XXXXXXXXXXschool year and due to a change in state reporting methods, the most recent attendance information available was from XXXXXXXXXXOnly public schools were included in this analysis because consistent reporting was not available for many charter schools. Three school districts were excluded because they did not have scores available for all of the subjects and grade levels that were analyzed.
Some factors in the analysis were calculated based on a combination of data fields. Any factor that references a rate based on number of students is based on the average daily membership number for the district. Proficiency rates were calculated by combining the proficient and advanced scoring groups in the PSSA data. The values for average years of adult education were calculated based on groupings in NCES data.
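The derived fields described above can be sketched in a few lines of Python. This is only an illustration: it assumes the PSSA file reports student counts in the four performance levels (advanced, proficient, basic, below basic), and the function and parameter names are hypothetical, not taken from the JMP data table.

```python
def proficiency_rate(advanced, proficient, basic, below_basic):
    """Percentage of tested students scoring proficient or advanced.

    Assumes the inputs are student counts per performance level.
    """
    tested = advanced + proficient + basic + below_basic
    if tested == 0:
        raise ValueError("no tested students")
    return 100.0 * (advanced + proficient) / tested


def per_student(district_total, average_daily_membership):
    """Express a district total (e.g. expenses) per student, using ADM as the denominator."""
    return district_total / average_daily_membership
```

For example, a district with 30 advanced, 45 proficient, 15 basic and 10 below basic students would have a proficiency rate of 75%.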
Analysis and Methodology
Using JMP, stepwise multiple linear regression was performed on the data set for each subject (math, reading, science) and grade level (11th, 4th). Independent variables were removed from the model if they did not meet at least a 95% significance level (p-value of XXXXXXXXXXSee Appendix A for detailed output for each model.
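The variable selection step can be illustrated outside JMP with a small pure-Python sketch. It makes several assumptions that the report does not confirm: it uses backward elimination only (the direction of the stepwise procedure is not stated), it approximates p-values with the normal distribution instead of the t distribution (reasonable only for large samples), and it uses alpha = 0.05 to stand for the stated 95% level. All function names are hypothetical.

```python
from statistics import NormalDist


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols_p_values(columns, y):
    """Fit y on an intercept plus the given columns; return a two-sided p-value per column."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    rss = sum((y[i] - sum(bj * xj for bj, xj in zip(beta, X[i]))) ** 2 for i in range(n))
    sigma2 = rss / (n - p)
    p_values = []
    for j in range(1, p):  # skip the intercept
        unit = [1.0 if k == j else 0.0 for k in range(p)]
        var_j = sigma2 * solve(xtx, unit)[j]  # j-th diagonal entry of sigma^2 (X'X)^-1
        z = beta[j] / var_j ** 0.5
        # Normal approximation to the t-test p-value (an assumption of this sketch).
        p_values.append(2.0 * (1.0 - NormalDist().cdf(abs(z))))
    return p_values


def backward_eliminate(predictors, y, alpha=0.05):
    """Drop the least significant predictor until every remaining p-value is below alpha."""
    kept = dict(predictors)
    while kept:
        names = list(kept)
        p_values = ols_p_values([kept[name] for name in names], y)
        worst = max(range(len(names)), key=lambda i: p_values[i])
        if p_values[worst] < alpha:
            break
        del kept[names[worst]]
    return list(kept)
```

Called with a dictionary mapping predictor names to columns, `backward_eliminate` returns the names that survive. JMP's own stepwise platform may add and remove terms in a different order, so the surviving set need not match the report's models.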
Each model was checked against the assumptions of multiple linear regression. See Appendix B for selected assumptions testing output. Residuals were tested for normality using the Shapiro-Wilk goodness of fit test. All sets of residuals failed this test and were shown not to be normally distributed. This failure might lead to attempts at non-parametric regression or other analysis methods, but those are outside the scope of techniques available for this project. It can be argued that normality of residuals is not always a critical assumption of multiple linear regression when evaluating the relationship between the dependent variable and the predictors. “Technically, the normal distribution assumption is not necessary if you are willing to assume the model equation is correct and your only goal is to estimate its coefficients and generate predictions in such a way as to minimize mean squared error.” (Nau, XXXXXXXXXXAnd in a 2013 article, Williams, Grajales & Kurkiewicz reiterate this point in saying that “the assumption of normally distributed errors is not required for multiple regression to provide regression coefficients that are unbiased and consistent, presuming that other assumptions are met. Further, as the sample size grows larger, inferences about coefficients will usually become more and more trustworthy.”
Residual by predicted plots were reviewed and showed fairly constant variance across all values in all instances, with no clear increasing or decreasing trend. To rule out the possibility of non-constant variance, a natural log transformation of the test scores was done and the models re-run. Virtually no difference in the residual by predicted plot shape was seen. Scatter plots for each combination of independent variable and dependent variable were reviewed to verify that the relationship was approximately linear. Most scatter plots reveal a general linear relationship; some were rather scattered or had a slight bend to them. Improving the linear relationships between the independent and dependent variables with transformations might improve the accuracy of these models, but was left out of scope for this project.
Data points were reviewed for outliers. High leverage points were defined as data points with a hat matrix value that exceeded 2k/n, where k is the number of terms in the model and n is the sample size. In each model, a number of high leverage points were found, but none could be demonstrated to be inaccurate or not representative of the population measured, so they remained as part of the population analyzed.
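To make the 2k/n rule concrete, here is a minimal Python sketch for the simplest case of one predictor plus an intercept, where the hat matrix diagonal has a closed form. The report's models have several predictors and were evaluated in JMP, so this only illustrates the cutoff; the function names are hypothetical, and treating k as 2 (intercept and slope) is an assumption of the sketch.

```python
def leverages(x):
    """Hat-matrix diagonal for a regression on an intercept and a single predictor x."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((xi - mean) ** 2 for xi in x)
    # Closed form for simple regression: h_i = 1/n + (x_i - mean)^2 / Sxx
    return [1.0 / n + (xi - mean) ** 2 / sxx for xi in x]


def high_leverage(x, k=2):
    """Indices of observations whose leverage exceeds 2k/n (k = number of model terms)."""
    cutoff = 2.0 * k / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]
```

With the made-up values `[1, 2, 3, 4, 10]`, only the last observation exceeds the cutoff of 0.8, because it sits far from the mean of the predictor.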
Variance inflation factors (VIF) for each of the six models were also reviewed to test for possible multicollinearity between predictor variables. A maximum VIF threshold of 5.0 was used. The only instance of a high VIF in all of the models was the “percentage of students from low income families” variable in the 4th grade math model. That variable has a correlation of 0.79 with median family income because they are both measures of income in the school district. The percentage of students from low income families is a more precise measurement of student socioeconomic status because it measures only the subset of the school district population that has students in school, whereas the median family income includes families that do not have school age children. For that reason, the median family income factor was deleted from the model, thus reducing the VIF for “percentage of students from low income families” below the threshold.
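The link between correlation and VIF can be checked with a short Python sketch. The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. Using only the pairwise correlation of 0.79 reported above gives a lower bound on that VIF, since the full model included other predictors as well. The function name is hypothetical.

```python
def vif_from_r_squared(r_squared):
    """VIF = 1 / (1 - R^2), with R^2 from regressing one predictor on the others."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("R^2 must be in [0, 1)")
    return 1.0 / (1.0 - r_squared)


# Lower bound implied by the pairwise correlation of 0.79 alone (about 2.66).
pairwise_vif = vif_from_r_squared(0.79 ** 2)

# The 5.0 threshold corresponds to an R^2 of 0.8 against the other predictors.
threshold_vif = vif_from_r_squared(0.8)
```

Because the pairwise value is well under 5.0, a VIF above the threshold in the original model would imply that the remaining predictors also overlapped with the low income variable to some degree.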
A chart of the six models that came from this analysis and their applicable significant factors is below.
The core model equations are as follows:
% of students proficient or better in 11th Grade Math
% of students proficient or better in 4th Grade Math
% of students proficient or better in 11th Grade Reading
% of students proficient or better in 4th Grade Reading
% of students proficient or better in 11th Grade Science
% of students proficient or better in 4th Grade Science
In addition, we did some exploratory analysis to see if a more accurate model could be derived by deleting outliers from the data set. These models proved to have better fits, but carving out a large portion of data without any evidence of inaccurate collection methods or an underlying reason for why they are outliers does not make sense in social science applications. The presence of so many outliers is an indication of the need for additional data collection and research into the dynamics of the situation.
Model Adequacy, Shortcomings, Additional Collection and Analysis
The adjusted R-squared values for our models ranged from 0.51 to 0.68, so there is room for additional data collection to explain more of the variation in each response variable.
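For readers unfamiliar with the statistic, adjusted R-squared penalizes ordinary R-squared for the number of predictors in the model. A minimal Python sketch of the standard formula follows; the function name is hypothetical, and the inputs in the usage note are made-up numbers, not values from this analysis.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R^2 for n observations and k predictors (intercept not counted in k)."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than model terms")
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - k - 1)
```

For instance, an R-squared of 0.50 with 101 observations and 5 predictors adjusts down to roughly 0.47.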
In addition, ethnic and racial influence on learning has been shown to result not from a group’s capacity to learn but from a combination of cultural and economic factors. Collecting more descriptive data for students in each district related to those factors could improve the model. This model does not consider qualitative classroom decisions as a reason for variation in testing outcomes. Data for each district related to teaching methods used, class scheduling and other administrative decisions could be collected and added to the analysis. Additionally, a data research request could be submitted to the Pennsylvania Department of Education to gain access to more granular data including de-identified teacher and student data. Another opportunity for further study is to examine how test scores have changed in relation to the changes in each of the predictor factors over time. That type of analysis might allow for some explanation of how both funding and district administrative decisions affect testing performance.
An additional analysis would be to predict the scores for the XXXXXXXXXXschool year using the models and measure the accuracy. However, data for that school year was not available. In 2012 Pennsylvania began a slow multi-year transition from the longstanding PSSA tests to Keystone exams that have different standards that align with federal Common Core Standards. PSSA scores for all districts at a granular detail in a single data file stopped being made available in 2012 after seventeen years of consistent public reporting. This change in transparency in an election year amidst declining aggregate state scores drew criticism from both state legislators and academic policy experts.
Discussion
Despite being the focus of recent election debates, district spending levels in Pennsylvania were not identified as a significant factor in the variation of PSSA test scores in any subject for grades four and eleven. Percentage of non-white students and percentage of students from a low income family were the two school district characteristics that tested as significant for all subjects and grade levels analyzed. One way to look at this analysis is as a starting point for policy initiatives that would improve test results in low performing schools. Among the variety of effective tactics that research has shown to raise the performance of students from low income and minority backgrounds are introducing culturally relevant learning material, increased parental support (Museus et al., 2011), and nutrition programs (Hollar et al., 2010 and Rausch, 2013).
References
G. Dahl & L. Lochner, 2005. The Impact of Family Income on Child Achievement, Institute for Research on Poverty Discussion Paper, http://www.irp.wisc.edu/publications/dps/pdfs/dp130505.pdf
Roland G. Fryer, 2011. Teacher Incentives and Student Achievement: Evidence from New York City Public Schools, National Bureau of Economic Research Working Papers, http://www.nber.org/papers/w16850.pdf
J. Betts, A. Zau & L. Rice, 2003. Determinants of Student Achievement: New Evidence from San Diego, The Public Policy Institute of California, http://epsl.asu.edu/epru/articles/EPRU XXXXXXXXXXOWI.pdf
Robert Nau, 2014. Regression Diagnostics: Testing the Assumptions of Linear Regression, http://people.duke.edu/~rnau/testing.htm
Williams, Grajales & Kurkiewicz, Sept XXXXXXXXXXAssumptions of Multiple Regression: Correcting Two Misconceptions, Practical Assessment, Research & Evaluation, Volume 18, Number 11
Dale Mezzacappa, Kevin McCorry, and Paul Socolar, 2014 October 30. Election near, but still no 2014 Pa. test scores; 2013 results showed a downward trend, http://thenotebook.org/blog/147882/election-near-still-no-pa-test-results-2013-scores-show-downward-trend
Samuel Museus, Robert T. Palmer, Ryan J. Davis & Dina C. Maramba, 2011. Racial and ethnic minority students' success in STEM education, Hoboken, New Jersey: Jossey-Bass, http://works.bepress.com/robert_palmer/32
Danielle Hollar, Michelle Lombardo, Gabriella Lopez-Mitnik, Theodore L. Hollar, Marie Almon, Arthur S. Agatston, Sarah E. Messiah, May 2010. Effective Multi-level, Multi-sector, School-based Obesity Prevention Programming Improves Weight, Blood Pressure, and Academic Performance, Especially among Low-Income, Minority Children, Journal of Health Care for the Poor and Underserved, Volume 21, Number 2, pp XXXXXXXXXX, http://muse.jhu.edu/journals/hpu/summary/v021/21.2A.hollar.html
Rita Rausch, 2013. Nutrition and Academic Performance in School-Age Children: The Relation to Obesity and Food Insufficiency, Journal of Nutrition and Food Sciences, Volume 3, page 190
APPENDICES
Appendix A – Model Results
JMP Data table file (with scripts for core model) is attached
A.1: core model results
A.2 - Analysis models created by deleting high leverage data points
The high leverage value is (2k/n) = XXXXXXXXXX
K: means 6 independent variables
n: