STAT200: Written Assignment #1 - Descriptive Statistics Data Analysis Plan - Instructions
STAT200 Introduction to Statistics
Assignment #1: Descriptive Statistics Data Analysis Plan
Assignment #1: Prepare Descriptive Statistics Data Analysis Plan
Before conducting any statistical analyses, researchers develop a plan for how they will analyze their
data to answer their research questions. The purpose of this assignment is to provide an experience
developing a descriptive statistics analysis plan. Note: This first assignment is a plan only; no statistics
will be calculated or graphs created. The second assignment will involve carrying out the plan, after
receiving feedback from your instructor.
Assignment Steps:
Step #1: Review the STAT200 data set file. (Note: This data set will be used for all three of this term’s
written assignments).
The data is a subsample from the US Department of Labor’s Consumer Expenditure Surveys (CE) and
provides information about the composition of households and their annual expenditures
(https://www.bls.gov/cex/). Detailed information on the sample and variables is included with the data
set file; please carefully review this information to familiarize yourself with the data (Note: This
information will be used in Assignment #2 to describe the dataset).
Step #2: Develop descriptive statistics data analysis plan.
➢ Task 1: Develop scenario. Imagine that you are head of a household and have to determine a
household budget plan based on the data available from the dataset. For instance, you are a 35
year old single parent with a high school diploma and one child.
➢ Task 2: Select variables for analysis that match the scenario developed in Task 1. The data set
provides information on household consumption; there are socioeconomic variables and
expenditure variables. The socioeconomic variable names start with “SE-” and the expenditure
variable names start with “USD-”; all expenditures are in US dollars. All students must use
income as one variable. Select two additional socioeconomic variables (one qualitative and one
quantitative) and two expenditures for your analysis that match the scenario you developed for
Task 1. For instance, using the example scenario of a 35 year old single parent with a high
school diploma and one child, you could select “income,” “education,” and “number of children”
as socioeconomic variables and then pick two household expenditure items to show the
distribution of costs and compare that with your income. When selecting variables, think about
the following three questions:
o Why am I choosing these variables?
o What interests me about these variables?
o What do I think will be the outcome?
➢ Task 3: Determine appropriate measures of central tendency and dispersion for the selected
variables. For each quantitative variable, select at least one measure of central tendency and at
least one measure of dispersion (see the table below for a list of measures). For the qualitative
variable, select one measure of central tendency. When determining the measures of central
tendency and dispersion, think about what is appropriate given the level of measurement and
type of variable. We recommend referring to the text and information posted in our LEO classroom
to help with this task (Note: you will use this information to provide a rationale for your choice
of measures).
Measures of Central Tendency    Measures of Dispersion
● Mean                          ● Range
● Mode                          ● Sample Standard Deviation
● Median                        ● Variance
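Although no statistics are calculated for this assignment, it may help to see what each listed measure produces when writing your rationale. The following is a minimal Python sketch, not part of the assignment requirements. It uses the SE-Income values of households 1-5 and the SE-MaritalStatus values of households 13-17 from the data set; the choice of these rows and all variable names in the code are illustrative assumptions.

```python
import statistics

# SE-Income for households 1-5 of the data set (US dollars).
incomes = [95744, 95432, 96727, 96621, 96572]

# Measures of central tendency for a quantitative variable.
mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

# Measures of dispersion for a quantitative variable.
income_range = max(incomes) - min(incomes)
sample_variance = statistics.variance(incomes)  # sample variance, divides by n - 1
sample_sd = statistics.stdev(incomes)           # square root of the sample variance

# SE-MaritalStatus for households 13-17. For a qualitative variable,
# the mode is the only measure in the list that applies.
marital_status = ["Not Married", "Not Married", "Not Married", "Married", "Married"]
modal_status = statistics.mode(marital_status)

print("mean:", mean_income)
print("median:", median_income)
print("range:", income_range)
print("sample variance:", round(sample_variance, 1))
print("sample standard deviation:", round(sample_sd, 1))
print("mode of marital status:", modal_status)
```

Note that the mean and standard deviation require numeric data, which is why only the mode is computed for the marital status values.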
➢ Task 4: Determine appropriate graph and/or table for each of the selected variables. Select
one graph or table for each variable (see the table below for a list of graphs and tables). When
determining the graphs and tables, think about what is appropriate given the level of
measurement and type of variable. We recommend referring to the text and information posted in
our LEO classroom to help with this task (Note: you will use this information to provide a
rationale for your choice of graphs and/or tables).
Types of Graphs                                         Types of Tables
● Pie Chart                                             ● Frequency Table
● Bar Chart                                             ● Relative Frequency Table
● Histogram                                             ● Grouped Frequency Table
● Box Plots (also known as Box-and-Whiskers Plot)
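To make the three table types concrete, here is a small Python sketch, offered only as an illustration and not as required work for this assignment. It uses the SE-FamilySize and SE-AgeHeadHousehold values of households 1-10 from the data set; the 10-year bin width and all names in the code are assumptions chosen for the example.

```python
from collections import Counter

# SE-FamilySize for households 1-10 of the data set.
family_sizes = [4, 1, 2, 2, 2, 3, 2, 4, 1, 3]

# Frequency table: how many households have each family size.
frequency = Counter(family_sizes)

# Relative frequency table: each count divided by the number of households.
n = len(family_sizes)
relative_frequency = {size: count / n for size, count in frequency.items()}

# SE-AgeHeadHousehold for households 1-10 of the data set.
ages = [52, 51, 39, 54, 59, 40, 49, 51, 49, 43]

# Grouped frequency table: ages are binned into classes of equal width.
BIN_WIDTH = 10  # assumed class width, chosen for illustration
grouped = Counter((age // BIN_WIDTH) * BIN_WIDTH for age in ages)

print("Family size, frequency, relative frequency")
for size in sorted(frequency):
    print(size, frequency[size], relative_frequency[size])

print("Age class, frequency")
for start in sorted(grouped):
    print(f"{start}-{start + BIN_WIDTH - 1}", grouped[start])
```

A variable with few distinct values, such as family size, fits an ungrouped frequency table, while a variable with many distinct values, such as age or income, usually needs grouping into classes first.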
Step #3: Complete the “Assignment #1: Descriptive Statistics Data Analysis Plan Template.”
Remember, you will not be conducting any statistical analysis, drawing any graphs, or compiling any
tables for the first assignment. Rather, you need to wait for feedback from your instructor on this
assignment and use that feedback to complete Assignment #2.
Here are the main sections for this assignment (i.e., completing the plan template):
✓ Identifying Information. Fill in information on name, class, instructor, and date.
✓ Scenario. In this section, briefly (2-3 sentences) describe the scenario you developed in Step #2,
Task 1.
✓ Complete Table 1: Variables Selected for the Analysis. Enter information on the variables selected
for analysis in Step #2, Task 2. For each selected variable be sure to include its: name as listed in
the data set, description, and variable type.
✓ Reason(s) for Selecting the Variables and Expected Outcome(s): In this section, for each
selected variable, please answer the following questions:
✓ Why did I choose this variable?
✓ What interests me about this variable?
✓ What do I think will be the outcome?
✓ Complete Table 2. Numerical Summaries of the Selected Variables. Enter information on
selected measures of central tendency and dispersion for each selected variable. Be sure to
briefly explain why you chose those measures. Note: The information for the required
variable, “Income,” has already been completed and can be used as a guide for completing
information on the remaining variables.
✓ Complete Table 3. Type of Graphs and/or Tables for Selected Variables. Enter information on
selected graph and/or table for each selected variable. Be sure to briefly explain why you
chose those graphs and/or tables. Note: The information for the required variable, “Income,” has
already been completed and can be used as a guide for completing information on the
remaining variables.
Assignment Submission: Name the file that contains your completed “Assignment #1: Descriptive
Statistics Data Analysis Plan Template” using the following format: “Assignment1-StudentLastName.”
Then, submit the file via the Assignments area in the LEO classroom in the “Assignment #1: Descriptive
Statistics Data Analysis Plan” folder and wait for your instructor’s feedback.
Grading Rubric for Written Assignment #1
Scenario and Selection of Related Variables (20%)
● Clear description of scenario.
● Selected variables and reasons are appropriate for the scenario.
Selection of Measures of Central Tendency and Dispersion (30%)
For each variable:
● Appropriate measures selected.
● Rationale is provided and appropriate.
Selection of Graphs and/or Tables (30%)
For each variable:
● Appropriate graphs and/or tables selected.
● Rationale is provided and appropriate.
Writing Quality (20%)
● Completes all sections of template.
● Writes clearly, concisely, and with few errors.
STAT200 Introduction to Statistics
Dataset for Written Assignments
Description of Dataset:
The data is a random sample from the US Department of Labor’s 2016 Consumer Expenditure Surveys (CE) and provides information about the composition of households and their annual expenditures (https://www.bls.gov/cex/). It contains information from 30 households, where a survey responder provided the requested information; it is all self-reported information. This dataset contains four socioeconomic variables (whose names start with SE) and four expenditure variables (whose names start with USD).
Description of Variables/Data Dictionary:
The following table is a data dictionary that describes the variables and their locations in this dataset (Note: Dataset is on second page of this document):
Variable Name           Location in Dataset  Variable Description                                         Coding
UniqueID#               First Column         Unique number used to identify each survey responder         Each responder has a unique number from 1-30
SE-MaritalStatus        Second Column        Marital Status of Head of Household                          Not Married/Married
SE-Income               Third Column         Annual Household Income                                      Amount in US Dollars
SE-AgeHeadHousehold     Fourth Column        Age of the Head of Household                                 Age in Years
SE-FamilySize           Fifth Column         Total Number of People in Family (Both Adults and Children)  Number of People in Family
USD-AnnualExpenditures  Sixth Column         Total Amount of Annual Expenditures                          Amount in US Dollars
USD-Food                Seventh Column       Total Amount of Annual Expenditure on Food                   Amount in US Dollars
USD-Entertainment       Eighth Column        Total Amount of Annual Expenditure on Entertainment          Amount in US Dollars
USD-Education           Ninth Column         Total Amount of Annual Expenditure on Education              Amount in US Dollars
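The selection rule from Assignment #1 (income, plus one qualitative and one quantitative socioeconomic variable, plus two expenditures) can be checked against this data dictionary. The Python sketch below is only an illustration: `check_selection` and `VARIABLE_TYPES` are hypothetical helpers, and the qualitative/quantitative labels are inferred from the Coding column, so confirm them against your course materials.

```python
# Variable types inferred from the Coding column of the data dictionary
# (an assumption made for this illustration).
VARIABLE_TYPES = {
    "SE-MaritalStatus": "qualitative",
    "SE-Income": "quantitative",
    "SE-AgeHeadHousehold": "quantitative",
    "SE-FamilySize": "quantitative",
    "USD-AnnualExpenditures": "quantitative",
    "USD-Food": "quantitative",
    "USD-Entertainment": "quantitative",
    "USD-Education": "quantitative",
}


def check_selection(selected):
    """Return a list of problems with a variable selection (empty if none)."""
    problems = []
    unknown = [name for name in selected if name not in VARIABLE_TYPES]
    if unknown:
        problems.append(f"unknown variables: {unknown}")
    if len(set(selected)) != len(selected):
        problems.append("a variable is listed more than once")
    if "SE-Income" not in selected:
        problems.append("SE-Income is required")

    known = [name for name in selected if name in VARIABLE_TYPES]
    other_se = [n for n in known if n.startswith("SE-") and n != "SE-Income"]
    qualitative = [n for n in other_se if VARIABLE_TYPES[n] == "qualitative"]
    quantitative = [n for n in other_se if VARIABLE_TYPES[n] == "quantitative"]
    expenditures = [n for n in known if n.startswith("USD-")]

    if len(qualitative) != 1:
        problems.append("select exactly one additional qualitative SE variable")
    if len(quantitative) != 1:
        problems.append("select exactly one additional quantitative SE variable")
    if len(expenditures) != 2:
        problems.append("select exactly two expenditure variables")
    return problems


example = ["SE-Income", "SE-MaritalStatus", "SE-FamilySize", "USD-Food", "USD-Education"]
print(check_selection(example))
```

Because SE-MaritalStatus is the only variable coded as categories, a check like this suggests it is the only available choice for the qualitative socioeconomic variable.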
How to read the data set: Each row contains information from one household. For instance, the first row of the dataset starting on the next page shows us that: the head of household is not married and is 52 years old, has an annual household income of $95,744, a family size of 4, annual expenditures of $55,963, and spends $7,040 on food, $105 on entertainment, and $340 on education.
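The same reading of the first row can be sketched in Python. This is an illustrative example only: it assumes the header and first row have been typed out as comma-separated text, which may not match the format in which the data set is actually distributed, and the names `sample_csv` and `household` are hypothetical.

```python
import csv
import io

# Header and first data row, typed out as CSV text for illustration.
# Assumption: the real data set file may use a different format.
sample_csv = (
    "UniqueID#,SE-MaritalStatus,SE-Income,SE-AgeHeadHousehold,SE-FamilySize,"
    "USD-AnnualExpenditures,USD-Food,USD-Entertainment,USD-Education\n"
    "1,Not Married,95744,52,4,55963,7040,105,340\n"
)

# DictReader pairs each value with the column name from the header row.
rows = list(csv.DictReader(io.StringIO(sample_csv)))
first = rows[0]

# Every column except marital status holds a whole number.
household = {
    name: value if name == "SE-MaritalStatus" else int(value)
    for name, value in first.items()
}

print("Marital status:", household["SE-MaritalStatus"])
print("Age of head of household:", household["SE-AgeHeadHousehold"])
print("Annual household income (USD):", household["SE-Income"])
print("Family size:", household["SE-FamilySize"])
print("Annual expenditures (USD):", household["USD-AnnualExpenditures"])
print("Food (USD):", household["USD-Food"])
print("Entertainment (USD):", household["USD-Entertainment"])
print("Education (USD):", household["USD-Education"])
```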
UniqueID#  SE-MaritalStatus  SE-Income  SE-AgeHeadHousehold  SE-FamilySize  USD-AnnualExpenditures  USD-Food  USD-Entertainment  USD-Education
1          Not Married       95744      52                   4              55963                   7040      105                340
2          Not Married       95432      51                   1              55120                   7089      84                 274
3          Not Married       96727      39                   2              56440                   7051      93                 222
4          Not Married       96621      54                   2              55746                   7000      106                322
5          Not Married       96572      59                   2              56515                   7179      95                 349
6          Not Married       98717      40                   3              56393                   7036      106                213
7          Not Married       96697      49                   2              56453                   6971      86                 186
8          Not Married       96653      51                   4              56488                   6943      90                 212
9          Not Married       97912      49                   1              55704                   6937      97                 277
10         Not Married       96928      43                   3              55932                   6953      105                273
11         Not Married       96244      56                   4              56051                   7073      114                261
12         Not Married       97681      53                   4              56124                   7097      108                263
13         Not Married       96522      43                   4              56152                   6991      101                237
14         Not Married       95366      48                   2              57082                   7130      90                 305
15         Not Married       94867      60                   1              55512                   6935      87                 272
16         Married           108781     52                   5              83231                   11795     178                689
17         Married           95706      52                   4              71597                   8925      121                475
18         Married           106627     56                   3              82676                   10363     179                794
19         Married           97303      27                   4              76134                   8634      22                 16
20         Married           97663      51                   3              72971                   9294      108                454
21         Married           100947     35                   4              73973                   8455      96                 2
22         Married           100837     42                   5              73849                   8633      209                53
23         Married           93901      43                   2              74254                   9157      132                468
24         Married           100964     28                   2              77744                   9397      174                11
25         Married           95385      50                   4              74110                   9101      119                454
26
Married
95994
55
4