
Motivation 
The first step in the statistical process involves asking a research question. Some questions can be answered with a statistical method from Math 361; others require a more advanced statistical technique. Many questions are not suitable for statistical analysis at all, being better suited to philosophy, mathematics, direct experimentation, etc.
 
Instructions 
Can a method from our “methods” table be used to answer the research questions? Think about the number and type of variables that would need to be collected to answer each question. Then see if these match a column of the “methods” table. If yes, decide whether a statistic or graph would suffice or whether you should use a confidence interval or test of significance. If you choose a test, write the null and alternative hypotheses for the question. Your answers for each of the three questions are worth 8 points, for a total of 24 points.
Relevant Class Material  
"methods" table, variables, observational units, types of variable 
 
SECTION 1 (24 points) 
Fill in the table.  Some cells may be left empty depending on your answer to the first question.   
| Question | Possible answers | Do a higher proportion of people live in rural areas now than before Covid? | What proportion of people in Klamath Falls are morbidly obese? | On average, how much do students spend on textbooks a term? |
|---|---|---|---|---|
| Can a method from this class be used? If yes, answer the following questions. If not, briefly explain how the question should be answered without statistics. | Yes or no | | | |
| What variable(s) need to be collected to answer this question? | | | | |
| What is the type of each variable? | Numerical or binary categorical | | | |
| What observational unit(s) could the variable(s) be collected on? | | | | |
| Assuming a sample is used, state the population | | | | |
| Would a test or a confidence interval be more appropriate for this question? | Confidence Interval or Test? | | | |
| State the name of the most appropriate approximate method from the "methods" table | t or z? One sample or two sample? Paired or independent (if two sample)? | | | |
| State the name of the type of graph you recommended for the variable(s) | Boxplot(s), barplot, histogram(s), scatterplot, or stacked barplot? | | | |

SECTION 2:
The second step in the statistical process is to make a plan for collecting data, analyzing data and making a conclusion from the data. Careful planning can reduce biases in estimation due to data collection or study design.   
 
In this section you will create a plan to answer the research question, "Does serving in student government during high school lead to higher earnings at age 30 for people in the US?" 
 
There are 11 questions here, each worth 2 points with the exception of question 3. 
 
 
1. What is the population in the research question?
 
 
2. What variable(s) must be collected to answer the research question?
 
 
 
 
 
3. (4 points) Frame the research question as null and alternative hypotheses with an appropriate parameter. Write both hypotheses using appropriate symbols. (In OneNote, you can "insert" a "symbol" to obtain π or µ.)
 
 
 
 
 
4. Briefly explain why it is not feasible to collect data via a simple random sample in order to answer the research question.
 
 
 
 
 
 
5. Is it feasible to perform a randomized controlled trial (RCT) to answer this research question? Briefly explain your reasoning.
 
 
 
 
 
6. Which of these three possible explanations is most directly addressed by the computation of a p-value for the null and alternative hypotheses of question 3?
 
 
Choose 1:  causal effect                            chance                             confounding variable 
 
 
 
7. Identify a possible confounding variable in this study and briefly explain how it relates to both student government participation and earnings at age 30.
 
 
 
 
 
8. What is a Type I error in the context of this research question?

9. Once the data is collected on your response and treatment variables, you will be ready to do the data analysis.
Sketch or insert a table of how you plan to summarize the dataset you collect (means or medians, standard deviation or MAD…). Include pretend numbers and be sure to label the columns and rows.
 
 
 
 
 
 
10. Once the data is collected on your response and treatment variables, you will be ready to do the data analysis.
Which inferential method do you plan to use?
 
 
 
11. The last step in the plan is to decide how you will form a conclusion based on the data analysis. The p-value from a test tells you the probability of seeing a difference at least as extreme as the difference in your dataset, assuming the null hypothesis is true. Choose one option below and briefly outline your conclusions under the following scenarios:
 
I choose option ___ 
 
Option 1: Ignore the potential for confounding bias and choose between "by chance" and "causal effect" via a p-value as follows:
 
    If the p-value is less than ______, I will conclude that _____________ 
    If the p-value is above _________, I will conclude that ____________ 
 
Option 2: Decide the potential for confounding bias is so extreme that it is not worthwhile to do a test using only the response and treatment variables. In this case, briefly explain how the method of subclassification could be used to adjust this plan to account for the confounding variable you are worried about:
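
The subclassification idea in Option 2 can be illustrated with a rough sketch: split the sample into strata of the suspected confounder, compare the served and not-served groups within each stratum, and average those within-stratum differences weighted by stratum size. Everything here is an assumption made for illustration: the confounder (family income), the record layout, the function names and all the numbers are pretend, and Python is used only for the sketch (the course itself points to R, Jamovi or Excel).

```python
from collections import defaultdict
from statistics import mean

# Pretend records: (family-income stratum, served in student government?,
# earnings at age 30 in thousands of dollars). All values are made up.
records = [
    ("lower", True, 42), ("lower", True, 44),
    ("lower", False, 40), ("lower", False, 42),
    ("lower", False, 41), ("lower", False, 41),
    ("higher", True, 70), ("higher", True, 72),
    ("higher", True, 74), ("higher", True, 72),
    ("higher", False, 70), ("higher", False, 70),
]

def naive_difference(rows):
    """Difference in mean earnings (served minus not served), ignoring strata."""
    served = [e for _, s, e in rows if s]
    not_served = [e for _, s, e in rows if not s]
    return mean(served) - mean(not_served)

def subclassified_difference(rows):
    """Size-weighted average of the within-stratum differences in means."""
    strata = defaultdict(lambda: {True: [], False: []})
    for stratum, served, earnings in rows:
        strata[stratum][served].append(earnings)
    weighted_sum, total = 0.0, 0
    for groups in strata.values():
        if not groups[True] or not groups[False]:
            continue  # a stratum needs both groups to give a comparison
        n = len(groups[True]) + len(groups[False])
        weighted_sum += n * (mean(groups[True]) - mean(groups[False]))
        total += n
    if total == 0:
        raise ValueError("no stratum contains both groups")
    return weighted_sum / total

naive = naive_difference(records)
adjusted = subclassified_difference(records)
print(f"naive difference: {naive:.2f}, subclassified difference: {adjusted:.2f}")
```

With these pretend numbers the naive difference is about 11.7 (thousand dollars), while the subclassified difference is 2.0, because most of the naive gap comes from the served group being concentrated in the higher-income stratum.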
SECTION 3:
The "parks.csv" dataset in Canvas contains information on all recorded visits to National Parks. This (and much more) is available for public download here: STATS - National Reports (nps.gov). The parks.csv dataset has the annual visits in 2018 and 2019 by type of visit. The types are:
    Recreational visits (RV) 
    Non-recreational visits (NRV) 
    Concessioner Lodging (CL) 
    Concessioner Camping (CC) 
    Tent overnights (TO) 
    RV overnights (RVO) 
    Backcountry overnights (BO) 
    Non-recreational overnights (N) 
    Misc. overnights (MO)
EXCEL FILE ATTACHED FOR THIS SECTION
Here's a blank code notebook if you're using R:

https://colab.research.google.com/drive/1cFMAOpl1c3HfbhvIDCBxFwitIh495zgt?usp=sharing
You will likely find the R code helpful.
 
 
Use the dataset "parks.csv" to answer the following questions. 
 
1. (2 points) How many parks are included in the dataset?
 
 
 
 
2. (4 points) In 2019, what was the largest number of Backcountry overnight visits to a single park? Which park was it?
 
 
 
 
3. (6 points) Create a subset of the dataset for parks with a non-zero number of Backcountry overnights in 2019. Briefly describe the distribution of Backcountry overnights in 2019 by computing a measure of spread, a measure of center and the number of parks with non-zero Backcountry Overnight visits. 
 
Number: 
Center:
Spread: 
4. (6 points) Using the full dataset, compute the difference between the number of Backcountry overnights in 2018 and 2019 for each park. Create an appropriate graph of this variable and write a sentence or two describing what you learned about its distribution. Include a screenshot of your graph and comment on center, shape, spread and any outliers, as appropriate.
 
 
 
 
 
 
 
 
 
5. (6 points) Review the list of available park names, e.g. by opening "parks.csv" in Excel.  Choose a park and a variable you're interested in and compare the variable's value for your park with the distribution of that variable for all parks.  For example, compute the median and MAD for the variable for all parks and say how your park compares to these values.  Write a sentence describing what you learned. 
Include a printout of your R code or Jamovi screenshot or Excel spreadsheet below: 
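
The computations asked for in questions 1 to 5 can be sketched as below. This is only an illustration under stated assumptions: the rows, park names and column names (`park`, `BO2018`, `BO2019`) are hypothetical stand-ins rather than the real contents of "parks.csv", MAD is taken to mean the median absolute deviation, the difference is taken as 2019 minus 2018, and Python is used for the sketch even though the assignment suggests R, Jamovi or Excel.

```python
from statistics import median

# Hypothetical rows standing in for parks.csv; names and counts are made up.
parks = [
    {"park": "Park A", "BO2018": 0, "BO2019": 0},
    {"park": "Park B", "BO2018": 120, "BO2019": 150},
    {"park": "Park C", "BO2018": 900, "BO2019": 700},
    {"park": "Park D", "BO2018": 40, "BO2019": 60},
    {"park": "Park E", "BO2018": 0, "BO2019": 10},
]

def mad(values):
    """Median absolute deviation from the median (assumed meaning of MAD)."""
    values = list(values)
    center = median(values)
    return median([abs(v - center) for v in values])

# Question 1: number of parks (rows) in the dataset.
n_parks = len(parks)

# Question 2: park with the largest number of Backcountry overnights in 2019.
busiest = max(parks, key=lambda row: row["BO2019"])

# Question 3: subset with non-zero Backcountry overnights in 2019,
# summarized by count, center (median) and spread (MAD).
nonzero = [row["BO2019"] for row in parks if row["BO2019"] > 0]
n_nonzero = len(nonzero)
center = median(nonzero)
spread = mad(nonzero)

# Question 4: per-park change, taken here as 2019 minus 2018.
diffs = [row["BO2019"] - row["BO2018"] for row in parks]

# Question 5: compare one chosen park with the distribution for all parks.
all_2019 = [row["BO2019"] for row in parks]
chosen = parks[3]
gap_in_mads = (chosen["BO2019"] - median(all_2019)) / mad(all_2019)

print(n_parks, busiest["park"], busiest["BO2019"], n_nonzero, center, spread)
print(diffs, gap_in_mads)
```

With the real file, the rows could instead be read with `csv.DictReader` after checking the actual column names in "parks.csv"; the values would then need converting from strings to numbers before summarizing.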
Answered 2 days After Mar 14, 2022

Solution

Suraj answered on Mar 16 2022
Section 1:
| Question | Possible answers | Do a higher proportion of people live in rural areas now than before Covid? | What proportion of people in Klamath Falls are morbidly obese? | On average, how much do students spend on textbooks a term? |
|---|---|---|---|---|
| Can a method from this class be used? If yes, answer the following questions. If not, briefly explain how the question should be answered without statistics. | Yes or no | Yes | Yes | Yes |
| What variable(s) need to be collected to answer this question? | | Whether each person lives in a rural area, recorded before and after Covid | Whether each person is morbidly obese | Amount each student spends on textbooks in a term |
| What is the type of each variable? | Numerical or binary categorical | Binary categorical | Binary categorical | Numerical |
| What observational unit(s) could the variable(s) be collected on? | | People | People living in Klamath Falls | Students |
| Assuming a sample is used, state the population | | All people in the region of interest, before and after Covid | People of Klamath Falls | Students of a school |
| Would a test or a confidence interval be more appropriate for this question? | Confidence Interval or Test? | Test | Confidence interval | Confidence interval |
| State the name of the most appropriate approximate method from the "methods" table | t or z? One sample or two sample? Paired or independent (if two sample)? | Two-sample z test for proportions (independent samples) | One-sample z interval for a proportion | One-sample t interval for a mean |
| State the name of the type of graph you recommended for the variable(s) | Boxplot(s), barplot, histogram(s), scatterplot, or stacked barplot? | Stacked barplot | Barplot | Boxplot |

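
As a rough illustration of the two-sample proportion z test named for the first question, the sketch below computes the pooled z statistic and a one-sided p-value for H0: π_now = π_before against Ha: π_now > π_before. The function name and the counts are made up for illustration, and the normal approximation is assumed to be appropriate (large samples); Python is used only for the sketch.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z test for H0: pi1 = pi2 vs Ha: pi1 > pi2.

    x1, x2 are the counts of "yes" outcomes; n1, n2 are the sample sizes.
    Returns the z statistic and the upper-tail (one-sided) p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # P(Z > z) for a standard normal, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value

# Pretend counts: 230 of 1000 sampled people live in rural areas now,
# versus 200 of 1000 sampled before Covid.
z, p_value = two_proportion_z_test(230, 1000, 200, 1000)
print(f"z = {z:.2f}, one-sided p-value = {p_value:.3f}")
```

A small p-value would count as evidence that the rural proportion is higher now; with these pretend counts the p-value comes out close to 0.05, so the conclusion would hinge on the significance level chosen in advance.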
Section 2:
The second step in the statistical process is to make a plan for collecting data, analyzing data and making a conclusion from the data. Careful planning can reduce biases in estimation due to data collection or study design.   
 
In this section you will create a plan to answer the research question, "Does serving in student government during high school lead to higher earnings at age 30 for people in the US?"
 
There are 11 questions here, each worth 2 points with the exception of question 3. 
 
 
1.
What is the population in the research question? 
Solution:
The population for this research question is all people in the US who attended high school, considered at age 30. It includes both those who served in student government and those who did not.
 
2.
What variable(s) must be collected to answer the research question? 
Solution:
Two variables need to be collected for each person: whether they served in student government during high school (binary categorical, the treatment variable) and their earnings at age 30 (numerical, the response variable).
 
3.
(4 points) Frame the research question as null and alternative hypotheses with an appropriate parameter.  Write both hypotheses using appropriate symbols. (In OneNote, you can...