This project gives you the chance to use the statistical techniques and concepts covered in Chapters...

Question

This project gives you the chance to use the statistical techniques and concepts covered in Chapters 7-9, and Chapter 14.

Overview: The U.S. Department of Agriculture maintains a nutrient database for 100 nutrients of over 7300 foods. As part of this assignment, there is an Excel file in Canvas of cereal data from the 2004 USDA National Nutrient Database for Standard Reference, Release 17.* The dataset contains the fat, calorie, carbohydrate, fiber, and sugar data for all 278 cereal items in the nutrient database (at that time) that contain standard serving size information. From this data, you will ESTIMATE a population mean and determine an appropriate sample size. With the help of Excel, you can then take a random sample, test a hypothesis, calculate a confidence interval, and compare the results with the actual population data. You will also investigate the correlation between fat and calories.

Data Collection: In the files library in Canvas and attached to the assignment writeup is the Excel file for this project. It is called “Copy of cereal_usda_data_stat_proj2.xls”. Although the file can be considered a population, you MUST NOT use it in its entirety until part 4! POPULATION DATA IS ONLY TO BE USED FOR PART 4.

Optional Data Analysis: You may choose to analyze data of your choice under the following conditions:
- You must pre-approve your data selection with the instructor.
- The population should contain between XXXXXXXXXX observations.
- Data for at least two different variables must be available for correlation analysis.
- The data must be accessible to the instructor electronically (from any data source, including data entry by yourself).

Project Requirements

Project Elements: There are 5 parts to the project and they must be done in order! Label each part as you would a multiple part homework problem, example 1a)

1. Take a guess
a. Begin with a visual review of data in the fat, calorie, carbohydrate, fiber, or sugar columns. Estimate the population mean for ONE of these variables. Do this without using a calculator or any Excel formula or function. You may only scan the data file visually in this step. State the variable name, your estimated mean, and describe how you arrived at it.

2. Construct a hypothesis test (and test with an appropriate sample size)
Base your hypothesis test on your ESTIMATED population mean in Step 1 above.
a. State the hypotheses in mathematical and written terms.
b. Specify your chosen level of significance and state why you chose it.
c. Determine an appropriate sample size
➔ When you determine sample size, use the method outlined in Section 8.3 of the text. You must decide what Z (confidence level) to use, the allowable error, and how to estimate the standard deviation (refer to page 384 for ways of estimating the standard deviation). Show how you calculated the sample size (show formula), and explain how/why you chose the values you used for Z, e, and σ.
d. State which form of the hypothesis test you will use and why. (Z or t, 1- or 2-tail)
e. Calculate the Critical Value.
f. Use Excel to select a random sample. Be sure to document, in detail, your process for selecting the sample. Run Excel’s Descriptive Statistics on the sample. Include your random sample and descriptive statistics as an attachment to the project.
g. Calculate your test statistic, p-value, and state your conclusion.

3. Construct a confidence interval
a. Estimate the population mean for your variable using your sample data.
b. Comment on the estimate for your population mean.
c. Construct a confidence interval of your choice and interpret the results.

4. Compute the population mean and standard deviation
a. Compute the population mean and standard deviation using Excel’s descriptive statistics and compare them with the mean and standard deviation of your sample that you got in part 2f. Comment on the comparison.
b. Compare the population mean with your confidence interval from 3c and comment on the comparison.

5. Use simple linear regression on the random variables of fat and calories
a. Specify which is the independent variable and which is the dependent variable.
b. Write down what you think the relationship is between the two variables. What do you believe regarding the strength of the correlation? (Is it strong or weak, positive or negative?)
c. Prepare a scatter diagram and least squares trend line, using your sample data. Compute the coefficient of correlation (r) and the coefficient of determination (r²). Refer to instructions in the files library in Canvas titled “Using Excel Chart Tools to Create a Scatter Diagram and Determine Regression Line”.
d. Interpret your results and compare them to what you expected.

Format of the Report
➢ Project elements should be in numerical order and identified by number and letter. [example: 1a) The estimated mean fiber content for cereal is 3g.]
➢ The report should be typed using a word processor (single spaced, 12pt font) and calculations, graphs, random number tables, and charts should be prepared using Excel.

* Reference: U.S. Department of Agriculture, Agricultural Research Service XXXXXXXXXX USDA National Nutrient Database for Standard Reference, Release 17. Nutrient Data Laboratory Home Page, http://www.nal.usda.gov/fnic/foodcomp
GBS221 Bulriss

Rajeswari · Accepted Answer

1. Take a guess
I selected the carbohydrates column, visually went through all the entries, and guessed an average carbohydrate content of 30.00.
Variable name – Carbohydrates
Mean I guessed – 30
I guessed this mean because most of the entries appear to be around 30, ranging roughly from the 20s to the 40s, so the mean should be approximately 30.
2. Construct a hypothesis test (and test with an appropriate sample size) 
Base your hypothesis test on your ESTIMATED population mean in Step 1
a) H_0: \mu = 30
H_a: \mu ≠ 30
H_0: The population mean carbohydrate content is equal to 30, against
H_a: The population mean carbohydrate content is not equal to 30
b) The significance level I chose is 5%, because I felt 1% is too low and 10% is too high, so 5% is reasonable for the hypothesis test.
c) I want a margin of error of 2. The standard deviation, which I got from the data given, is 10.412.
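As a rough check on the sample-size step, here is a minimal Python sketch of the usual formula n = (Zσ/e)², rounded up. It assumes Z = 1.96 (95% confidence) together with the margin of error 2 and standard deviation 10.412 stated above; the function name `required_sample_size` is only illustrative and is not part of the original answer.

```python
import math


def required_sample_size(z: float, sigma: float, margin: float) -> int:
    """Smallest whole n for which z * sigma / sqrt(n) is at most `margin`."""
    return math.ceil((z * sigma / margin) ** 2)


# Assumed inputs: Z for 95% confidence, sigma and margin taken from the answer.
n_min = required_sample_size(z=1.96, sigma=10.412, margin=2.0)
print(n_min)
```

With these inputs the formula gives about 104.1, which rounds up to 105. Any larger sample, such as the 139 used in this answer, yields a margin of error below 2.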
Since the population standard deviation has been calculated, we can use a Z test; for a two-tailed test at the 5% level, the critical values are ±1.96.
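The critical value can also be looked up programmatically rather than from a table. The sketch below uses Python's standard-library `statistics.NormalDist` and assumes a two-tailed test at the 5% level, as chosen in part b).

```python
from statistics import NormalDist

alpha = 0.05  # significance level chosen in part b)

# Two-tailed test: put alpha/2 in each tail of the standard normal curve.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_crit, 2))
```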
Using the fact that margin of error = 1.96 × standard error ... 113, hence sample size = 139 ... error 0.05, our significance level, so we fail to reject the null hypothesis.
At the 5% significance level, there is not enough statistical evidence to conclude that the population mean differs from the hypothesized value of 30.
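To make the test concrete, the following sketch computes a two-sided Z statistic and p-value. It is an illustration only: it assumes the sample mean 30.534 reported in part 3, the standard deviation 10.412, and the sample size 139 from this answer, and the helper name `z_test_two_sided` is hypothetical.

```python
import math
from statistics import NormalDist


def z_test_two_sided(sample_mean: float, mu0: float, sigma: float, n: int):
    """Return (z, p) for H_0: mu = mu0 against H_a: mu != mu0 with known sigma."""
    std_error = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / std_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Assumed inputs, all taken from figures quoted in the answer.
z, p = z_test_two_sided(sample_mean=30.534, mu0=30.0, sigma=10.412, n=139)
print(f"z = {z:.3f}, p = {p:.3f}")
print("reject H_0" if p < 0.05 else "fail to reject H_0")
```

Under these assumptions |z| is well below 1.96 and the p-value is far above 0.05, which is consistent with failing to reject the null hypothesis.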
3) Constructing a confidence interval:
a) Population mean estimate = sample mean = 30.534 g
For our significance level of 5%, the confidence level is 95%.
b) Confidence interval = (mean – margin of error, mean + margin of error)
= (mean – 1.96*std error, mean +1.
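The interval formula in 3b) can be sketched in Python as follows. This assumes the sample mean 30.534 together with the standard deviation 10.412 and sample size 139 quoted in part 2; the function name `z_confidence_interval` is illustrative, not something from the original answer.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean: float, sigma: float, n: int,
                          confidence: float = 0.95):
    """Return (low, high) for mean ± z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Assumed inputs, taken from figures quoted in the answer.
low, high = z_confidence_interval(sample_mean=30.534, sigma=10.412, n=139)
print(f"({low:.2f}, {high:.2f})")
```

Under these assumptions the interval contains the hypothesized mean of 30, which agrees with the outcome of the two-tailed test at the same 5% level.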
