BUS708 Statistics and Data Analysis
Inferential Statistics Report
Assignment 2 (Assessment 4) – Individual Word Report – Trimester 3, 2019
1 OVERVIEW OF THE ASSIGNMENT
This assignment tests your ability to present and summarise data and to make basic statistical
inferences in a business context. You will use the results and any feedback given in Assignment 1
(Assessment 3, Excel Report) and produce a single report in a Word document. You will need to
construct interval estimates, perform suitable hypothesis tests and regression analysis, and draw
conclusions and make suggestions for management action.
Your report should be written in a Word document and should be submitted to Turnitin following the
requirements explained below.
2 TASK DESCRIPTION
There are two datasets involved in this assignment: Dataset 1 and Dataset 2, which are the same
datasets used in Assignment 1 (Excel Report). All data processing should be performed in Excel or
Statkey (http://www.lock5stat.com/StatKey). Specific instructions as to which tools should be used
for each section will be given during tutorials.
Your tasks are to answer the following research questions given in Section 2 to Section 6 below using
dataset 1 or dataset 2 as indicated in each section. To answer each question, you will need to first
present the relevant numerical summary (summary statistics) and graphical display and perform
suitable statistical analysis to provide a statistical conclusion.
1. Section 1: Introduction
Provide a brief and clear introduction about the report (e.g. the objective of the
report, the datasets involved, etc.). Find relevant articles (minimum one article,
maximum 3 articles) and write a proper literature review which includes in-text
citation.
2. Section 2: Is 40% a plausible value for the proportion of private rooms among Airbnb
room types?
Using Dataset 1, first provide both a numerical summary and a graphical display
that clearly show the proportions of the different room types. Then construct a 95%
confidence interval for the population proportion of private rooms. Finally, answer the
research question using the confidence interval.
3. Section 3: After an iteration of outlier removal, is the price of a private room more than
$70?
Using Dataset 1, perform one iteration of outlier detection on the price of private
rooms using the method described in the lecture notes. After removing those
outliers, describe the price distribution of private rooms using both a numerical and
a graphical summary which shows the remaining outliers, if any.
Perform a suitable hypothesis test to answer the research question above at the 5% level
of significance.
4. Section 4: Is there a difference in availability in the next 365 days between different room
types?
Using Dataset 1, describe the distribution of availability_365 for each room type.
You need to provide both a numerical summary and a graphical display which
shows the outliers, if any.
Perform a suitable hypothesis test to answer the research question above. Use a 5%
significance level.
5. Section 5: Can we predict the price of the accommodation using the longitude of the
property?
Using Dataset 1, develop a regression model to predict the price of the Airbnb
accommodation using the longitude of the property. Interpret the correlation
coefficient, the coefficient of determination and the relevant p-values and use them to
answer the research question. Provide a suitable graphical display.
6. Section 6: Is there any relationship between gender and room type accommodation?
Using Dataset 2, describe the relationship between a student’s gender and the room
type of the accommodation they currently live in. You need to provide both
a numerical summary and a graphical display.
Perform a suitable hypothesis test to answer the research question above. Use a 5%
significance level.
7. Section 7: Conclusion
Write a brief summary of all the findings in the previous sections and write a
concluding statement. Suggest further research by discussing an interesting topic or
research question that can be further explored related to the datasets.
3 SUBMISSION REQUIREMENT
Deadline to submit the report: Week 11, Sunday 2 Feb 2020, 23:59
You need to submit a Word document file to Turnitin which shows all computer outputs and
discussion. You do not need to submit the dataset.
4 MARKING CRITERIA
Students are advised to read the marking rubric provided on Moodle as well as the detailed marking
criteria based on this rubric.
5 DEDUCTION, LATE SUBMISSION AND EXTENSION
Late submission penalty: - 5% of the total available marks per calendar day unless an extension is
approved. This means 0.75 marks (out of 15 marks) per day.
For extension application procedure, please refer to Section 3.3 of the Subject Outline. Please do
NOT email the lecturer or tutor to seek an extension; you need to follow the procedure described in
the Subject Outline.
6 PLAGIARISM
Please read Section 3.4 Plagiarism and Referencing, from the Subject Outline. Below is part of the
statement:
“Students plagiarising run the risk of severe penalties ranging from a reduction through to 0 marks for a first
offence for a single assessment task, to exclusion from KOI in the most serious repeat cases. Exclusion has
serious visa implications.”
“Authorship is also an issue under Plagiarism – KOI expects students to submit their own original work in both
assessment and exams, or the original work of their group in the case of a group project. All students agree to a
statement of authorship when submitting assessments online via Moodle, stating that the work submitted is
their own original work.
The following are examples of academic misconduct and can attract severe penalties:
Handing in work created by someone else (without acknowledgement) , whether copied from another
student, written by someone else, or from any published or electronic source, is fraud, and falls under
the general Plagiarism guidelines.
Students who willingly allow another student to copy their work in any assessment may be considered
to assisting in copying/cheating, and similar penalties may be applied. ”
write your title here (e.g. Google play apps analysis)
BUS708 Assignment 2
Section 1: Introduction
This document serves as a sample template for Assignment 2, as well as general feedback for Assignment 1. You don’t have to use this template for Assignment 2, but if you prefer, you can edit this document and use it for Assignment 2. You can change the title, subtitle and section titles accordingly.
Some general feedback for Assignment 1
· Some students gave a very short description about dataset 1 and failed to explain what it is about (e.g. some characteristics of Google Play Apps) and/or the source of the dataset (e.g. from Kaggle and originally provided by Lavanya Gupta).
· Quantitative variables in dataset 1 are: Rating, Review and Price. Install can arguably be either quantitative or categorical (original dataset should be categorical but can be accepted as quantitative as it’s quite ambiguous). Size, Last Updated, Current Version and Android Version are all categorical (Size can potentially be quantitative if the units are all the same).
· The main reason that dataset 2 might be biased is not that it does not cover the whole population (a random sample does not include the whole population either, but it’s not biased). Dataset 2 is most likely biased because it’s not representative of the population (it comes only from KOI or other institutions).
· Many students wrote reasonable comments, but many failed to answer the research question. You should have a concluding statement that answers the research question (e.g. “… hence, we conclude that most google play apps are free”, or “there seems to be a difference in prices among paid apps from the categories…”, or “the correlation coefficient indicates there is no linear relationship between Rating and Review”, etc.)
· Many graphs are still missing a title and axis labels.
Hints for Assignment 2
· Make sure you mention the objective of the report or what is the report about, including short description of the datasets. This can be one paragraph in Section 1.
· Write a proper literature review, including in-text citations. Some examples can be found at http://anglia.libguides.com/ld.php?content_id= XXXXXXXXXX. Paraphrase the article; don’t just copy and paste its content. This can be another paragraph in Section 1.
· Make sure you explicitly answer the research question in each section.
· Check that your graphs are complete (title, labels or legends).
· Check and re-check marking criteria to make sure you address all the criteria.
Section 2: Are most google play apps free?
In this section …
data presentation
Inferential statistics
The following ….
Sample size (n) = 4000
Sample proportion (phat) = 0.934
Standard Error (SE) = sqrt(phat(1 - phat)/n) = 0.0039
Critical value = 1.96
95% Confidence Interval = 0.934 +/ XXXXXXXXXX)
= (... , ….)
…..
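If you want to cross-check the arithmetic above, the following is a minimal Python sketch. It is not part of the required Excel/Statkey workflow, and the function names `proportion_ci` and `is_plausible` are hypothetical helpers introduced only for illustration. It uses the template values n = 4000 and phat = 0.934 with the critical value 1.96.

```python
import math

def proportion_ci(phat, n, z=1.96):
    """Return (standard error, lower bound, upper bound) for a sample proportion."""
    se = math.sqrt(phat * (1 - phat) / n)  # SE = sqrt(phat(1 - phat)/n)
    margin = z * se                        # margin of error = critical value * SE
    return se, phat - margin, phat + margin

def is_plausible(value, lower, upper):
    """A hypothesised proportion is plausible if it lies inside the interval."""
    return lower <= value <= upper

# Template values from this section
se, lower, upper = proportion_ci(phat=0.934, n=4000)
print(f"SE = {se:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The same check applies to any hypothesised proportion: it is plausible at the 95% confidence level only if it falls between the two bounds.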
Section 3:
Dfadfa
data presentation
Title of the Boxplot
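The outlier step can be sketched in Python as a cross-check on your Excel work. This sketch assumes the common 1.5 x IQR rule (the same rule a boxplot uses to flag points); the method in the lecture notes may differ, so treat the rule, the function name `remove_outliers_once` and the sample prices below as assumptions for illustration only.

```python
import statistics

def remove_outliers_once(values, k=1.5):
    """One pass of the IQR rule: keep values within [Q1 - k*IQR, Q3 + k*IQR]."""
    # method="inclusive" interpolates between data points including min and max,
    # which is intended to mirror Excel's QUARTILE.INC convention
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in values if low <= v <= high]
    removed = [v for v in values if v < low or v > high]
    return kept, removed, (low, high)

# Made-up prices, NOT taken from Dataset 1
prices = [10, 20, 30, 40, 50, 1000]
kept, removed, fences = remove_outliers_once(prices)
print("kept:", kept)
print("removed:", removed)
print("fences:", fences)
```

Only one pass is applied here, so a new boxplot of the remaining values may still show outliers relative to the recomputed quartiles.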
Inferential statistics
Dfadfad
Sample size (n) = 230
Sample mean (Xbar) = 3.075
Sample standard deviation (s) = 1.783
Test-statistic
p-value = XXXXXXXXXX (from Statkey, Theoretical Distributions: t, with df = n - 1 = 229)
Write your conclusion here
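The test statistic for a one-sample t-test can be cross-checked with a short Python sketch. The summary values come from the template above; the null value `mu0 = 3.0` is a hypothetical number chosen only for illustration, and `one_sample_t` is a hypothetical helper name. The p-value is still read from Statkey using the t distribution with the degrees of freedom shown.

```python
import math

def one_sample_t(xbar, s, n, mu0):
    """Return (t statistic, degrees of freedom) for H0: mu = mu0."""
    se = s / math.sqrt(n)          # standard error of the sample mean
    return (xbar - mu0) / se, n - 1

# xbar, s and n are the template values; mu0 is an assumed null value
t_stat, df = one_sample_t(xbar=3.075, s=1.783, n=230, mu0=3.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

For a right-tailed question such as "is the mean more than mu0?", the p-value is the area to the right of the test statistic.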
Section 4:
data presentation
Copy and paste your numerical summary and graph from Excel to this space. Make sure you have checked that they are correct.
Inferential statistics
You need to do step-by-step ANOVA and copy and paste the ANOVA table from Statkey.
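The step-by-step ANOVA calculation can be sketched in Python to check the table you copy from Statkey. This is a minimal sketch of a one-way ANOVA under the usual assumptions; the function name `one_way_anova` and the three small groups are made up for illustration and are not taken from Dataset 1.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Between-group variation: group size times squared distance of group mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group variation: squared distance of each value from its group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "SS_between": ss_between, "SS_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "MS_between": ms_between, "MS_within": ms_within,
        "F": ms_between / ms_within,
    }

# Made-up groups, for illustration only
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
for name, value in table.items():
    print(f"{name}: {value:.3f}")
```

The p-value is then the area to the right of F under the F distribution with (df_between, df_within) degrees of freedom, which you can read from Statkey.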
Section 5:
data presentation
You need to perform regression analysis and paste the relevant output here. You can either use Excel (Data > Data Analysis > Regression) or Statkey. Note that you will most likely need to make a new scatter plot, as the order of X and Y may be different.
Inferential statistics
Please refer to the marking criteria to see what inferences you need to make. Also make sure you write a conclusion that answers the research question.
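To check the slope, intercept, correlation coefficient and coefficient of determination reported by Excel or Statkey, you can use a minimal Python sketch like the one below. The function name `simple_regression` and the five data points are hypothetical and are not taken from Dataset 1; the p-values should still come from the Excel or Statkey output.

```python
import math

def simple_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, plus r and r squared."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    r = sxy / math.sqrt(sxx * syy)   # Pearson correlation coefficient
    return slope, intercept, r, r ** 2

# Made-up data, for illustration only (x = predictor, y = response)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r, r2 = simple_regression(x, y)
print(f"y = {intercept:.2f} + {slope:.2f}x, r = {r:.3f}, r^2 = {r2:.3f}")
```

In simple linear regression the coefficient of determination equals the square of the correlation coefficient, so the two values should agree with each other in your output.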
Section 6:
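For the relationship between two categorical variables (such as gender and room type), a chi-square test of independence is one suitable choice. The following is a minimal Python sketch for checking the expected counts and the test statistic; the function name `chi_square_independence` and the 2 x 2 table of counts are made up for illustration and are not taken from Dataset 2.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom, expected counts)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count = row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Made-up counts: rows = gender, columns = room type
observed = [[10, 20],
            [20, 10]]
chi2, df, expected = chi_square_independence(observed)
print(f"chi-square = {chi2:.3f}, df = {df}")
print("expected counts:", expected)
```

The p-value is the area to the right of the statistic under the chi-square distribution with the degrees of freedom shown, which you can read from Statkey.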