EXPENSE CLAIM/REPORT
MA5810- CAPSTONE PROJECT
Total marks: 100
Due date: Wednesday, Week 7 (27th April), 11:59pm AEST
OVERVIEW
This assessment involves writing a report that summarises a data mining related investigation that
you have conducted on data that you have collected yourself. The investigation must involve the
main topics covered in the subject, most notably supervised learning and/or unsupervised
learning using R/RStudio. The assessment builds upon the practical knowledge that you should
have acquired through the previous two assignments. However, neither the dataset nor the detailed
steps to be carried out will be provided here; you have to make independent choices and decisions.
Submission
You will need to submit the following:
• A PDF file with R code in the Appendix. Please submit everything in one PDF file. The assignment
must be presented in 12-point font on A4 pages using single line spacing. The assignment must follow the
required report structure.
• References should be in APA format.
• R code to reproduce your work
• The task cover sheet.
The assignment should not exceed 12 A4 pages. Appendices do not form part of the page limit.
You have up to three attempts to submit your assessment, and only the last submission will be
marked.
A WORD ON PLAGIARISM AND SELF-PLAGIARISM:
Plagiarism is the act of using another’s words, works or ideas from any source as one’s own.
Plagiarism has no place in a University. Student work containing plagiarised material will be subject
to formal university processes.
If significant portions of your own previous work (e.g., a report for a related subject you did in
this or any other university) are recycled in a way that could be fully or partially graded twice
(‘double-dipping’), this is considered self-plagiarism and will not be tolerated.
Assessment tasks
In this report, you need to demonstrate that: (a) you have grasped important concepts associated
with this subject, most notably supervised and unsupervised learning; and (b) you can
communicate your investigation in a formal written manner.
Regarding (a), we expect that your investigation will include at least three machine learning
algorithms from the following topics:
1. LDA, QDA and/or Naive Bayes classification
2. Logistic Regression classifiers and/or KNN for classification/regression
3. Principal Component Analysis (PCA)
4. Cluster Analysis
5. Association Rule Mining and Recommender Systems
Data
You can bring your own data or use a dataset provided in the capstone data folder. Your
dataset cannot be smaller than 1000 observations of five variables, except if the targeted data
mining problem to be addressed relates to spatio-temporal data, in which case fewer than five
dimensions could be allowed.
Preferably, you should use a dataset relevant to your place of work. Do not use data from
textbooks or from R packages. Do not use the same data that have been used in the subject (e.g.
UCI repository). Do not use data for which data mining results and analyses can be found online.
You can use public data, but the data should be appropriate for addressing a relevant data mining
problem, and a solution to a similar problem for the same data should not be available.
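The minimum size requirement above can be checked before any analysis begins. The following is a minimal sketch in R, assuming the dataset has already been loaded into a data frame; the helper name meets_minimum_size and the synthetic example_data are hypothetical and only illustrate the check.

```r
# Hypothetical helper: checks a data frame against the brief's minimum size
# (at least 1000 observations of five variables).
meets_minimum_size <- function(data, min_rows = 1000, min_cols = 5) {
  stopifnot(is.data.frame(data))
  nrow(data) >= min_rows && ncol(data) >= min_cols
}

# Synthetic stand-in data for illustration only; replace with your own dataset.
set.seed(1)
example_data <- data.frame(
  a = rnorm(1200),
  b = rnorm(1200),
  c = rnorm(1200),
  d = runif(1200),
  e = sample(c("x", "y"), 1200, replace = TRUE)
)

# Report whether the size requirement is met, then summarise variable types.
print(meets_minimum_size(example_data))
str(example_data)
```

For a spatio-temporal problem, where fewer than five dimensions may be allowed, min_cols could be lowered accordingly. The output of str() also gives the number and types of variables, which the Data section of the report asks for.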
Report structure
Please adhere to the strict report structure format. The report will not be assessed if it is not
formatted appropriately. The following sections are needed in your report.
• Title: Should be informative, concise and an accurate representation of your analyses.
• Abstract: The abstract provides a short, sharp overview of the contents of the report and will
be around 200-300 words. The abstract has five parts:
i. Introductory statement: background to the study, important issue(s) the report
addresses. (approximately 1-2 sentences)
ii. Purpose of the report: state the objectives (1-2 sentences)
iii. Methodological approach: overview the data and methods (2-3 sentences)
iv. Findings or Achievements: list one or two of the main findings or achievements
from your investigation (1-2 sentences)
v. Conclusions and Implications: what conclusions can be drawn from your
investigation? How can the findings/achievements in your report deliver a benefit
to people, things, systems or processes? (1-2 sentences).
• Introduction: The introduction sets the scene for the investigative efforts. It provides
motivation for the work and relevant background information and references that will enable
the reader to put in context the key objectives and achievements in your report. Address the
important issues that have motivated your investigation. At the end of the introduction clearly
state the objectives of the report. Do not put any results from your investigation in the
introduction. Do not discuss details about the data and methods in this section. Do not discuss
your conclusions or key findings in the introduction.
• Data: This section should provide details about how the data was obtained and what the data
represent. You should include information such as (but not limited to):
i. What the source of the data is
ii. How the data was originally collected (e.g., from an experiment or observational
study)
iii. The sample size
iv. The number and types of variables
v. Any known interventions or pre-processing that precede the ones described in your
report
vi. Any other information that is relevant to the understanding and assessment of
your work/report.
• Methods: This section should discuss in depth the data mining methods that were used to
process and to analyse the data, as well as the software version used to generate the results
and report. To cite RStudio, run RStudio.Version() at the R console. The methods
should be appropriate to ensure that the objectives of the paper are met.
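As a minimal sketch of recording the software versions mentioned above, the R code below gathers the R version and, when available, the RStudio version. The list name software_versions is hypothetical. RStudio.Version() is only defined inside an RStudio session, so the call is guarded here; that guard is an assumption about how the script might be run, not part of the brief.

```r
# Collect version details to cite in the Methods section.
software_versions <- list(
  r = R.version.string,
  platform = R.version$platform
)

# RStudio.Version() is only defined when the code runs inside RStudio,
# so guard the call to keep the script usable from a plain R console.
if (exists("RStudio.Version", mode = "function")) {
  software_versions$rstudio <- as.character(RStudio.Version()$version)
}

print(software_versions)

# citation() returns the recommended reference for R itself.
print(citation())
```

Calling citation("packagename") in the same way gives the recommended reference for any installed package used in the analysis.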
• Results and Discussion: This section presents and discusses the results. The discussion centres
on the outputs from the data mining procedures that you have performed. For example, what
are the main outcomes? Why are they useful and what for? How are they interesting and
why?, and so on. In particular, how do the results align with the goals set in the introduction?
What are the main achievements and their implications?
• Conclusions: Final remarks about the key achievements of the investigations and what makes
them ‘interesting’ or ‘useful’, right now or for future work. Achievements or findings should
be contrasted with the original objectives or hypotheses of the project. Make sure that you
mention any limitations of your work here. Limit the conclusions to no more than two or three
paragraphs.
• References: List the sources your investigation has drawn from. Note that all references
should be referred to in the text.
• Appendices: Add R code and any supporting materials that might be useful to help assess your
work.
RUBRIC TEMPLATE
Please adhere to the report structure requirements. The report will not be assessed if it is not formatted appropriately.
Dimension High distinction Pass Fail
R code and References (10%)

High distinction:
Code submitted and attached to Appendix.
Code works correctly, meets the specifications, produces the correct results and displays them correctly.
Code is well organised and very easy to follow. Code is always very well commented, so the purpose of each block of code and the question part it corresponds to are readily understood. Variable names give the purpose of the variable.
All references have been listed in the right format, referred to in the appropriate places in the body of the text and listed at the end of the report. At least 4 references have been provided.

Pass:
Code only provided in answer document but looks correct.
Code often exhibits incorrect behaviour. Significant details of specification are violated.
The code is readable only by someone who already knows what it is supposed to be doing. Comments not sufficient to see what the code is doing. Significant lack of comments makes it difficult to understand code.
Some references have been listed and referred to in the appropriate places in the body of the text and listed at the end of the report. At least 2 references have been provided.

Fail:
Code not submitted.
Code not provided in answer document. Code produces incorrect results, does not compile, or significant errors occur.
Code is poorly organised and very difficult to read. Code has no comments.
No references.

Abstract and Introduction (10%)

High distinction:
Clearly addresses the five parts of the abstract so that the reader has a clear overview of the report.
Position and exceptions, if any, are clearly stated. Organisation of the argument is completely and clearly outlined and implemented.

Pass:
Partially addresses the five parts of the abstract, and/or addresses all five parts but the writing is not clear in places.
Position is clearly stated. Organisation of argument is clear in parts or only partially described and mostly implemented.

Fail:
Does not provide an overview of the report, or the writing is poor overall and mostly unclear.
Position is vague. Organisation of argument is missing, vague or not consistently maintained.

Data (10%)

High distinction:
Data are suitable, the report explains how the data were obtained.
Provides a detailed, accurate description of the data and data methods to be employed within the project.
Exploratory data analysis and verification are detailed and provide critical insight with clear overt links to model developments. Data insights are concisely presented and visualised.

Pass:
Data are suitable, the report explains how the data were obtained.
Provides adequate description of the data and data methods to be employed within the project. Some elements of the method are inferred or partially detailed.
Exploratory data analysis and verification are elementary and inferentially linked to some elements of model development. Data insights are presented and visualised with some redundancy or inference.

Fail:
Little information and/or explanation about the data is provided and/or the grammar structure is difficult to follow and/or the data do not meet the minimum requirements.

Methods (25%)

High distinction:
Lists all the steps in the order in which they were performed to explore, analyse and mine patterns from the data. At least 3 machine learning algorithms covered in the subject have been explored and explained in depth.

Pass:
Most of the steps are listed and explained, but some details are a little hazy or questionable. At least 2 of the targeted key topics from the subject (listed in the Sophisticated column) have been reasonably explored and explained.

Fail:
The methods clearly will not allow the objectives of the report to be met and/or the details of methodological steps and procedures are very difficult to follow and/or the listed key topics from the subject have been poorly or not appropriately explored.

Results and Discussion (30%)

High distinction:
The results and discussion are explained correctly, clearly, and in sufficient detail. The results and discussion clearly follow from the data collection and the methods.

Pass:
The results and discussion are explained correctly, clearly and in sufficient detail most of the time. There exists a connection of some type between the results/discussion and the data collection and methods.

Fail:
The results and discussion are not explained correctly, clearly and in sufficient detail. The connection of some type between the results/discussion and the data collection and methods is missing.


Conclusion (15%)

High distinction:
The original objectives and/or hypotheses are restated and contrasted against the obtained achievements and findings. The conclusion summarises and draws a clear, effective conclusion of the investigation and enhances the impact of the report (e.g., it provides a recommendation or action that should be undertaken in the future). Discusses unavoidable limitations of the investigation and suggestions.

Pass:
Conclusion is clearly stated and connections to the original objectives and/or hypotheses are mostly clear.

Fail:
Conclusion may not be clear and/or the connections to the work reported are incorrect or unclear or just a repetition of the findings without a suitable summarisation and interpretation and/or the underlying logic has major flaws.
Answered 12 days After Nov 15, 2022

Solution

Mukesh answered on Nov 28 2022