Assessment
BUS5PA Predictive Analytics - 2021
BUS5PA Assignment 3
BUS5PA Predictive Analytics – Semester 1, 2021
Assignment 3: Customer Segmentation, Association Rule Mining, and MBA Case
Studies
Release Date: 10th May 2021
Due Date: 4th June XXXXXXXXXXpm
Weight: 30%
Format of Submission: A report (electronic form) + SAS files in .spk format (electronic)
Submission of project in LMS site.
Part A - Cluster Analysis (40%)
The manager of a leading supermarket mall is interested in finding out purchasing behaviors
of customers. Based on this, he wishes to identify different segments of customers in order to
improve the current target marketing campaign.
The CUSTOMER_DATA dataset contains the basic details about customers obtained via
membership cards. In this dataset each row represents an individual customer. There are five
variables in the dataset.
The variables in the data set are shown below with the appropriate roles and levels.
Name            Model Role  Measurement Level  Description
CustomerID      ID          Nominal            Identification number of the customer
Gender          Input       Nominal            Gender of the customer
Age             Input       Interval           Age of the customer
Annual_Income   Input       Interval           Annual income ($)
Spending_Score  Input       Interval           Spending score of the customer based on the previous purchase records; ranges from 1 to 100
You, as the data analyst, are required to conduct a cluster analysis for the data set and provide
an insightful report to the manager of the supermarket mall to understand different customer
behaviors.
a. Create a new diagram in your project. Name the diagram as profiling.
b. Define the data set CUSTOMER_DATA as a data source.
c. Add an Input Data Source node to the diagram workspace and select the
CUSTOMER_DATA data table as the data source.
d. Determine whether the model roles and measurement levels assigned to the variables are
appropriate.
Examine the distribution of the variables.
• Are there any skewed variables?
• If yes, use the Transform variables node to transform the skewed variables.
(Hint: Use the log transformation; LOG(variable_name))
e. Add a Cluster node to the diagram workspace and set the number of clusters as four.
f. Set the appropriate properties for the Cluster node.
Leave the Internal Standardization property at its default setting, Standardization.
What would happen if inputs were not standardized? Explain using knowledge from
discussions in the class.
g. Run the diagram from the Cluster node and examine the results.
Does the number of clusters created seem reasonable? Discuss using knowledge from
class discussions – what is a cluster, how many clusters should you have, etc.
h. Specify a maximum of six clusters and re-run the Cluster node.
How does the number and quality of clusters compare to that obtained in part e?
i. Use the Segment Profile node to summarize the nature of the clusters. Describe the
profiles based on different customer behaviors.
j. The supermarket manager would like to develop a target marketing strategy based on this
cluster analysis. Prepare a brief report (max. 1000 words) presenting:
(a) The problem
(b) Your solution/approach
(c) Outcomes
(d) Analysis results and interpretation
With the presentation, discuss how the clustering and store profiles you have carried out
could be used in such a strategy.
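
Step f asks what would happen if inputs were not standardized. A minimal Python sketch, using only the standard library and three made-up customers (the values are illustrative and not taken from CUSTOMER_DATA), shows how a variable on a large scale such as Annual_Income can dominate Euclidean distance until every input is converted to z-scores. It is a rough illustration of the idea behind the Standardization setting, not a reproduction of what SAS Enterprise Miner computes internally.

```python
import math
import statistics

# Hypothetical customers: (Age, Annual_Income in $, Spending_Score).
customers = {
    "A": (25, 40000, 80),
    "B": (60, 41000, 15),
    "C": (26, 55000, 78),
}

def euclidean(p, q):
    """Straight-line distance between two points of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def standardize(rows):
    """Convert each column to z-scores: (value - mean) / sample std dev."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stdevs = [statistics.stdev(col) for col in columns]
    return [
        tuple((v - m) / s for v, m, s in zip(row, means, stdevs))
        for row in rows
    ]

names = list(customers)
scaled = dict(zip(names, standardize([customers[n] for n in names])))

raw_ab = euclidean(customers["A"], customers["B"])
raw_ac = euclidean(customers["A"], customers["C"])
std_ab = euclidean(scaled["A"], scaled["B"])
std_ac = euclidean(scaled["A"], scaled["C"])

print(f"raw:          d(A,B)={raw_ab:.1f}  d(A,C)={raw_ac:.1f}")
print(f"standardized: d(A,B)={std_ab:.2f}  d(A,C)={std_ac:.2f}")
```

On the raw values, A looks much closer to B because the two incomes differ by only $1,000, even though their ages and spending scores are very different. After standardization all three inputs contribute on a comparable scale and A ends up closer to C.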
Part B - Market Basket Analysis and Association Rules (30%)
In order to plan innovative promotions to move items that are often purchased together, a supermarket
chain is interested in market basket analysis of groceries purchased. You are a member of the analytics
team assigned to the task.
The supermarket chose to conduct a market basket analysis of specific items purchased. The online
TRANSACTIONS data set contains information about more than 38,000 transactions made over the
past three months, covering 167 different items including:

Whole milk        soda             Tropical fruit  Citrus fruit  Shopping bags       Bottled beer
Other vegetables  yogurt           Bottled water   Pip fruit     Canned beer         Newspapers
Rolls/buns        Root vegetables  sausage         pastry        Whipped/sour cream  frankfurter

You have access to SAS Enterprise Miner data analytics tools and decided to carry out a market
basket and association rule-based analysis of the data. The following instructions will help you to set
up the SAS diagram for the analysis.
There are three variables in the data set:
Name      Model Role  Measurement Level  Description
MemberId  ID          Interval           Member identification number
Item      Target      Nominal            Product purchased
Date      Rejected    Time ID            Date the product was purchased
k. Create a new diagram. Name the diagram Retail.
l. Create a new data source using the data set TRANSACTIONS.
m. Assign the variable Date the model role Rejected. This variable is not used in this analysis.
Assign the ID model role to the variable MemberId and the Target model role to the variable
Item. Change the data source role to Transaction.
n. Add the TRANSACTIONS data set and an Association node to the diagram.
o. Change the setting for the Export Rule by ID property to Yes.
p. Leave the remaining default settings for the Association node and run the analysis.
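
The Transaction role set up in the steps above treats every row as one (ID, Target) pair; baskets come from grouping the rows that share an ID. A small Python sketch of that grouping follows. The member numbers and purchases are hypothetical; only the column names MemberId and Item come from the variable table.

```python
from collections import defaultdict

# Hypothetical rows in long format: one row per (MemberId, Item).
rows = [
    (1001, "whole milk"),
    (1001, "yogurt"),
    (1002, "soda"),
    (1001, "whole milk"),  # repeat purchase by the same member
    (1002, "rolls/buns"),
    (1003, "whole milk"),
]

def build_baskets(rows):
    """Group items by MemberId; a set keeps each item once per basket."""
    baskets = defaultdict(set)
    for member_id, item in rows:
        baskets[member_id].add(item)
    return dict(baskets)

baskets = build_baskets(rows)
for member_id, items in sorted(baskets.items()):
    print(member_id, sorted(items))
```

Collapsing repeat purchases into a set is a simplifying choice made for this sketch, not a statement about how the Association node treats duplicates.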
Examine the results of the association analysis. Your team leader has indicated that the answers to the
following questions will be useful to the management. You have to answer the questions and prepare a
report giving evidence to support your answers (e.g. screenshots, numeric values, etc.).
1. What is the significance of the lift value of a rule? What is lift and what is the importance in
calculating lift?
2. What is the highest lift value for the resulting rules, and which rules have this value? What does the
highest lift value signify?
3. Based on the association rules, briefly describe 3 example product bundles and promotions that
you might suggest.
You are required to provide a detailed report of the outcomes of the analysis to your manager.
Prepare a brief report (max. 1000 words) presenting:
(a) The problem
(b) Your solution/approach
(c) Outcomes
(d) Analysis results and interpretation
You should explain the approach and outcomes such as support, confidence, and lift, and how
the product bundles you suggested could be used (practical value) by the departments.
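
For reference, the three rule measures have standard definitions. For a rule A => B, support is the share of baskets containing both A and B, confidence is that support divided by the support of A, and lift is confidence divided by the support of B. A lift above 1 means the two sides appear together more often than they would if they were independent. The sketch below computes all three in plain Python for five made-up baskets. The baskets are hypothetical and far smaller than the TRANSACTIONS data, and the function names are chosen for illustration only.

```python
# Hypothetical baskets, one set of items per customer.
baskets = [
    {"whole milk", "yogurt"},
    {"whole milk", "yogurt", "soda"},
    {"whole milk", "rolls/buns"},
    {"soda", "rolls/buns"},
    {"yogurt", "soda"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def rule_measures(lhs, rhs, baskets):
    """Return (support, confidence, lift) for the rule lhs => rhs.

    Assumes both sides occur in at least one basket; otherwise the
    divisions below would fail with ZeroDivisionError.
    """
    both = support(set(lhs) | set(rhs), baskets)
    confidence = both / support(lhs, baskets)
    lift = confidence / support(rhs, baskets)
    return both, confidence, lift

rule_support, rule_confidence, rule_lift = rule_measures(
    {"whole milk"}, {"yogurt"}, baskets
)
print(f"support={rule_support:.2f} "
      f"confidence={rule_confidence:.2f} lift={rule_lift:.2f}")
```

Here whole milk and yogurt each appear in 3 of 5 baskets and together in 2 of 5, so the rule whole milk => yogurt has support 0.40, confidence 0.67, and a lift of about 1.11, slightly above what independence would give.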
Part C – Open Discussion - Analytics Case Study (30%)
This question is based on the week 11 guest lecture. It is very important that you attend the guest lecture
to be able to answer this question.
You will be provided with a case study related to the guest lecture and additional resources with related
background knowledge.
You are expected to summarize the content of the guest lecture and discuss how you relate the guest lecture
to the provided case study.
• How would you make use of the knowledge and understanding you gained from the guest
lecture to relate to the case study?
• Do you think the approach taken in the case study could have been improved or advanced?
You are expected to write a report (max 1000 words) discussing the above points.
BUS5PA_2021_SEM1-Assignment_3_Answer_Guidelines.pdf
BUS5PA Assignment 3 – Answer preparation guidelines (Rubric)
A) Expectation Matrix for answer preparation
Grade levels: D, C, B, A

PART A
D: Appropriately named, implemented project and diagram. Attempt on clustering without further explanations and complete questions A(a) to A(e) with log transformation if necessary.
C: D requirements + Correct data standardization as required. Explanation for questions A(f) to A(h). Basic attempt on segmentation and basic attempt on the report.
B: C requirements + Comprehensive answer to question A(i) and good attempt on the report.
A: B requirement + Answer to question A(j) with brief justification and recommendation of the target market strategy. More focus should be given to the interpretation of results with actionable insights to the client (manager of the supermarket). Detailed report with strong insights and analysis.

PART B
D: Appropriately named, implemented project and diagram. Answers to questions B(k) to B(p).
C: D requirements + Correct answer to questions 1 and 2 in the last section with brief explanation in the report.
B: C requirements + Correct answer to question 3 in the last section with brief explanation of product bundles. Good structured attempt in the report.
A: B requirement + Correct answer to report of the outcomes in the last section with recommendations and insights. Detailed report with strong insights and analysis.

PART C
D: Demonstrated understanding of the case study and appropriate discussion over the predictive analytics life cycle.
C: D requirements + Discussion of how each of the life cycle steps is represented in the case study + mention of 3 key points from the guest lectures that the student found useful.
B: C requirements + Extending the discussion to elaborate on how the 3 points from the guest lecture could be used to further understand/add value/elaborate the discussion.
A: B requirement + Demonstration of clear understanding of the case study, related lifecycle steps, highlighting of guest lecture points and expanding the discussion to Banking (references from recent projects/news in the field).
B) Report format – A formal report is NOT expected for this assessment. As Masters-level
students, you are expected to decide on a suitable format for an informal report. Please speak to
the lecturer if you are unsure about the format. A Word or PDF file with the heading BUS5PA
Assignment 3 and student details should be submitted. There is no strict word or page length,
but students must keep in mind that a report that is too long can ‘hide’ the important points
from the reader. A suggested page length is 10-12 pages + appendix containing additional
screenshots of results.
C) Submission –