
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
"\n",
"Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\". (After you have done that, you can delete the 'raise NotImplementedError()' line, and then run your code to check that it works).\n",
"\n",
"Also, enter your NAME in the next cell.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"NAME = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "01648f2c7a2b86d733aea6aeb05487e8",
"grade": false,
"grade_id": "jupyter",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"# ICT706 SouthBank 2020 Semester 1 Task 2\n",
"\n",
"This assignment will be done completely inside this Jupyter notebook.\n",
"\n",
"### Background\n",
"A medium-size company has given you one year of data about the online purchases that their customers have made. They want you to analyse the data using statistical and machine learning techniques and produce:\n",
"* a prediction algorithm for predicting how much money each customer is likely to spend in a year;\n",
"* a classification algorithm for predicting which customers will be 'big spenders';\n",
"* some recommendations on what marketing strategy they should use to attract more 'big spender' customers.\n",
"\n",
"### Instructions\n",
"Follow all the instructions in this notebook to complete these tasks. Note that some cells contain 'assert' statements - these will automatically mark your work so that you can check that you have done the preceding steps correctly. (If they give errors, then go back and correct your previous work until you fix those errors. Once those 'assert' cells execute without errors, you know that you have achieved the marks for that step.) \n",
"\n",
"When you have finished, this notebook is the only file that you will need to submit to Blackboard.\n",
"\n",
"Note: If you want some space to try out some Python code of your own, feel free to add extra cells into this notebook. Just make sure that, before you submit your notebook, those extra cells either execute without error or have been deleted.\n",
"\n",
"### Overview\n",
"You have five sections to complete in this Notebook (total = 100 marks):\n",
"* Part A: Load and Clean Data (20 points)\n",
"* Part B: Data Exploration (30 points)\n",
"* Part C: Predicting Spending Levels (20 points)\n",
"* Part D: Predicting Big Spenders (20 points)\n",
"* Part E: Business Recommendations (10 points)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"deletable": false,
"nbgrader": {
"checksum": "8e8e40c5312c2594db509b8e4c9f731d",
"grade": false,
"grade_id": "imports",
"locked": false,
"schema_version": 1,
"solution": true
}
},
"outputs": [],
"source": [
"# add all your imports here.\n",
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "
cbe1db68acf76763db3e19b29162d9",
"grade": false,
"grade_id": "cell-56b1c85226f679a1",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"---\n",
"# Part A: Load and Clean Data (20 points)\n",
"\n",
"Save your CSV data file into the same folder as this notebook.\n",
"\n",
"Write Python code to load your dataset into a Pandas DataFrame called 'sales'."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"deletable": false,
"nbgrader": {
"checksum": "f1866306460b18ba8285f9073b7870
",
"grade": false,
"grade_id": "read_sales",
"locked": false,
"schema_version": 1,
"solution": true
}
},
"outputs": [],
"source": [
"# YOUR CODE HERE\n",
"raise NotImplementedError()"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "465472c32b09a4d2bd97325a77ab7dae",
"grade": false,
"grade_id": "cell-08fd91c8f6a3f1ab",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"After you have loaded the data correctly, you should have 10,000 rows. \n",
"Run the following cells and tests to check that you have done this correctly."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "30e709e48a4861e49bfcd0f34e07af3b",
"grade": false,
"grade_id": "cell-802dd990ff7
39a",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"sales.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "18fa2c34a21ec341e571e978461c2d74",
"grade": true,
"grade_id": "data_loaded",
"locked": false,
"points": 5,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"\"\"\"Check that 'sales' has the right shape and number of rows (5 points).\"\"\"\n",
"assert len(sales.columns) == 10\n",
"assert sales.columns[0] == \"CustNum\"\n",
"assert sales.shape == (10000, 10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "290dd5079318da97a499eb5f4e56e8c0",
"grade": false,
"grade_id": "cell-cbd5370682d8937a",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"## Cleaning the Data\n",
"\n",
"Some of the columns are strings with dollar signs, but we need to convert them to numbers (float) so that we can do calculations on them. The next cell shows what will go wrong if we try doing calculations *before* converting them to floats!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "f37fc394053f62b401f14b6d76a4c800",
"grade": false,
"grade_id": "cell-c0f6f29476bf6fc8",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"outputs": [],
"source": [
"s2 = sales[\"Spend\"] * 4\n",
"s2.head()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"deletable": false,
"nbgrader": {
"checksum": "656807b5b835cf1f7968d34cea95417c",
"grade": false,
Answered Same Day Jun 11, 2021

Solution

Sandeep Kumar answered on Jun 17 2021
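Before the notebook itself, here is a small sketch of the kind of load-and-check step that Part A asks for. It is not part of the submitted answer. It reads a tiny in-memory CSV instead of the real data file, and only the column names CustNum and Spend come from the notebook; the sample rows and the helper name check_shape are made up for illustration.

```python
import io

import pandas as pd

# Tiny in-memory stand-in for the real CSV file (hypothetical sample rows).
# Only the column names CustNum and Spend are taken from the notebook.
csv_text = "CustNum,Spend\n1001,$120.50\n1002,$89.99\n1003,$15.00\n"
demo = pd.read_csv(io.StringIO(csv_text))


def check_shape(df, first_column, expected_shape):
    """Return a list of problems rather than stopping at the first failure."""
    problems = []
    if df.columns[0] != first_column:
        problems.append(f"first column is {df.columns[0]!r}, expected {first_column!r}")
    if df.shape != expected_shape:
        problems.append(f"shape is {df.shape}, expected {expected_shape}")
    return problems


# The demo frame has 3 rows and 2 columns, so this check passes.
problems = check_shape(demo, "CustNum", (3, 2))
print(problems or "all checks passed")
```

With the real file, the same helper would be called with the shape the notebook's assert cell expects, (10000, 10).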
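The cleaning step in Part A is about dollar-sign strings that have to become floats before any arithmetic. The following is a minimal sketch of that idea on made-up values, not the code from the submitted notebook. It assumes the strings look like "$120.50" and may contain thousands separators; the real file may be formatted differently.

```python
import pandas as pd

# Hypothetical sample values; dtype=object mimics a column of plain Python strings.
spend = pd.Series(["$120.50", "$89.99", "$1,015.00"], name="Spend", dtype=object)

# Multiplying a string column repeats each string instead of scaling a number.
repeated = spend * 2
print(repeated.iloc[0])

# Strip the dollar sign and any thousands separator, then convert to float.
cleaned = pd.to_numeric(spend.str.replace(r"[$,]", "", regex=True))

# Arithmetic now behaves numerically.
scaled = cleaned * 4
print(scaled)
```

Using pd.to_numeric instead of astype(float) is a design choice: it accepts errors="coerce", which turns unparseable entries into NaN so they can be inspected rather than aborting the conversion.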
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"Before you turn this problem in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\\rightarrow$Run All).\n",
"\n",
"Make sure you fill in any place that says `YOUR CODE HERE` or \"YOUR ANSWER HERE\". (After you have done that, you can delete the 'raise NotImplementedError()' line, and then run your code to check that it works).\n",
"\n",
"Also, enter your NAME in the next cell.\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
],
"source": [
"NAME = \"Ashma Dhakal\""
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "01648f2c7a2b86d733aea6aeb05487e8",
"grade": false,
"grade_id": "jupyter",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"# ICT706 SouthBank 2020 Semester 1 Task 2\n",
"\n",
"This assignment will be done completely inside this Jupyter notebook.\n",
"\n",
"### Background\n",
"A medium-size company has given you one year of data about the online purchases that their customers have made. They want you to analyse the data using statistical and machine learning techniques and produce:\n",
"* a prediction algorithm for predicting how much money each customer is likely to spend in a year;\n",
"* a classification algorithm for predicting which customers will be 'big spenders';\n",
"* some recommendations on what marketing strategy they should use to attract more 'big spender' customers.\n",
"\n",
"### Instructions\n",
"Follow all the instructions in this notebook to complete these tasks. Note that some cells contain 'assert' statements - these will automatically mark your work so that you can check that you have done the preceding steps correctly. (If they give errors, then go back and correct your previous work until you fix those errors. Once those 'assert' cells execute without errors, you know that you have achieved the marks for that step.) \n",
"\n",
"When you have finished, this notebook is the only file that you will need to submit to Blackboard.\n",
"\n",
"Note: If you want some space to try out some Python code of your own, feel free to add extra cells into this notebook. Just make sure that, before you submit your notebook, those extra cells either execute without error or have been deleted.\n",
"\n",
"### Overview\n",
"You have five sections to complete in this Notebook (total = 100 marks):\n",
"* Part A: Load and Clean Data (20 points)\n",
"* Part B: Data Exploration (30 points)\n",
"* Part C: Predicting Spending Levels (20 points)\n",
"* Part D: Predicting Big Spenders (20 points)\n",
"* Part E: Business Recommendations (10 points)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false,
"deletable": false,
"nbgrader": {
"checksum": "8e8e40c5312c2594db509b8e4c9f731d",
"grade": false,
"grade_id": "imports",
"locked": false,
"schema_version": 1,
"solution": true
}
},
"outputs": [
],
"source": [
"# add all your imports here.\n",
"# YOUR CODE HERE\n",
"import pandas as pd\n",
"import numpy as np\n",
"from sklearn.linear_model import LinearRegression\n",
"from sklearn.preprocessing import LabelEncoder\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.svm import SVC\n",
"from sklearn import model_selection\n",
"from sklearn.model_selection import train_test_split"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "
cbe1db68acf76763db3e19b29162d9",
"grade": false,
"grade_id": "cell-56b1c85226f679a1",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"---\n",
"# Part A: Load and Clean Data (20 points)\n",
"\n",
"Save your CSV data file into the same folder as this notebook.\n",
"\n",
"Write Python code to load your dataset into a Pandas DataFrame called 'sales'."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"deletable": false,
"nbgrader": {
"checksum": "f1866306460b18ba8285f9073b7870
",
"grade": false,
"grade_id": "read_sales",
"locked": false,
"schema_version": 1,
"solution": true
}
},
"outputs": [
],
"source": [
"# YOUR CODE HERE\n",
"sales = pd.read_csv(\"greenhatsales.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false,
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "465472c32b09a4d2bd97325a77ab7dae",
"grade": false,
"grade_id": "cell-08fd91c8f6a3f1ab",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"source": [
"After you have loaded the data correctly, you should have 10,000 rows. \n",
"Run the following cells and tests to check that you have done this correctly."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"deletable": false,
"editable": false,
"nbgrader": {
"checksum": "30e709e48a4861e49bfcd0f34e07af3b",
"grade": false,
"grade_id": "cell-802dd990ff7
39a",
"locked": true,
"schema_version": 1,
"solution": false
}
},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"