
In this assignment we use the provided dataset to predict whether a bike left in a specific area will be stolen, lost,
or neither.


1) Project Specifications & deliverables:
Both the police department and the general public would make use of a software product that can give them an
idea of the likelihood of bicycle theft. For the police department, it would help in taking better anti-theft
measures around certain neighborhoods. For members of the public, it would help them assess the need for
additional precautions such as locks.
The dataset described in section four of this document is actual data collected over a period of
four years by the Toronto police department. Based on it, you need to build a predictive service that, given
certain features, classifies a bike as either likely to be stolen or not.
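
One way to frame the classification target is as a binary label derived from the reported status of each bike. The sketch below is a minimal illustration, not the assignment's prescribed method: it assumes a `Status` column whose values include `STOLEN`, and the sample rows and the `is_stolen` column name are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; the real values come from the downloaded CSV.
sample = pd.DataFrame({
    "Status": ["STOLEN", "UNKNOWN", "STOLEN", "RECOVERED"],
})

# Assumption: a row is a positive example only when Status is "STOLEN".
sample["is_stolen"] = (sample["Status"].str.upper() == "STOLEN").astype(int)

# Class balance affects how a classifier should be evaluated, so check it early.
class_counts = sample["is_stolen"].value_counts().to_dict()
print(class_counts)
```

How the non-stolen statuses should be grouped is a modelling decision; the rule above is only one possible choice.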
Please arrange to provide the following deliverables for your project.

1. Data exploration: a complete review and analysis of the dataset including:
a) Load and describe data elements (columns); provide descriptions and types, with values, of each element –
use pandas, numpy and any other Python packages
b) Statistical assessments including means, averages, correlations
c) Missing data evaluations – use pandas, numpy and any other Python packages
d) Graphs and visualizations – use pandas, matplotlib, seaborn, numpy and any other Python packages; you can
also use Power BI Desktop.
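
Deliverables b) and c) above can be sketched with a few pandas calls. This is a minimal illustration under stated assumptions: the rows are made up, and the column names `Cost_of_Bike`, `Bike_Speed` and `Bike_Colour` are assumed to match the downloaded file.

```python
import numpy as np
import pandas as pd

# Hypothetical rows standing in for the real CSV.
df = pd.DataFrame({
    "Cost_of_Bike": [500.0, np.nan, 1200.0, 800.0],
    "Bike_Speed": [21, 18, 10, 1],
    "Bike_Colour": ["BLK", None, "RED", "BLU"],
})

numeric = df[["Cost_of_Bike", "Bike_Speed"]]

# b) Statistical assessments: per-column mean and pairwise correlation.
# Both skip missing values by default.
means = numeric.mean()
correlations = numeric.corr()

# c) Missing data: number of nulls per column, and the share of rows affected.
missing_counts = df.isna().sum()
missing_share = df.isna().mean()

print(means, correlations, missing_counts, missing_share, sep="\n\n")
```

Reporting the missing share alongside the raw count makes it easier to decide whether a column should be imputed or dropped.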


2) Data Set
This dataset contains actual bicycle theft occurrences from XXXXXXXXXX in the city of Toronto.
In accordance with the Municipal Freedom of Information and Protection of Privacy Act, the Toronto Police Service
has taken the necessary measures to protect the privacy of individuals involved in the reported occurrences. No
personal information related to any of the parties involved in the occurrence will be released as open data.
The locations of crime occurrences have been deliberately offset to the nearest road intersection node to protect
the privacy of parties involved in the occurrence. All location data must be considered as an approximate location
of the occurrence and users are advised not to interpret any of these locations as related to a specific address or
individual.
The reported crime dataset is intended to provide communities with information regarding public safety and
awareness. The data supplied to the Toronto Police Service by the reporting parties is preliminary and may not
have been fully verified.

http://data.torontopolice.on.ca/datasets/16f2b8a1c76547c69fec14b7f8541ffc_0
Use the download tab and select spreadsheet to download the dataset as a CSV file.
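
Once the file is downloaded, deliverable a) might start along these lines. The inline CSV text and its values are hypothetical stand-ins so the example runs without the real file; with the real download, pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Stand-in for the downloaded file (hypothetical values, a subset of columns).
csv_text = io.StringIO(
    "X,Y,Primary_Offence,Cost_of_Bike,Status\n"
    "-79.38,43.65,THEFT UNDER,500,STOLEN\n"
    "-79.40,43.66,THEFT UNDER,,UNKNOWN\n"
)
df = pd.read_csv(csv_text)

# a) Describe data elements: size, column types, and per-column summaries.
print(df.shape)
print(df.dtypes)
print(df.describe(include="all"))
```

`describe(include="all")` reports counts and unique values for text columns as well as numeric summaries, which covers both the "types" and "values" parts of the deliverable in one table.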
Answered Same Day Nov 29, 2021

Solution

Ximi answered on Nov 30 2021
130 Votes
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Untitled1.ipynb",
"provenance": []
},
"kernelspec": {
"name": "python2",
"display_name": "Python 2"
}
},
"cells": [
{
"cell_type": "code",
"metadata": {
"id": "N8cTVfcveg4u",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 207
},
"outputId": "d77e0ab1-41f0-4d76-d49c-
2c135bd60f"
},
"source": [
"!wget https://opendata.arcgis.com/datasets/22bfc3619d69447fadd984fcf77a5550_0.csv"
],
"execution_count": 2,
"outputs": [
{
"output_type": "stream",
"text": [
"--2019-11-29 16:10:20-- https://opendata.arcgis.com/datasets/22bfc3619d69447fadd984fcf77a5550_0.csv\n",
"Resolving opendata.arcgis.com (opendata.arcgis.com)... 52.55.1.10, 35.153.242.213\n",
"Connecting to opendata.arcgis.com (opendata.arcgis.com)|52.55.1.10|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: unspecified [text/csv]\n",
"Saving to: ‘22bfc3619d69447fadd984fcf77a5550_0.csv’\n",
"\n",
"\r 22bfc3619 [<=> ] 0 --.-KB/s \r 22bfc3619d [ <=> ] 215.59K 854KB/s \r 22bfc3619d6 [ <=> ] 1.71M 3.38MB/s \r22bfc3619d69447fadd [ <=> ] 4.62M 7.76MB/s in 0.6s \n",
"\n",
"2019-11-29 16:10:21 (7.76 MB/s) - ‘22bfc3619d69447fadd984fcf77a5550_0.csv’ saved [4841218]\n",
"\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "code",
"metadata": {
"id": "Z7pskHs67HsP",
"colab_type": "code",
"colab": {}
},
"source": [
"import pandas as pd\n",
"# Load the CSV saved by the wget cell above\n",
"df = pd.read_csv('22bfc3619d69447fadd984fcf77a5550_0.csv')"
],
"execution_count": 0,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "TpUS6MB67jXM",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 35
},
"outputId": "3ac71d85-d5ff-4304-bd0f-815d33007014"
},
"source": [
"# print data shape\n",
"print (\"Number of rows and columns\", df.shape)"
],
"execution_count": 7,
"outputs": [
{
"output_type": "stream",
"text": [
"('Number of rows and columns', (17892, 26))\n"
],
"name": "stdout"
}
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "vUHnOvJi8lV8",
"colab_type": "text"
},
"source": [
"### Data Dictionary\n",
"A description of each column is given below.\n",
"\n",
"X, Y - Coordinates of where the theft took place (approximate)<br>\n",
"event_unique_id - Unique identifier of the theft<br>\n",
"Primary_Offence - The primary offence type<br>\n",
"Occurrence_Date/Day/Month/Year/Time - The timestamp of the event<br>\n",
"Division - The division where the event took place<br>\n",
"City - The city in which the theft took place<br>\n",
"Location_Type - The type of location where the theft took place<br>\n",
"Premise_Type - The premise type of the location<br>\n",
"Bike_Make - The make of the bike<br>\n",
"Bike_Model - The model of the bike that was stolen<br>\n",
"Bike_Type - The type of bike<br>\n",
"Bike_Speed - The bike's maximum speed<br>\n",
"Bike_Colour - The colour of the bike<br>\n",
"Cost_of_Bike - The cost of the bike as reported<br>\n",
"Status - The status of the bike, i.e. whether it is reported stolen or unknown<br>\n",
"Neighbourhood - The neighbourhood where the theft took place<br>\n",
"Hood_ID - The neighbourhood's ID\n"
]
},
{
"cell_type": "code",
"metadata": {
"id": "zdg4gT3t7fWJ",
"colab_type": "code",
"colab": {
"base_uri": "https://localhost:8080/",
"height": 512
},
"outputId": "546d3b18-ec90-48c4-94c8-73bfb4b8e018"
},
"source": [
"# Have a glance at some values from the data itself\n",
"print (\"Top 5 rows from the data\")\n",
"df.head()"
],
"execution_count": 10,
"outputs": [
{
"output_type": "stream",
"text": [
"Top 5 rows from the data\n"
],
"name": "stdout"
...