ITECH1103- Big Data and Analytics
Group Assignment – Semester 2, 2018
Worth – 30%
ANALYTIC REPORT (20% - Due Week 11, Sunday 11:55pm) and
PRESENTATION (10% - Due Week 10 in Tutorial Time)
Analytic Report:
Learning Outcomes Assessed: A3, K3, K6, and S2:
Purpose: The purpose of this task is to provide students with practical experience in
working in teams to write a data analytics report that provides useful insights, patterns and
trends in the chosen/given dataset. This activity will give students the opportunity to
show innovation and creativity in applying Watson Analytics and designing useful
visualization solutions and predictive solutions for various analytics problems.
Group Presentation: Week 10 (Scheduled Laboratory)
Learning Outcomes Assessed: K4, A1, A2, V1, V2
Purpose: The purpose of the oral presentation is to provide an opportunity for students
to present the results of their data analysis and to share this knowledge while practicing
their verbal communication skills.
Project Details: Your task for this analytical project is to use an analytical tool (i.e., Watson
Analytics) to explore, analyze and visualize one of the two given datasets. Your tutor will
assign you the dataset. This dataset reflects reported incidents of crime (with the
exception of murders where data exists for each victim) that occurred in the City of
Chicago from 2012. Data is extracted from the Chicago Police Department's CLEAR
(Citizen Law Enforcement Analysis and Reporting) system. In order to protect the
privacy of crime victims, addresses are shown at the block level only and specific
locations are not identified. Your intended audience is a law enforcement agency’s
middle and top middle management. Your primary goal is to provide different and
interesting insights in light of the 20 questions listed below. The datasets can be
downloaded from the following links.
Data Sets:
Dataset 1 - https://data.world/mchadha/chicagocrime-dataset
Dataset 2 - https://data.world/mchadha/dataset-2-chicago-crime
Data Dictionary:
ID - Unique identifier for the record.
Case Number - The Chicago Police Department RD Number (Records Division Number),
which is unique to the incident.
Date - Date when the incident occurred. This is sometimes a best estimate.
Block - The partially redacted address where the incident occurred, placing it on the
same block as the actual address.
IUCR - The Illinois Uniform Crime Reporting code. This is directly linked to the Primary
Type and Description. See the list of IUCR codes
at https://data.cityofchicago.org/d/c7ck-438e.
Primary Type - The primary description of the IUCR code.
Description - The secondary description of the IUCR code, a subcategory of the primary
description.
Location Description - Description of the location where the incident occurred.
Arrest - Indicates whether an arrest was made.
Domestic - Indicates whether the incident was domestic-related as defined by the
Illinois Domestic Violence Act.
Beat - Indicates the beat where the incident occurred. A beat is the smallest police
geographic area – each beat has a dedicated police beat car. Three to five beats make up
a police sector, and three sectors make up a police district. The Chicago Police
Department has 22 police districts. See the beats
at https://data.cityofchicago.org/d/aerh-rz74.
District - Indicates the police district where the incident occurred. See the districts
at https://data.cityofchicago.org/d/fthy-xz3r.
Ward - The ward (City Council district) where the incident occurred. See the wards
at https://data.cityofchicago.org/d/sp34-6z76.
Community Area - Indicates the community area where the incident occurred. Chicago
has 77 community areas. See the community areas
at https://data.cityofchicago.org/d/cauq-8yn6.
FBI Code - Indicates the crime classification as outlined in the FBI's National Incident-
Based Reporting System (NIBRS). See the Chicago Police Department listing of these
classifications at http://gis.chicagopolice.org/clearmap_crime_sums/crime_types.html.
X Coordinate - The x coordinate of the location where the incident occurred in State
Plane Illinois East NAD 1983 projection. This location is shifted from the actual location
for partial redaction but falls on the same block.
Y Coordinate - The y coordinate of the location where the incident occurred in State
Plane Illinois East NAD 1983 projection. This location is shifted from the actual location
for partial redaction but falls on the same block.
Year - Year the incident occurred.
Month - Month the incident occurred.
Day - Day the incident occurred.
Updated On - Date and time the record was last updated.
Latitude - The latitude of the location where the incident occurred. This location is
shifted from the actual location for partial redaction but falls on the same block.
Longitude - The longitude of the location where the incident occurred.
This location is shifted from the actual location for partial redaction but falls on the
same block.
Location - The location where the incident occurred in a format that allows for creation
of maps and other geographic operations on this data portal. This location is shifted from
the actual location for partial redaction but falls on the same block.
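The assignment itself is carried out in Watson Analytics, but it can help to see how a few of the fields in the data dictionary map to typed values. The following Python sketch is only an illustration under stated assumptions: the timestamp layout in DATE_FORMAT, the text values accepted by parse_bool, and the two sample rows are all made up for the example and should be checked against the actual export. The names Incident, parse_bool and parse_row are hypothetical helpers, not part of any tool named in this brief.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime

# Assumed timestamp layout, e.g. "01/14/2017 09:15:00 PM"; adjust if the export differs.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


@dataclass
class Incident:
    """A subset of the data dictionary fields, converted to Python types."""
    case_number: str
    date: datetime
    primary_type: str
    location_description: str
    arrest: bool
    domestic: bool
    district: str


def parse_bool(text: str) -> bool:
    # Assumption: true values are spelled like "true", "yes", "y", "t" or "1".
    return text.strip().lower() in {"true", "t", "yes", "y", "1"}


def parse_row(row: dict) -> Incident:
    # Column names follow the data dictionary headings.
    return Incident(
        case_number=row["Case Number"],
        date=datetime.strptime(row["Date"], DATE_FORMAT),
        primary_type=row["Primary Type"].strip().upper(),
        location_description=row["Location Description"].strip().upper(),
        arrest=parse_bool(row["Arrest"]),
        domestic=parse_bool(row["Domestic"]),
        district=row["District"].strip(),
    )


# Two made-up rows standing in for the real CSV file.
SAMPLE_CSV = """Case Number,Date,Primary Type,Location Description,Arrest,Domestic,District
XX000001,01/14/2017 09:15:00 PM,THEFT,STREET,false,false,8
XX000002,01/16/2017 08:00:00 AM,BATTERY,APARTMENT,true,true,11
"""

incidents = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Keeping District as text rather than a number is a deliberate choice in this sketch, because the field is an identifier and is never summed or averaged.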
You are expected to present the data findings in visual form (i.e., charts and graphs).
This is a group assignment. You will complete it with your team (max 3 members
enrolled in the same laboratory). Each team member is expected to contribute
equally to the project. Each team will turn in one joint document and give a joint
presentation in the timetabled laboratory class in Week 10. In addition, each individual
team member will write a short reflection as part of the report. You will receive feedback
on the draft about presentation choices, content, analysis, and style.
The Questions
Your job is to examine one of the available datasets and present it in a set of
informative graphs and text by answering the following questions.
Guided Questions for Dataset 1
1. What is the total number of reported crimes?
2. How many different types of reported crime are there? (Primary type)
3. Provide a list of the top 21 location descriptions with respect to crimes.
4. Provide a list of the 10 least common location descriptions with respect to crimes.
5. What are the three most common primary types?
6. What are the three least common primary types?
7. How many years of reported crimes are in the data file?
8. How many reported crimes were logged in December of each year?
9. Which year generated the most reported crime in Chicago?
10. Which month generated the most reported crime in Chicago?
11. How many reported crimes resulted in an arrest, and how many did not? (Arrest)
12. How many districts are in this dataset?
13. What are the top 3 districts in terms of reported crimes?
14. What are the bottom 3 districts in terms of reported crimes?
15. Which primary type accounted for the most reported crimes in district “8” in 2014?
16. How many domestic crimes were reported in Chicago?
17. How many domestic crimes were reported from 2012 to 2014?
18. Which day of the week is the busiest in terms of committed crimes?
19. Which location description has the highest number of crimes reported on
weekends?
20. Which location description has the lowest number of crimes reported on
weekends?
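Most of the guided questions above reduce to two operations: counting records per category and filtering by day of the week. The Python sketch below illustrates both on five made-up rows. It is not a substitute for the Watson Analytics work, and the timestamp layout, the sample values, and the helper names ranked, is_weekend and busiest_weekday are assumptions introduced for the example.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout; check it against the actual export.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# Made-up rows; the real dataset has many more columns and records.
ROWS = [
    {"Date": "01/14/2017 09:15:00 PM", "Primary Type": "THEFT", "Location Description": "STREET"},
    {"Date": "01/15/2017 01:00:00 AM", "Primary Type": "THEFT", "Location Description": "STREET"},
    {"Date": "01/16/2017 08:00:00 AM", "Primary Type": "BATTERY", "Location Description": "APARTMENT"},
    {"Date": "01/17/2017 10:30:00 AM", "Primary Type": "THEFT", "Location Description": "SIDEWALK"},
    {"Date": "01/21/2017 11:45:00 PM", "Primary Type": "ASSAULT", "Location Description": "APARTMENT"},
]


def ranked(rows, field):
    """Return (value, count) pairs sorted from most to least frequent."""
    return Counter(row[field] for row in rows).most_common()


def is_weekend(row):
    # weekday() numbers Monday as 0, so Saturday and Sunday are 5 and 6.
    return datetime.strptime(row["Date"], DATE_FORMAT).weekday() >= 5


def busiest_weekday(rows):
    """Name of the weekday with the most records."""
    days = Counter(datetime.strptime(r["Date"], DATE_FORMAT).weekday() for r in rows)
    return DAY_NAMES[days.most_common(1)[0][0]]


by_type = ranked(ROWS, "Primary Type")
top_types = by_type[:3]        # most common primary types
least_types = by_type[-3:]     # least common primary types
weekend_rows = [r for r in ROWS if is_weekend(r)]
weekend_locations = ranked(weekend_rows, "Location Description")
```

Categories with equal counts come back in the order they were first seen, so a "top 3" or "least 3" list can hide ties. It is worth checking the counts on either side of the cut-off before reporting a ranking.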
Guided Questions for Dataset 2
1. What is the total number of reported crimes?
2. How many different types of reported crime are there? (primary type)
3. How many location descriptions of reported crimes are there? (location description)
4. What are the three most common primary types of reported crime?
5. What are the three least common primary types?
6. How many years of reported crimes are in the data file?
7. How many reported crimes were logged in the last week of the
dataset? (Consider 11 January 2017 to 18 January 2017.)
8. Which year generated the most reported crime in Chicago?
9. Which month generated the most reported crime in Chicago?
10. How many reported crimes resulted in an arrest, and how many did not? (arrest)
11. How many districts are in this dataset?
12. Which district in Chicago reported the most crimes in the last year? (last year of
dataset)
13. Which district in Chicago reported the fewest crimes in the last year? (last year of
dataset)
14. Which primary type accounted for the most reported crimes in district “8” last year?
15. How many domestic crimes were reported in Chicago?
16. How many domestic crimes were reported over the past
month? (last year of dataset)
17. Which location description has the highest number of crimes reported in Chicago?
18. Which location description has the lowest number of crimes reported in Chicago?
19. Which location description has the highest number of crimes reported on
weekends?
20. Which location description has the lowest number of crimes reported on
weekends?
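Several Dataset 2 questions depend on a time window ("the last week of the dataset", "last year of dataset"). The Python sketch below shows one way to express such windows, purely as an illustration on four made-up rows. The timestamp layout, the choice to treat both 11 and 18 January 2017 as included, and the helper names parse, in_window and last_year_rows are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp layout; check it against the actual export.
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"

# Made-up rows covering the end of the period named in question 7.
ROWS = [
    {"Date": "12/30/2016 06:00:00 PM", "District": "8", "Domestic": "true"},
    {"Date": "01/10/2017 03:00:00 PM", "District": "8", "Domestic": "false"},
    {"Date": "01/11/2017 12:00:00 AM", "District": "11", "Domestic": "true"},
    {"Date": "01/18/2017 11:59:00 PM", "District": "8", "Domestic": "false"},
]


def parse(row):
    return datetime.strptime(row["Date"], DATE_FORMAT)


def in_window(rows, start, end):
    """Rows whose Date satisfies start <= Date < end (end is exclusive)."""
    return [r for r in rows if start <= parse(r) < end]


def last_year_rows(rows):
    """Rows from the latest calendar year present in the data."""
    latest_year = max(parse(r).year for r in rows)
    return [r for r in rows if parse(r).year == latest_year]


# Assumption: both 11 and 18 January 2017 count, so the exclusive end is 19 January.
last_week = in_window(ROWS, datetime(2017, 1, 11), datetime(2017, 1, 19))

recent = last_year_rows(ROWS)
crimes_per_district = Counter(r["District"] for r in recent).most_common()
```

Using an exclusive end date avoids losing records logged late on the final day, which is an easy mistake when a filter compares timestamps against a date with no time component.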
Task 1- Background information
Write a description of the selected dataset and project, and its importance for the firm.
Information must be appropriately referenced. [1 Page]
Task 2 – Reporting / Dashboards
For your project, perform the relevant data analysis tasks by answering the above
questions, and identify the visualizations and dashboards you need to develop for the
operational manager of the indicated firm. [2-3 Pages]
Task 3 – Advanced Insights: In addition to answering the guided questions, you are expected to
provide at least five (5) insights into the data. These insights will be judged in terms of
quality and complexity.
Task 4 – Research
Justify why these BI reporting solutions/dashboards are chosen in Task 2 (Reporting /
Dashboards) and why those data set attributes are present and laid out in the fashion