A1b_SQLQueries
School of Accounting, Information Systems and Supply Chain Semester XXXXXXXXXX
Business Data Management and Analytics v1.1 Page 1 of 3
elongs to belong
for
has
has
has
SCHOOL OF AISSC
Business Data Management and Analytics
Assignment 1 – SQL Queries – Part B
Due Date: End of Week 8 (Friday 18th September, 2020)
CASE STUDY
Relational Model
PEOPLE STATE COUNTRYSTATS DIAGNOSIS OCCURRENCE
People_id State_id CountryStats_id Diagnosis_id Occu
ence_id
Firstname Name life_expectancy Occu
ence_id Date_of_occu
ence
Lastname Country_id population People _id State_id
DOB population_density Outcome_id Virus_id
Sex gdp_per_capita Total_confirmed
extreme_poverty OUTCOME Total_recovered
CONTACT COUNTRY cardiovasc_death_rate Outcome_id Total_deaths
Contact_id Country_id diabetes_prevalence Description Total_active
From_people_id Name total_tests
To_people_id language median_age CLUSTER VIRUS
Cluster_id Continent aged_65_older Cluster_id Virus_id
Iso_code aged_70_older Name Name
Country_id
Entity-Relationship Diagram
Business Rules
• If cluster_id in CONTACT table is NULL then unknown origin; otherwise belongs known cluster.
• Occu
ence is a given day on which a specific VIRUS in each STATE/Location has diagnosed
cases of people for a given outcome.
• CONTACT table shows PEOPLE who have encountered other PEOPLE for more than 15
minutes.
PEOPLE
DIAGNOSIS
COUNTRY
OCCURRENCE
have
STATE
VIRUS
CLUSTER contacts
COUNTRYSTATS
OUTCOME
School of Accounting, Information Systems and Supply Chain Semester XXXXXXXXXX
Business Data Management and Analytics v1.1 Page 2 of 3
SPECIFICATIONS
Read the following questions carefully. You will be asked to specify SQL queries to answer them.
QUESTIONS
You will be working with a set of tables for a PANDEMIC database. You can access these
tables by using the PANDEMIC database on the MySQL server (mo.its.rmit.edu.au). You are to
prepare 11 SQL query statements and 2 visualisations that will provide answers to the following
12 requests.
(1 marks per question)
1. Create a view in your database (you cannot create view in the pandemic database!) called
AustraliaOccu
ences; This view will contain a list of all occu
ences that belong to the
country Australia. Please provide the following information in this view: number of days
since the pandemic start in Australia (note: use the “first_occurence” view to get start date
for Australia) that this occu
ence occu
ed, the occu
ence date, state name, longitude and
latitude, and the total confirmed.
2. Improve question 3 from assignment 1a, “What is the status of a person, whose people_id is
1000? Provide all the diagnosis information for this person."; Provide the name of the
person, in the format of “Surname, Given”, the description of the outcome of the diagnosis,
the date of occu
ence for each diagnosis and country name.
3. Improve question 9 from assignment 1a, “Based on the state_id, show the number of
ecorded occu
ences that occu
ed for each state_id. Also, show the total deaths for that
state_id. Show only the states that have had more than 150 days of occu
ences (Note: an
occu
ence is one days reported figures for a virus).”. Add the state name and country name,
please provide this in the following format, state_name (country_name), e.g. Victoria
(Australia).
4. Show a list of all people who have been confirmed diagnosis in Victorian (Australia). Show
the first initial of the first name with a full stop and the last name, for example: F. Flintstone,
include the age of the person, the gender (show as “Male” or “Female”, if unspecified show
“N/A”) and the date they were confirmed to have the virus.
5. Based on the data in the diagnosis table, provide a list of people by their age, for each age
show a count of confirmed cases. Show the list from youngest to oldest age.
6. List all the outcomes and a count of the number of diagnosis for each. Please include all
outcomes in this query, even the ones with no diagnoses cases.
7. Based on the “delectus” cluster, show all the people that have encounter each other within
this cluster. Show their full name.
8. Show the cluster that has the MOST contacts. Show only the cluster id and cluster name for
this cluster.
9. Show the oldest and youngest female person in the database. Show their full name and age.
10. Create a report that shows for each cluster which state and country are involved. Show the
name of the cluster and the state and country like this: state_name (country name). Show
this data in cluster name order and then state name order.
(2 marks)
11. Choose one questions (from questions 1-10 of this assignment) and create a visualisation,
using Excel, Tableau or Orange. Attach the created image ONLY to your submission. The
visualisation will be judged on how well it represents the question chosen, how clear the
various axis and legends are label, and the appropriateness of the title of the visualisation.
School of Accounting, Information Systems and Supply Chain Semester XXXXXXXXXX
Business Data Management and Analytics v1.1 Page 3 of 3
(3 marks)
12. Produce a report of your own design and write a query to solve it. Marks will be awarded
for report design (ie. How useful is the report), complexity of the query and originality.
Please provide:
a) Business question
) SQL query
c) Visualisation using Excel, Tableau or Orange (attached image only to submission)
Note: you can not use the MySQL LIMIT operator, because it is NOT standard SQL.
Note: when you run some SQL queries, the data will change on a daily basis, because the database
is being update for certain countries on a regular basis.
REQUIREMENTS
11 SQL queries that answer the questions asked, based on the data model and
implemented database (on mo.its.rmit.edu.au) provided.
2 visualisations (using Excel, Tableau or Orange) will be created and an image will be
submitted
ASSESSMENT
Assessment of the data model will be based on the following areas (by the tutor):
• How well the query answers the questions, in relation to the case study provided.
• Understanding of data structure
• Efficiency and simplicity of resulting query
DEMONSTRATION
Selected students will be required to attend a demonstration session where they will be
asked to demonstrate and explain the queries they have written and to write several new
queries for the same database. Failure to explain the queries written or the inability to
write new queries will result in a FAIL mark being recorded for assignment 1b.
SUBMISSION
• SQL queries (output not required) only.
• Screen dump of Visualisation (can be in a word document or separate image files)
• Assignment will be submitted online using the learning hub.
ISYS2421
School of Accounting, Information Systems and Supply Chain Semester XXXXXXXXXX
Business Data Management and Analytics v2.1 Page 1 of 3
elongs to belong
for
has
has
has
ISYS2421 - Business Data Management and Analytics
Assignment 1 – SQL Queries – Part A
Due Date: End of Week 3 (Friday 7th August, 2020)
CASE STUDY
Relational Model
PEOPLE STATE COUNTRYSTATS DIAGNOSIS OCCURRENCE
People_id State_id CountryStats_id Diagnosis_id Occu
ence_id
Firstname Name life_expectancy Occu
ence_id Date_of_occu
ence
Lastname Country_id population People _id State_id
DOB population_density Outcome_id Virus_id
Sex gdp_per_capita
extreme_poverty OUTCOME CLUSTER
CONTACT COUNTRY cardiovasc_death_rate Outcome_id Cluster_id
Contact_id Country_id diabetes_prevalence Description Name
From_people_id Name total_tests
To_people_id language median_age VIRUS
Cluster_id Continent aged_65_older Virus_id
Iso_code aged_70_older Name
Country_id
Entity-Relationship Diagram
Business Rules
• If cluster_id in CONTACT table is NULL then unknown origin; otherwise belongs known cluster.
• Occu
ence is a given day on which a specific VIRUS in each STATE/Location has diagnosed
cases of people for a given outcome.
• CONTACT table shows PEOPLE who have encountered other PEOPLE for more than 15 minutes.
Data in the Occu
ence, State, Country and Countrystats tables are extracted from this site: https:
ourworldindata.org
PEOPLE
DIAGNOSIS
COUNTRY
OCCURENCE
have
STATE
VIRUS
CLUSTER contacts
COUNTRYSTATS
OUTCOME
https:
ourworldindata.org
School of Accounting, Information Systems and Supply Chain Semester XXXXXXXXXX
Business Data Management and Analytics v2.1 Page 2 of 3
SPECIFICATIONS
Read the following questions carefully. You will be asked to specify SQL queries to answer them.
QUESTIONS
You will be working with a set of tables for a PANDEMIC database. You can access these
tables by using the PANDEMIC database on the mysql server (mo.its.rmit.edu.au). You are to
prepare 10 SQL query statements that will provide answers to the following 10 requests for
information. Please NOTE: all these queries require only ONE TABLE in the FROM clause.
1. Show a list of all the countries that belong to the Oceania continent. Show the country name
and the international standards office code. Sort the data shown in country name ascending
order.
2. Business is looking for all the people we have in the database whose surname has “berg” in
it and is between 49 and 51 years old. Show the persons full name, gender and their age.
3. What is the status of a person, whose people_id is 1000? Provide all the diagnosis
information for this person.
4. The State_id 58 is Victoria, Australia. Provide a list of deaths for Victoria in date of
occu
ence order, from latest death figures to the older death figures. Show the date of
occu
ence and number of deaths.
5. Display the occu
ences that occu
ed on a Monday and had 10 or more deaths. Select only
the state_id where it is equal to the state_id is for the state of “Lombardia” is (you will need
to find this state_id from the state table). Show the occu
ence date, number of confirmed
cases, number of people recovered, and the number