COMP1730/6730 S1 2021 — Project Assignment
Jeffrey Fisher
XXXXXXXXXX
Important
• The assignment is due 9:00 am Monday May 24 (in week 12).
• The code for the assignment can be developed in groups of up to four people.
• The report is individual - you must write it entirely on your own.
• COMP6730 students have to write a more extensive report.
• Include your university ID in every file you submit.
• Include the university ID of every member of your group in any code files you submit.
• This assignment is worth 25% of your grade for COMP1730/COMP6730.
Groups
• The code for the assignment can be completed in groups of up to 4 people. If you wish to work in a
group you should sign-up for one in the sign-up form in Wattle (all members should sign-up to the
same group).
• The report is individual, i.e. you must do it on your own. We will check for plagiarism and other
suspicious activities in the reports within groups as normal.
• If you do not wish to work with other people, please sign-up to the I will do the assignment on my own
group instead.
• There is no difference in size/scope/marking criteria based on your group size. Unless you have strong
feelings otherwise, we recommend you work in a group of 3 or 4 people.
• COMP6730 and COMP1730 students can be part of the same group.
• You are not limited to members of your tutorial discussion groups - you can form a group with anyone
else enrolled in the course.
• The sign-up link in Wattle is here: https://wattlecourses.anu.edu.au/mod/groupselect/view.php?id=2137144
• Group sign-ups will close at 9:00am Monday 10 May. Anyone not signed up to a group at that point
will be added to the I will do the assignment on my own group.
Overview
In this assignment you will be doing a series of data analysis and modelling tasks using some real-world
geographical data. It is different to the homework assignments you have done up until this point in that for
almost all of the questions, there is no single “right” answer. You are also not given any tests against which
to check your answers (although you are encouraged to write your own to help you test the correctness of
your functions). Because there is no single “right” answer, it will be important to justify the decisions and
choices that you make while completing the assignment. This is important since it allows anyone relying
on your results and conclusions to understand how they were obtained and whether they are suitable for a
particular purpose.
The Cotter River provides the ACT with the majority of its water supply. The river stretches over 70
kilometers from the South West edge of the ACT, Northward until it joins with the Murrumbidgee River,
just below the Cotter Dam. In addition to the Cotter Dam, there are two other reservoirs along the Cotter
River, Bendora Dam and Corin Dam. These three dams store the majority of water used in the ACT.
The Cotter Dam (left) and Corin Dam (right) are both overflowing after the wet Summer and Autumn we’ve
had in Canberra this year.
For this assignment, we have obtained elevation data for the majority of the Cotter River catchment area,
at a 5 meter resolution. You will be analysing this data and answering some questions about the region,
including some questions that are relevant to Canberra’s drinking water supply.
The Data
We have provided you with two csv files - elevation_data_small.csv and elevation_data_large.csv.
These two csv files contain height information on a 5 metre grid. The elevation_data_small.csv file
contains just the Cotter Dam region. The elevation_data_large.csv contains the entire Cotter River
catchment area including the Cotter Dam, Bendora Dam and Corin Dam, as well as the surrounding mountain
ranges and the Namadgi National Park. You can see a heatmap of the two data sets in the images below.
Brighter colours are higher elevation. You can also see the Cotter Dam on Google maps here, Bendora Dam
here and Corin Dam here.
Elevation data for the Cotter Dam (left) and Cotter River catchment area (right).
Cotter Dam: https://www.google.com/maps/place/Cotter+Dam,+Australian+Capital+Territory/@ XXXXXXXXXX, XXXXXXXXXX,14.67z/data=!4m5!3m4!1s0x6b17b76c44194b5f:0xc7482c48c1462c8!8m2!3d XXXXXXXXXX!4d XXXXXXXXXX
Bendora Dam: https://www.google.com/maps/place/Bendora+Dam/@ XXXXXXXXXX, XXXXXXXXXX,12.63z/data=!4m5!3m4!1s0x6b17c389e536a9cd:0xe832a1de46823c11!8m2!3d XXXXXXXXXX!4d XXXXXXXXXX
Corin Dam: https://www.google.com/maps/place/Corin+Dam,+Cotter+River+ACT+2611/@ XXXXXXXXXX, XXXXXXXXXX,14.67z/data=!4m5!3m4!1s0x6b17db7b770167cf:0x34d44ef9262ceac!8m2!3d XXXXXXXXXX!4d XXXXXXXXXX
The elevation_data_small file looks like this (but with a lot more rows and columns):
693.366,692.038,690.964,690.964,...
693.406,692.079,691.025,691.025,...
693.383,692.039,691.018,691.018,...
693.457,692.085,691.058,691.058,...
693.457,692.107,691.091,691.091,...
... ,... ,... ,... ,...
All elevation values are in meters.
This means that the elevation of the north-westernmost point in the region is 693.366 meters, the point 5 meters to
the east of it has elevation XXXXXXXXXX meters, and so forth.
If we need to refer to a specific cell in the data, we can do so using its x and y coordinates. We’ll use matrix
style indexing so the origin (x=0, y=0) is the top left grid cell, rather than the bottom left grid cell you might
see on a traditional graph. The coordinate at x = 2 and y = 4 means the 3rd column from the left and the
5th row from the top (highlighted in yellow in Figure 1).
Figure 1: Indexing Example
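As a minimal sketch of this convention, assume the grid is held as a list of rows. The names `grid` and `elevation_at` below are illustrative only and are not part of the assignment template. With matrix-style indexing the row (y) is looked up first and the column (x) second:

```python
# Hypothetical 5-row, 4-column grid built from the sample values shown above.
grid = [
    [693.366, 692.038, 690.964, 690.964],
    [693.406, 692.079, 691.025, 691.025],
    [693.383, 692.039, 691.018, 691.018],
    [693.457, 692.085, 691.058, 691.058],
    [693.457, 692.107, 691.091, 691.091],
]


def elevation_at(grid, x, y):
    """Return the elevation at column x, row y (origin at the top left)."""
    return grid[y][x]


print(elevation_at(grid, 0, 0))  # north-west corner of the sample
print(elevation_at(grid, 2, 4))  # 3rd column from the left, 5th row from the top
```

Swapping the two indices is an easy mistake to make, so a small helper like this can keep the (x, y) order in one place.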
Be aware that the same location will have different coordinates in the two data sets.
If you are not sure how to read and process CSV files, have a look at Labs 6 and 8 in order to remind yourself.
Please also keep in mind that even the elevation_data_small file is not actually that small: it contains
roughly one million points. You may need to consider efficiency when completing the assignment.
The Task
You are provided with assignment_template.py, which contains the basic functions of the assignment. The
functions are incomplete. In this assignment, you will fill in the blanks and complete the missing functions.
However, we also encourage you to use functional decomposition where appropriate, i.e. you may (and should)
add additional functions as necessary. You will also write a short report about your functions and decisions.
For Questions 1 through 5 you should just make use of the elevation_data_small file. Question 6 requires
you to use the elevation_data_large file as well. Please be aware that if you try and test Questions 1
through 5 using the large data set, you will need to do the cleaning/preparation (described at the start of
Question 6) or you will get nonsensical results.
Question 1: Reading the Data - 10 marks
Write a function that takes the file path of the CSV file as input, reads the file, and returns the data in a
suitable format. The assignment template contains a function for you to fill in:
def read_dataset(filepath):
    pass
pass means “do nothing”, and you should remove it when you fill in this function. To load the data, you can
then run
data = read_dataset('elevation_data_small.csv')
as long as the CSV file is in the same directory as your assignment file. If it is elsewhere, you’ll need to
provide the file path instead of just the file name.
You should read the data from filepath, and return it in an easy-to-use format. This can be any data
type or data structure that you like, as long as it makes sense for the tasks you will be doing later in this
assignment. You will be using this returned value in all other questions of the assignment, so make sure your
choices here support your later solutions!
Hint - have a look at the remaining questions before deciding on what format to load your data in!
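One possible way to fill in this function, sketched here under the assumption that a plain list of rows (each a list of floats) is a suitable format, uses only the standard `csv` module. The choice of data structure is an assumption made for illustration; the assignment leaves it open.

```python
import csv


def read_dataset(filepath):
    """Read a header-less CSV of elevations into a list of rows of floats."""
    with open(filepath, newline="") as f:
        # Skip empty lines so a trailing newline does not produce an empty row.
        return [[float(value) for value in row] for row in csv.reader(f) if row]
```

With this layout, `data[y][x]` gives the elevation at column x and row y, which matches the matrix-style indexing described earlier.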
Question 2: Summary Statistics - 10 marks
Now that we have a function to read in the data set, it’s time to do some analysis. We’ll start by calculating
some basic statistics about the data.
There are three functions to fill in for Question 2:
def minimum_elevation(data_set):
    pass
def maximum_elevation(data_set):
    pass
def average_elevation(data_set):
    pass
The input to each of these functions should be the data set returned by Question 1. The output should be
the minimum elevation, the maximum elevation and the average (mean) elevation respectively of the region
covered by the data set. All return values should be in meters.
The minimum and maximum are each worth 3 marks. The average is worth 4 marks.
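A minimal sketch of these three functions, assuming the data set is a list of rows of floats (an assumption about the Question 1 format, not a requirement of the assignment), could look like this:

```python
def minimum_elevation(data_set):
    """Smallest elevation in the grid, in meters."""
    return min(min(row) for row in data_set)


def maximum_elevation(data_set):
    """Largest elevation in the grid, in meters."""
    return max(max(row) for row in data_set)


def average_elevation(data_set):
    """Mean elevation over every cell in the grid, in meters."""
    total = sum(sum(row) for row in data_set)
    count = sum(len(row) for row in data_set)
    return total / count


sample = [[1.0, 2.0], [3.0, 6.0]]
print(minimum_elevation(sample), maximum_elevation(sample), average_elevation(sample))
```

Each function makes a single pass over the data, which matters for a grid of roughly one million points.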
Question 3: Gradient - 10 marks
The Cotter River valley is pretty rugged country. There are steep mountain ranges on either side of the river
for most of its length. It’s useful to know how steeply sloped an area is. For example, slope is used when
planning walking or fire trails, estimating the risk of landslides, assessing bushfire risk, and so on.
For a given cell we calculate the slope by subtracting the elevation in the cell on its left from the elevation in
the cell on its right - then dividing by 10 (the horizontal distance). This is the x gradient. Then subtract the
elevation in the cell below from the elevation in the cell above, and again divide the result by 10. This is the
y gradient. Square both gradients, add them together, then take the square root. This is the slope, or total
gradient.
Mathematically, if $e_{x,y}$ is the elevation for cell (x, y), the slope at cell (x, y) is calculated as:
$$\text{slope}_{x,y} = \sqrt{\left(\frac{e_{x+1,y} - e_{x-1,y}}{10}\right)^2 + \left(\frac{e_{x,y+1} - e_{x,y-1}}{10}\right)^2}$$
Fill in the function:
def slope(data_set, x_coordinate, y_coordinate):
    pass
It should take as inputs the data set (returned by Question 1), an x coordinate and a y coordinate and return
the total gradient at the corresponding cell.
Hint: You may need to consider the edges of the map separately.
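The calculation can be sketched as follows, assuming the data set is a list of rows of floats with at least two rows and two columns. The edge handling here is one possible choice, not something the assignment prescribes: at the border the missing neighbour is replaced by the cell itself, and the difference is divided by the actual distance between the two cells used (5 meters instead of 10).

```python
import math

CELL_SIZE = 5  # meters between neighbouring grid cells


def slope(data_set, x_coordinate, y_coordinate):
    """Total gradient at column x, row y, using clamped neighbours at the edges."""
    n_rows, n_cols = len(data_set), len(data_set[0])
    # Clamp neighbour indices so edge cells fall back to a one-sided difference.
    left, right = max(x_coordinate - 1, 0), min(x_coordinate + 1, n_cols - 1)
    up, down = max(y_coordinate - 1, 0), min(y_coordinate + 1, n_rows - 1)
    row = data_set[y_coordinate]
    x_gradient = (row[right] - row[left]) / (CELL_SIZE * (right - left))
    y_gradient = (data_set[up][x_coordinate] - data_set[down][x_coordinate]) / (
        CELL_SIZE * (down - up)
    )
    return math.hypot(x_gradient, y_gradient)


grid = [
    [0.0, 0.0, 0.0],
    [10.0, 20.0, 40.0],
    [0.0, 40.0, 0.0],
]
print(slope(grid, 1, 1))  # interior cell: x gradient 3.0, y gradient -4.0
print(slope(grid, 0, 0))  # corner cell: one-sided differences over 5 m
```

For the interior cell the x gradient is (40 - 10)/10 = 3 and the y gradient is (0 - 40)/10 = -4, so the slope is 5.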
Question 4: Surface Area of the Dams - 10 marks
The areas covered by the two elevation data sets are particularly important for Canberra’s water supply.
There was a period of time in the not too distant past where the level in all the dams was dangerously low,
resulting in severe restrictions being placed on water usage in the ACT. One way of measuring the water in
the dam is by calculating its extent - or surface area. Our elevation data just contains the elevation at the
surface - regardless of whether it is water or land. However, if we assume that the dam is all approximately
the same level, then as long as we know the elevation of a single point on the dam, we can figure out the
surface area.
Complete the following function in the assignment template:
def surface_area(data_set, x_coordinate, y_coordinate):
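One way to approach the idea described above is a flood fill from the known point: collect every cell that is connected to it and whose elevation is close to the elevation of that point. The sketch below assumes a list-of-rows data set. The `tolerance` parameter, its default of 0.5 meters, the use of 4-connectivity, and reporting the result in square meters are all assumptions made for illustration and are not taken from the assignment.

```python
from collections import deque

CELL_AREA = 5 * 5  # each grid cell covers 5 m x 5 m


def surface_area(data_set, x_coordinate, y_coordinate, tolerance=0.5):
    """Area in square meters of the connected region level with the seed cell."""
    n_rows, n_cols = len(data_set), len(data_set[0])
    level = data_set[y_coordinate][x_coordinate]
    seen = {(x_coordinate, y_coordinate)}
    queue = deque(seen)
    while queue:
        cx, cy = queue.popleft()
        # Visit the four direct neighbours (hypothetical choice of connectivity).
        for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
            if (0 <= nx < n_cols and 0 <= ny < n_rows
                    and (nx, ny) not in seen
                    and abs(data_set[ny][nx] - level) <= tolerance):
                seen.add((nx, ny))
                queue.append((nx, ny))
    return len(seen) * CELL_AREA


grid = [
    [100.0, 100.1, 120.0],
    [100.0, 130.0, 100.0],
    [125.0, 100.2, 100.0],
]
print(surface_area(grid, 0, 0))  # three connected cells in the top-left corner
```

Using a queue and a set of visited cells avoids deep recursion, which would be a problem on a grid of about one million points.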
Answered 4 days After May 18, 2021

Solution

Shreyan answered on May 21 2021
Modified_Solution/assignment.py
"""
This is the assignment template for the COMP1730/COMP6730 major assignment
for Semester 1, 2021.
The assignment is due at 9:00am on Monday 24 May.
Please include the student IDs of all members of your group here
Student Ids:
"""
import pandas as pd
import numpy as np
import math
import matplotlib.cm as cm
from matplotlib import pyplot as plt
from skimage import data, filters, color, morphology
from skimage.segmentation import flood, flood_fill
import datetime
from sklearn import preprocessing
# Question 1:
def read_dataset(filepath):
    # The file has no header row, so header=None keeps the first row as data.
    df = pd.read_csv(filepath, sep=',', header=None)
    return df
# Question 2:
def minimum_elevation(data_set):
    minValuesCol = data_set.min(axis=0)
    minValue = minValuesCol.min()
    return minValue
def maximum_elevation(data_set):
    maxValuesCol = data_set.max(axis=0)
    maxValue = maxValuesCol.max()
    return maxValue
def average_elevation(data_set):
    sumValuesCol = data_set.sum(axis=0)
    sumValue = sumValuesCol.sum()
    noOfVals = data_set.size
    return sumValue / noOfVals
# Question 3
def slope(data_set, x_coordinate, y_coordinate):
    # x is the column index and y is the row index (matrix-style indexing),
    # so a cell is read as data_set.iloc[y, x].
    n_rows, n_cols = data_set.shape
    # At the map edges the missing neighbour is replaced by the cell itself,
    # giving a one-sided difference (assumes at least 2 rows and 2 columns).
    left, right = max(x_coordinate - 1, 0), min(x_coordinate + 1, n_cols - 1)
    up, down = max(y_coordinate - 1, 0), min(y_coordinate + 1, n_rows - 1)
    # Cells are 5 m apart: interior neighbours are 10 m apart, edge ones 5 m.
    x_gradient = (data_set.iloc[y_coordinate, right]
                  - data_set.iloc[y_coordinate, left]) / (5 * (right - left))
    y_gradient = (data_set.iloc[up, x_coordinate]
                  - data_set.iloc[down, x_coordinate]) / (5 * (down - up))
    return math.sqrt(x_gradient**2 + y_gradient**2)

# Question 4
def surface_area(data_set, x_coordinate, y_coordinate):
    values = np.asarray(data_set, dtype=float)
    # x is the column index and y is the row index, so the seed is (row=y, column=x).
    e = values[y_coordinate, x_coordinate]
    tol = 10  # elevation tolerance in metres
    # Binary image: 1 where the elevation is within tol of the seed elevation.
    level = (np.abs(values - e) <= tol).astype(np.uint8)
    # Keep only the cells connected to the seed cell.
    dam = flood(level, (y_coordinate, x_coordinate))
    # Area as a count of grid cells; each cell covers 5 m x 5 m.
    return int(dam.sum())
# Question 5:
def expanded_surface_area(data_set, water_level, x_coordinate, y_coordinate):
    values = np.asarray(data_set, dtype=float)
    # A seed cell above the water level is not flooded at all.
    if values[y_coordinate, x_coordinate] > water_level:
        return 0
    # Binary image: 1 where the ground is at or below water_level, 0 elsewhere.
    flooded = (values <= water_level).astype(np.uint8)
    # Keep only the flooded cells connected to the seed cell (row=y, column=x).
    dam = flood(flooded, (y_coordinate, x_coordinate))
    plt.imshow(dam, cmap='gray')  # plots the flooded region
    # Area as a count of grid cells; each cell covers 5 m x 5 m.
    return int(dam.sum())
# Question 6:
def impute_missing_values(data_set):
    # Negative elevations are treated as missing and replaced with 1600 m.
    return data_set.mask(data_set < 0, 1600)
# You'll need to decide what other functions you want for Question 6
# It should be clear from your code, what we need to do in order to produce the plot(s).
def produce_plot(data_set, x, y):
    values = np.asarray(data_set, dtype=float)
    fill_value = 1.5 * maximum_elevation(data_set)
    # Flood-fill from the seed cell (row=y, column=x) so the connected region
    # within 10 m of its elevation stands out in the plot.
    filled = flood_fill(values, (y, x), fill_value, tolerance=10)
    plt.imshow(filled, cmap='gray')
# Code in the following if statement will only be executed when this file is run - not when it is imported.
# If you want to use any of your functions (such as to answer questions) please write the code to
# do so inside this if statement. We'll cover it in more detail in an upcoming lecture.
if __name__ == "__main__":
    filename = 'elevation_data_small.csv'
    data_set = read_dataset(filename)

    # Finding the minimum, maximum and...