This is an individual coding assignment. The objective is to implement the R-tree. Each submission will be graded based on correctness. The rest of the document explains the details. How Your Submission Will Be Tested: [Dataset]: You will be given a dataset which contains 2D points. The dataset will be provided in a text file as the following format: n id 1 x 1 y 1 id 2 x 2 y 2 ... id n x n y n Specifically, the first line gives the number of points in the dataset. Then, every subsequent line gives a point’s id, x-, and y-coordinates. Your program should build an R-tree in memory from the dataset. [Range Query]: You will be given a set of 100 range queries in a text file whose format is: x 1 x’ 1 y 1 y’ 1 x 2 x’ 2 y 2 y’ 2 ... x 100 x’ 100 y 100 y’ 100 That is, each line specifies a query whose rectangle is [x, x′ ] × [y, y′ ]. Then, we will measure its query efficiency as follows. You should output to a disk file: • Firstly, your program should display the time of answering queries by reading the entire dataset sequentially. This time serves as the sequential-scan benchmark to be compared with the cost of your query algorithms that leverage the R-tree. • Secondly, display the number of points returned by each query-note: we need only the number of points retrieved, instead of the details of those points. • Thirdly, display the total running time of answering all the 100 queries, and the average time of each query (i.e., divide the total running time by 100). [Programming Language]: Python, Java, C++ (including variants like C, C#, ...), or any other 1 language approved by the instructor. You can implement the R-tree by using the existing libraries provided in the programming language of your choice (i.e., some standard libraries or the libraries for R-Tree). [Deliverables]: Your submission includes the following components: 1. Source Code: The code you have developed yourself. Make sure your code can be run in the standard general programming environment. 2. Report: Your report should include the following: • A brief description of the main functions in your source code; • A clear specification of the requirements for executing your code such as, OS environment, placement of input files, any input parameters, etc. 3. Zip all your code and report into a single file, and name the file in the following format: yourstudentid surname.zip. Marking: Your total mark earned for this assignment is based on: • [Queries: 60 marks] – Correctness: 50 marks. ∗ [Sequential-Scan Based Method (10 marks)]: If your program correctly answers m (out of 100) queries by reading the entire dataset (reading all the data points) sequentially, you get 10 · (m/100) marks for this part. ∗ [R-Tree Based Method (40 marks)]: If your program correctly answers m (out of 100) queries by searching the R-Tree, you get 40 · (m/100) marks for this part. – Efficiency: 10 marks. If the average query time is at least 5 times faster than sequential scan, you get 10 marks for this part. If at least 2 times faster (but less than 5 times), you get 5 marks. If less than 2 times faster, no marks. • [The Report: 40 marks] – Function Description: 30 marks. If your report includes a clear description of all the functions in your source code, you get 30 marks. If only part of your functions is introduced, you will be given the marks based on the proportion of the correct answers. – Requirement Description: 10 marks. If your report includes a clear description of the requirements for executing your code such as, OS environment, placement of input files, any input parameters, etc, and your report includes the screenshots of the running results (e.g., the average execution time of both sequential-scan and R-Tree based methods, etc.), you get 10 marks. • [Bonus: 10 marks] – Implementing the R-Tree by Using Standard Libraries Only (5 marks). Students are encouraged to implement the R-Tree by using standard libraries provided by the program languages rather than using the existing R-Tree libraries. If you can correctly implement the R-Tree without the help of the existing R-Tree libraries, you get 5 marks as the bonus. – Analysing the Working of R-Tree: (5 marks). In addition to coding, students are encouraged to provide a high-quality report that contains a detailed analysis of the working of R-Tree. You need to select no less than 10 data points from the given dataset, and one query from the given queries. Then, if you can clearly and correctly analyse the process of the R-Tree construction and the query process (the search should traverse 2 several nodes of the tree, and during the construction of the R-Tree, there should be an overflow and a node splitting), you get 5 marks as the bonus. • [Note:] Your final grade=[Queries]+[The Report]+[Bonus]. If the sum of the three items is greater than 100, you get the full marks, say 100 (i.e., min{[Queries]+[The Report]+[Bonus], 100}).

Answered Same DayOct 10, 2021

Functions.txt

import pandas as pd

import sys

import math

B = 4

class NodeManager(object):

def __init__(self):

self.id = 0

self.child_nodes = []

self.data_points = []

self.parent = None

self.MBR = {

'x_min': -1,

'y_min': -1,

'x_max': -1,

'y_max': -1,

}

def calculate_perimeter(self):

return (self.MBR['x_max'] - self.MBR['x_min']) + (self.MBR['y_max'] - self.MBR['y_min'])

def check_underflow(self):

if self.check_leaf():

if self.data_points.__len__() < math.ceil(B / 2):

return True

else:

return False

else:

if self.child_nodes.__len__() < math.ceil(B / 2):

return True

else:

return False

def check_overflow(self):

if self.check_leaf():

if self.data_points.__len__() > B:

return True

else:

return False

else:

if self.child_nodes.__len__() > B:

return True

else:

return False

def check_root(self):

if self.parent is None:

return True

else:

return False

def check_leaf(self):

if self.child_nodes.__len__() == 0:

return True

else:

return False

class DataLoader():

def __init__(self):

pass

self.rt = RTree()

def load_datapoints(self, points_path):

points = open(points_path, "r")

lines = points.readlines()

lines = lines[1:]

x = []

y = []

for i in range(len(lines)):

# for i in range(100):

x.append(int(lines[i].split()[1]))

y.append(int(lines[i].split()[2]))

points = pd.DataFrame({'x': x, 'y': y})

return points

def load_query(self, queries_path):

query_points = open(queries_path, "r")

query_coordinates = query_points.readlines()

x_min = []

x_max = []

y_min = []

y_max = []

for i in range(len(query_coordinates)):

# for i in range(100):

x_min.append(int(query_coordinates[i].split()[0]))

x_max.append(int(query_coordinates[i].split()[1]))

y_min.append(int(query_coordinates[i].split()[2]))

y_max.append(int(query_coordinates[i].split()[3]))

queries = pd.DataFrame({'x_min': x_min, 'x_max': x_max, 'y_min': y_min, 'y_max': y_max, })

return queries

class RTree(object):

def __init__(self):

self.root = NodeManager()

def query(self, node, query):

num = 0

if node.check_leaf():

for point in node.data_points:

if self.is_covered(point, query):

num = num + 1

return num

else:

for child in node.child_nodes:

if self.is_intersect(child, query):

num = num + self.query(child, query)

return num

def is_intersect(self, node, query):

center1_x = (node.MBR['x_max'] + node.MBR['x_min']) / 2

center1_y = (node.MBR['y_max'] + node.MBR['y_min']) / 2

length1 = node.MBR['x_max'] - node.MBR['x_min']

width1 = node.MBR['y_max'] - node.MBR['y_min']

center2_x = (query['x_max'] + query['x_min']) / 2

center2_y = (query['y_max'] + query['y_min']) / 2

length2 = query['x_max'] - query['x_min']

width2 = query['y_max'] - query['y_min']

if abs(center1_x - center2_x) <= length1 / 2 + length2 / 2 and\

abs(center1_y - center2_y) <= width1 / 2 + width2 / 2:

return True

else:

return False

def is_covered(self, point, query):

x_min, x_max, y_min, y_max = query['x_min'], query['x_max'], query['y_min'], query['y_max']

if x_min <= point['x'] <= x_max and y_min <= point['y'] <= y_max:

return True

else:

return False

def insert(self, u, p):

if u.check_leaf():

self.add_data_point(u, p)

if u.check_overflow():

self.handle_overflow(u)

else:

v = self.choose_subtree(u, p)

self.insert(v, p)

self.update_m

(v)

def choose_subtree(self, u, p):

if u.check_leaf():

return u

else:

min_increase = sys.maxsize

best_child = None

for child in u.child_nodes:

if min_increase >...

import pandas as pd

import sys

import math

B = 4

class NodeManager(object):

def __init__(self):

self.id = 0

self.child_nodes = []

self.data_points = []

self.parent = None

self.MBR = {

'x_min': -1,

'y_min': -1,

'x_max': -1,

'y_max': -1,

}

def calculate_perimeter(self):

return (self.MBR['x_max'] - self.MBR['x_min']) + (self.MBR['y_max'] - self.MBR['y_min'])

def check_underflow(self):

if self.check_leaf():

if self.data_points.__len__() < math.ceil(B / 2):

return True

else:

return False

else:

if self.child_nodes.__len__() < math.ceil(B / 2):

return True

else:

return False

def check_overflow(self):

if self.check_leaf():

if self.data_points.__len__() > B:

return True

else:

return False

else:

if self.child_nodes.__len__() > B:

return True

else:

return False

def check_root(self):

if self.parent is None:

return True

else:

return False

def check_leaf(self):

if self.child_nodes.__len__() == 0:

return True

else:

return False

class DataLoader():

def __init__(self):

pass

self.rt = RTree()

def load_datapoints(self, points_path):

points = open(points_path, "r")

lines = points.readlines()

lines = lines[1:]

x = []

y = []

for i in range(len(lines)):

# for i in range(100):

x.append(int(lines[i].split()[1]))

y.append(int(lines[i].split()[2]))

points = pd.DataFrame({'x': x, 'y': y})

return points

def load_query(self, queries_path):

query_points = open(queries_path, "r")

query_coordinates = query_points.readlines()

x_min = []

x_max = []

y_min = []

y_max = []

for i in range(len(query_coordinates)):

# for i in range(100):

x_min.append(int(query_coordinates[i].split()[0]))

x_max.append(int(query_coordinates[i].split()[1]))

y_min.append(int(query_coordinates[i].split()[2]))

y_max.append(int(query_coordinates[i].split()[3]))

queries = pd.DataFrame({'x_min': x_min, 'x_max': x_max, 'y_min': y_min, 'y_max': y_max, })

return queries

class RTree(object):

def __init__(self):

self.root = NodeManager()

def query(self, node, query):

num = 0

if node.check_leaf():

for point in node.data_points:

if self.is_covered(point, query):

num = num + 1

return num

else:

for child in node.child_nodes:

if self.is_intersect(child, query):

num = num + self.query(child, query)

return num

def is_intersect(self, node, query):

center1_x = (node.MBR['x_max'] + node.MBR['x_min']) / 2

center1_y = (node.MBR['y_max'] + node.MBR['y_min']) / 2

length1 = node.MBR['x_max'] - node.MBR['x_min']

width1 = node.MBR['y_max'] - node.MBR['y_min']

center2_x = (query['x_max'] + query['x_min']) / 2

center2_y = (query['y_max'] + query['y_min']) / 2

length2 = query['x_max'] - query['x_min']

width2 = query['y_max'] - query['y_min']

if abs(center1_x - center2_x) <= length1 / 2 + length2 / 2 and\

abs(center1_y - center2_y) <= width1 / 2 + width2 / 2:

return True

else:

return False

def is_covered(self, point, query):

x_min, x_max, y_min, y_max = query['x_min'], query['x_max'], query['y_min'], query['y_max']

if x_min <= point['x'] <= x_max and y_min <= point['y'] <= y_max:

return True

else:

return False

def insert(self, u, p):

if u.check_leaf():

self.add_data_point(u, p)

if u.check_overflow():

self.handle_overflow(u)

else:

v = self.choose_subtree(u, p)

self.insert(v, p)

self.update_m

(v)

def choose_subtree(self, u, p):

if u.check_leaf():

return u

else:

min_increase = sys.maxsize

best_child = None

for child in u.child_nodes:

if min_increase >...

SOLUTION.PDF## Answer To This Question Is Available To Download

- 3/19/22, 12:17 AM Module 19 Challenge https://courses.bootcampspot.com/courses/967/assignments/19045?module_item_id= XXXXXXXXXX/18 Module 19 Challenge Due Sunday by 23:59 Points 100 Submitting a text...SolvedMar 19, 2022
- 3/9/22, 11:46 PM Module 18 Challenge https://courses.bootcampspot.com/courses/967/assignments/19043?module_item_id= XXXXXXXXXX/19 Module 18 Challenge Due Sunday by 23:59 Points 100 Submitting a text...SolvedMar 10, 2022
- Background Since your work with Jennifer on the SellBy project was so successful, you’ve been tasked with another, larger project: analyzing Amazon reviews written by members of the paid Amazon Vine...SolvedMar 02, 2022
- Background Jill commends you for all your hard work. Piece by piece, you’ve been building up your skills in data preparation, statistical reasoning, and machine learning. You are now ready to apply...SolvedMar 02, 2022
- HR Metrics: aSSIGNment 2 (15%) PROJECT SUMMARY This is an individual project. You are to create these reports based on the dataset within the Assignment 2 folder: 1. a corporate profile report 2. a...SolvedFeb 10, 2022
- Background Basil and Sadhana like how you created your earthquake map with two different maps and the earthquake overlay. Now, Basil and Sadhana would like to see the earthquake data in relation to...SolvedFeb 02, 2022
- Background Roza has a partially completed dashboard that she needs to finish. She has a completed panel for demographic information and now needs to visualize the bacterial data for each volunteer....SolvedJan 25, 2022
- Consider the following similar problems to be solved in MapReduce paradigm. We have two text files as follows: File1: Every vaccinated person has good protection against the virus. A vaccinated person...SolvedJan 25, 2022
- A1: Text Adventure Game Development Due Thursday by 10pm Points 100 Submitting a file upload File Types ipynb Available until Jan 27 at 10pm Start Assignment Text Adventure Game Development In this...SolvedJan 24, 2022
- 1/18/22, 11:15 AM Module 11 Challenge Module 11 Challenge Start Assignment Due 23 Jan by 23:59 Points 100 Submitting a text entry box or a website url Background Dana’s webpage and dynamic table are...SolvedJan 18, 2022

Copy and Paste Your Assignment Here

Disclaimer: The reference papers provided by TAE serve as model papers for students and are not to be submitted as it is. These papers are intended to be used for research and reference purposes only.

Copyright © 2022. All rights reserved.