Great Deal! Get Instant $10 FREE in Account on First Order + 10% Cashback on Every Order Order Now

Advanced Data Structures: Assignment 1 DrugBank and Drug Design In this assignment, we play with the DrugBank data to build a graph. You are expected to finish the following tasks. 2 Your Tasks...

1 answer below »
Advanced Data Structures: Assignment
1 DrugBank and Drug Design
In this assignment, we play with the DrugBank data to build a graph. You are expected to finish the following
tasks.
2 Your Tasks (Total: 12 marks)
You should define classes named Vertex and DrugGraph with the following requirements (feel free to define
extra variables, classes, and methods if needed).
1. You should define class Vertex with data attributes drugBankID, (this is used as node label), genericName,
SMILES, url, drugGroups, and score, wasVisited (or known), dist, and path. This class has at least
one method, named displayDrug to print out the information on the screen. (0.5 mark)
2. Define a method named readData under the DrugGraph class to realize the following functionality:
• Load all the provided information from the given text file dockedApproved.tab to an a
ay variable
named vertices which is an attribute of the DrugGraph class. Each element in a
ay vertices
is an object of class Vertex. You should give reasonable initial values to variables wasVisited,
dist, and path. (0.5 mark)
• Load the provided data from the given text file sim mat.tab to an a
ay variable named W which
is also an attribute of the DrugGraph class. Note that, the data in sim mat.tab is the similarity
matrix of the drugs, where the value at the i-th row and the j-th column is the Tanimoto similarity
etween the i-th drug and the j-th drug; the row and column orders of drugs are same as the drug
order in file dockedApproved.tab. Since the values in sim mat.tab are similarities, we need to
convert these values into dissimilarities (i.e., distances) in order to have proper values in a
ay W.
Suppose a value in the file is denoted by vij , the co
esponding value in matrix W should be
wij =
{
1− vij if (1− vij) ≤ threshold and i 6= j
∞ otherwise
, (1)
where threshold ∈ (0, 1). We use threshold = 0.7 in this assignment. Here W is actually an
weighted adjacency matrix (in terms of dissimilarity/distance). (1 mark)
• Based on W, obtain the discrete (unweighted) adjacency matrix A which is an a
ay variable unde
class DrugGraph. A value aij in matrix A is calculated as
aij =
{
1 if wij 6=∞
0 otherwise
, (2)
which means that matrix A is a 0-1 matrix. (0.5 mark)
1
3. Define a method named BFS(int i) for
eath-first search on a graph starting from a drug index i.
(1 mark)
4. Define a method named findModules under the DrugGraph class to find the number of disconnected
components (module) in the graph. This method should print the number of vertices in each module
in an sorted (descending) order. Hint: you may create a variable of integer type named module unde
class Vertex to record the module ID for a drug. Within this method, you should repeatedly call the
BFS method to traverse through unvisited vertices. For each BFS call, the traversed drugs are given
the same module ID (e.g., one of 0, 1, 2, ...). Module 0 should be larger then Module 1, and so on.
That is, Module 0 is the largest module, Module 1 the second, XXXXXXXXXXmark)
5. Define a method named keepAModule(int moduleID) that keeps the connected component with a
given module ID and removes all other modules. Note that, all the methods defined below are based
on the kept module rather than the entire graph which is disconnected. (1 mark)
6. Define a wrapper method named findShortestPath(fromDrug, toDrug, method) to find the shortest
path from fromDrug to toDrug on the connected module, where fromDrug and toDrug are the labels of
two drugs. method is either ‘‘unweighted’’ or ‘‘weighted’’. This method should print the sequence
of DrugBankIDs of the shortest path on the screen. To realize this wrapper method, you may need to
define two extra methods to implement the unweighted algorithm (BFS-based algorithm) and weighted
algorithm (Dijkstra’s algorithm) for finding the shortest path starting from a vertex. (3 marks)
7. Define a method named MSTPrim to find the minimum spanning tree on the connected module. In this
method, you should write the list of DrugBank IDs (can the same order as the drugs in your a
ay
attribute) on the MST to a text file named MSTPrimResult.txt with each row for a DrugBank ID and
associated dist value. This function should also print the total length (i.e., the sum of dist values on
the MST) of the MST on screen. (2 marks)
8. In the main function, instance (named dg) of the DrugGraph should be created. The following methods
of this instance should be called sequentially: (0.5 mark)
(a) dg.readData(...)
(b) dg.findModules(...) (Hint: there are 87 modules, the first module has 1801 vertices.)
(c) dg.keepAModule(0) This call is to keep the largest connected component only and remove othe
components.
(d) dg.findShorestPath(’DB01050’, ’DB00316’, ’unweighted’) (The result should be: ’DB01050’
- ’DB08868’ - ’DB00291’ - ’DB00265’ - ’DB00316’)
(e) dg.findShorestPath(’DB01050’, ’DB00316’, ’weighted’) (The shortest path should be ’DB01050’
- ’DB00788’ - ’DB00264’ - ’DB01297’ - ’DB00316’.)
(f) dg.MSTPrim(...) (Hint: the returned total cost should be 625.65.)
9. Your code should be well commented. (0.5 mark)
3 Submission
• Your source code.
• A PDF printout of your source code.
• Output text file MSTPrimResult.txt (from Task 7).
• A text file that contains all the printed out information on the screen/console.
• If any of the above require files are not submitted, 0 mark will be given to the whole assignment.
2
Answered 5 days After Nov 28, 2022

Solution

Vikas answered on Dec 03 2022
38 Votes
Source Code 1
Source Code
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.List;
import java.util.Queue;
import java.util.Random;
import java.io.*;
import java.util.A
ayList;
import java.util.A
ays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;

**
* This is the class that represents an element of dockedapproved.tab using genericNam
e, SMILES, drugBankID, url,
* drugGroups, score
*
*/

class Vertex {
String genericName;
String SMILES;
String drugBankID;
String url;
String drugGroups;
String score;
boolean visited;
int dist;
String path;

public Vertex(String genericName, String SMILES, String drugBankID, String url, St
ing drugGroups, String score) {
this.genericName = genericName;
this.SMILES = SMILES;
this.drugBankID = drugBankID;
this.url = url;
this.drugGroups = drugGroups;
this.score = score;
this.visited = false;
this.dist = 0;
this.path = "";
}

/**
* This is the Method to display the information of Vertex class
*
* @param void
*
* @return void
*/
Source Code 2

void displayVertex() {
System.out.println(genericName + " " + SMILES + " " + drugBankID + " " + url +
" " + drugGroups + " " + score +
" " + visited + " " + dist + " " + path);
}
}

**
* This is the main class of the program containing main function and all the calculati
ons and outputs are
* generated in this class
*
*/

class DrugGraph {
private HashMap mapWeighted = new HashMap
();
private HashMap mapUnWeighted = new HashMap
();
private Vertex vertices[];
private double tot;
private int size = 1932;
private int moduleToKeep;
private double W[][];
private int A[][];

/**
* This is the readData() method used to read the input from dockedapproved.tab
*
* @param void
*
* @return void
*/

void readData() throws IOException {

read the text file and fill the data a
ay
List list = new A
ayList
();
String line = "";
size = 1932;

try {
BufferedReader
= new BufferedReader(new FileReader("dockedapproved.ta
"));

while((line =
.readLine()) != null) {
String[] values = line.split("\\s+");

int idx = whileAlphabetic(values);

String name = "";
for(int i = 0; i < idx; i++) name += values[i] + " ";

String group = "";
for(int i = idx+3; i < values.length-1; i++) group += values[i] + " ";

String ID = "";
for(int i = 0; i < values.length; i++) {
if(values[i].startsWith("DB")) ID = values[i];
}
Source Code 3

list.add(new Vertex(name, values[idx], ID, values[idx+2], group, value
s[values.length-1]));
}


.close();

vertices = new Vertex[list.size() -1];
for(int i = 1; i < list.size(); i++) {
vertices[i-1] = list.get(i);
}

}
catch(FileNotFoundException e) {
e.printStackTrace();
}

W = new double[1932][1932];
A = new int[1932][1932];

...
SOLUTION.PDF

Answer To This Question Is Available To Download

Related Questions & Answers

More Questions »

Submit New Assignment

Copy and Paste Your Assignment Here