Matlab Project
Computer Software for Sciences COSC2836, March 2021
The goals of this project are to:
1. Learn a clustering algorithm called K-means clustering
2. Implement this algorithm using Matlab
3. Use the implemented code to cluster sample data
4. Change different parameters of the code and examine the effect of each parameter on the
final clustering result
5. Write a report describing how the code was implemented, how the experiments were
performed, and what the results were
Description
In this part, we learn what clustering is and what K-means clustering is. There are thousands of
resources on the internet that describe this algorithm. Please search for further
descriptions if the following definition is not enough for you to understand the algorithm.
Clustering is the task of dividing a population (a set of objects or data points) into a number of
groups such that data points in the same group (called a cluster) are more similar to each other
than to those in other groups. In simple words, the aim is to segregate groups
with similar traits and assign them to clusters.
Clustering itself is not one specific algorithm but the general task to be solved. It can be achieved by
various algorithms that differ significantly in their understanding of what constitutes a cluster and
how to find clusters efficiently. Every methodology follows a different set of rules for defining the
‘similarity’ among data points. Many clustering algorithms are known, but only a few of them
are in common use.
One type of clustering model is the iterative algorithm in which the notion of similarity is derived from
the closeness of a data point to the centroid of a cluster. The K-means clustering algorithm is a
popular algorithm that falls into this category. In these models, the number of clusters has to
be specified beforehand, which makes prior knowledge of the dataset important.
K-means is an iterative clustering algorithm that aims to group n data points (x_1, x_2,
..., x_n) into k (≤ n) clusters S = {S_1, S_2, ..., S_k} in which each data point belongs to the cluster with the
nearest mean. The algorithm works in the following steps:
1. Specify the desired number of clusters K: Let us choose K = 2 for 5 data points in 2-D space.
2. Randomly assign each data point to a cluster (initialization step): Let’s assign three
points to cluster 1, shown in red, and two points to cluster 2, shown in grey.
3. Compute cluster centroids: The centroid of the data points in the red cluster is shown as the
red cross and that of the grey cluster as the grey cross.
4. Re-assign each point to the closest cluster centroid (assignment step): Note that the
data point at the bottom is the only one assigned to the red cluster despite being closer to the
centroid of the grey cluster. Thus, we re-assign that data point to the grey cluster.
5. Re-compute cluster centroids (update step): Now, re-compute the centroids of both
clusters.
6. Repeat steps 4 and 5 until no improvements are possible: Repeat steps 4 and 5 until no
data point switches clusters between two successive iterations. This marks the termination
of the algorithm unless a maximum number of iterations is explicitly specified. The result is a
local optimum, which is not necessarily the global one.
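The steps above can be sketched as a short MATLAB script. This is a minimal illustration, not the assignment's reference solution: the data values, the variable names, and the fixed starting assignment are assumptions chosen so the run is repeatable, and the sketch does not handle a cluster that becomes empty.

```matlab
% Five made-up 2-D points and K = 2, mirroring the example in the steps.
X = [1 1; 1.5 2; 1 0.5; 8 8; 9 9.5];
K = 2;
maxIter = 100;                 % safety cap on the number of iterations

% Step 2: initial assignment. Fixed here so the run is repeatable;
% a random start would be idx = randi(K, size(X, 1), 1).
idx = [1; 2; 1; 2; 1];

for iter = 1:maxIter
    % Steps 3 and 5: the centroid of a cluster is the mean of its points.
    centroids = zeros(K, size(X, 2));
    for k = 1:K
        centroids(k, :) = mean(X(idx == k, :), 1);
    end

    % Step 4: reassign each point to the centroid with the smallest
    % squared Euclidean distance (implicit expansion, R2016b or later).
    newIdx = zeros(size(idx));
    for i = 1:size(X, 1)
        d = sum((centroids - X(i, :)) .^ 2, 2);   % K-by-1 distances
        [~, newIdx(i)] = min(d);
    end

    % Step 6: stop as soon as no point changes cluster.
    if isequal(newIdx, idx)
        break;
    end
    idx = newIdx;
end
```

For this data the loop stops in the second iteration with idx equal to [1; 1; 1; 2; 2]. A different starting assignment can lead to a different result on less clearly separated data.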
The objective of the K-means algorithm is to find a set of clusters S = {S_1, S_2, ..., S_k} for the data points (x_1,
x_2, ..., x_n) that minimizes the within-cluster variances. The following equation formulates this
objective:
$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2$$
where μ_i is the mean of the data points in the cluster S_i.
Note that K-means finds a local minimum of this objective, not necessarily the global minimum.
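To make the objective concrete, the following sketch evaluates the within-cluster sum of squared distances for one fixed clustering. The data, the labels, and the variable name J are assumptions for illustration only.

```matlab
X   = [1 1; 1.5 2; 1 0.5; 8 8; 9 9.5];   % made-up data, one point per row
idx = [1; 1; 1; 2; 2];                    % cluster label of each row
K   = 2;

J = 0;                                    % within-cluster sum of squares
for i = 1:K
    Si    = X(idx == i, :);               % points in cluster S_i
    mu_i  = mean(Si, 1);                  % centroid (mean) of S_i
    diffs = Si - mu_i;                    % implicit expansion (R2016b or later)
    J = J + sum(diffs(:) .^ 2);           % add the squared distances to mu_i
end
```

Here J comes out as 71/24, roughly 2.958. Comparing J across runs with different starting assignments or different K is one simple way to judge which clustering fits the data more tightly.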
Answered 3 days After Mar 31, 2021

Solution

Kshitij answered on Apr 03 2021
knns/computeCentroids.m
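function centroids = computeCentroids(X, idx, K)
% Return a K-by-n matrix whose i-th row is the mean of the rows of X
% that are assigned to cluster i.
    n = size(X, 2);
    centroids = zeros(K, n);

    for i = 1:K
        xi = X(idx == i, :);             % points assigned to cluster i
        % mean along dimension 1 works for any number of columns;
        % an empty cluster yields a row of NaN.
        centroids(i, :) = mean(xi, 1);
    end
end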
knns/getClosestCentroids.m
function indices = getClosestCentroids(X, centroids)
% Assign each row of X to the nearest centroid, measured by squared
% Euclidean distance.
    K = size(centroids, 1);
    m = size(X, 1);
    indices = zeros(m, 1);
    for i = 1:m
        % Start with centroid 1 as the best candidate so far.
        k = 1;
        min_dist = sum((X(i,:) - centroids(1,:)) .^ 2);
        for j = 2:K
            dist = sum((X(i,:) - centroids(j,:)) .^ 2);
            if dist < min_dist
                min_dist = dist;
                k = j;
            end
        end
        indices(i) = k;
    end
end
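The double loop in getClosestCentroids can also be written without explicit loops. The sketch below is one possible vectorized form, not part of the posted solution; the sample points and centroids are made up, and it relies on implicit expansion (MATLAB R2016b or later).

```matlab
X = [1 1; 1.5 2; 8 8];            % made-up points, one per row
centroids = [1 1; 9 9];           % made-up centroids, one per row

% D(i, j) = squared Euclidean distance from point i to centroid j,
% using ||x - c||^2 = ||x||^2 - 2*x*c' + ||c||^2.
D = sum(X .^ 2, 2) - 2 * (X * centroids') + sum(centroids .^ 2, 2)';

[~, indices] = min(D, [], 2);     % index of the nearest centroid per row
```

For these inputs indices is [1; 1; 2]. Ties go to the lower centroid index, which matches the strict `<` comparison in the loop version.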
knns/initCentroids.m
function centroids = initCentroids(X, K)
% Pick K distinct rows of X at random as the initial centroids.
    randidx = randperm(size(X, 1));      % random permutation of the row indices
    centroids = X(randidx(1:K), :);
end
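Choosing K observations at random as the starting centroids, as initCentroids does, is commonly called the Forgy method. A compact way to try it on its own is sketched below; the data and the variable name rowIdx are assumptions for illustration.

```matlab
X = [1 1; 1.5 2; 1 0.5; 8 8; 9 9.5];   % made-up data with distinct rows
K = 2;

% Draw K distinct row indices at random and use those rows as centroids.
rowIdx = randperm(size(X, 1), K);
centroids = X(rowIdx, :);
```

Because the draw is random, repeated runs can start from different centroids and so end in different clusterings. Calling rng with a fixed seed beforehand makes an experiment repeatable.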
knns/reportKNN.docx
1- Implement the above-mentioned K-means algorithm using MATLAB (it is highly recommended that you write a separate function for each step).
1. The inputs of your code will be:
a. The maximum number of allowed iterations
b. K (number of clusters)
2. Use the Forgy method in the initialization step
3. Use squared Euclidean distance in the assignment step to find the nearest mean to a data point.
4....