EECE5644_2020Spring_TakeHome4Questions.pdf
EECE5644 Spring 2020 – Take Home Exam 4
Submit: Monday, 2020-April-13 before 11:45ET
Please submit your solutions on Blackboard in a single PDF file that includes all math and
numerical results. Only the contents of this PDF will be graded.
For control purposes, please also provide any code you write to solve the questions in one of
the following ways:
1. Include all of your code in the PDF as an appendix.
2. In the PDF provide a link to your online code repo where the code resides.
This is a graded assignment and the entirety of your submission must contain only your own
work. Do not make your code repo public until after the submission deadline. If people find your
code and use it to solve questions, we will consider it cheating for all parties involved. You may
benefit from publicly available literature, including software (that is not from classmates), as long
as these sources are properly acknowledged in your submission.
Copying math or code from each other is clearly not allowed and will be considered academic
dishonesty. Discussing with the instructor and the teaching assistant to get clarification, or
to eliminate doubts, is acceptable.
By submitting a PDF file in response to this take-home assignment, you are declaring that the
contents of your submission and the associated code are your own work, except as noted in your
acknowledgement citations to resources.
Start early and use the office periods well. The office periods are on gcal and contain video
links for remote participation. You may call XXXXXXXXXX to reach Deniz if you have difficulty
using the video links for the office periods.
It is your responsibility to provide the necessary and sufficient evidence in the form of expla-
nations, math, visualizations, and numerical values to earn scores. Do not provide a minimal set
of what you interpret as precisely requested; you may be providing too little evidence of
your understanding. On the other hand, also please do not dump in unnecessarily large amounts
of things; you may be creating the perception that you do not know what you are doing. Be inten-
tional in what you present and how you present it. You are trying to convince the reader (grader)
that your mathematical design, code implementations, and generated results are legitimate and
correct/accurate.
Question 1 (30%)
The data generation script for this question is called exam4q1 generateData.m. Generate two-
dimensional x = [x1, x2]^T samples with this MATLAB script; specifically, 1000 samples for training
and 10000 samples for testing.
Train and test a single-hidden-layer MLP function approximator to estimate the value of X2
from the value of X1 by minimizing the mean-squared-error (MSE) on the training set. (The first
coordinate of each data vector will go into the MLP, and the output will try to approximate the
second coordinate of the data vector.)
Use a softplus (SmoothReLU) activation function as the nonlinearity for the perceptrons in
the hidden layer. Using 10-fold cross-validation, select the best number of perceptrons that
your training set can justify using. Leave the output layer of the MLP as a linear unit
(no nonlinearity). Once the best model structure is identified using cross-validation, train the
selected model with the entire training set. Apply the trained MLP to the test set. Estimate the test
performance.
You may use existing software packages for all aspects of this solution. Make sure to clearly
demonstrate that you are using the packages properly.
Hint: logistic(z) = 1/(1 + e^{-z}) and softplus(z) = ln(1 + e^{z}).
Note: The theoretical minimum-MSE estimator is the conditional expectation of X2 given X1,
which could be derived analytically from the joint pdf if it were known. In practice we do
not know the true pdf, so in this exercise we try to approximate the theoretically optimal solution
with a neural network model.
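For concreteness, here is a minimal sketch of the model-order selection loop described above; it is one possible approach, not the required implementation. It assumes the training data are stored in a 2-by-N matrix xTrain produced by the data generation script and uses fminunc (Optimization Toolbox) as a generic MSE minimizer; the helper names mlpMSE and mlpForward and the candidate range 1:10 are introduced here only for illustration.

K = 10;                          % number of cross-validation folds
N = size(xTrain,2);
foldId = mod(randperm(N),K) + 1; % random assignment of samples to folds
Plist = 1:10;                    % candidate numbers of hidden perceptrons (assumed range)
avgValMSE = zeros(size(Plist));
for iP = 1:length(Plist)
    P = Plist(iP); mseFold = zeros(1,K);
    for k = 1:K
        xT = xTrain(:,foldId~=k); xV = xTrain(:,foldId==k);
        theta0 = 0.1*randn(3*P+1,1);  % initial parameters [W1; b1; w2; b2]
        obj = @(theta) mlpMSE(theta,P,xT(1,:),xT(2,:));
        theta = fminunc(obj,theta0,optimoptions('fminunc','Display','off'));
        mseFold(k) = mlpMSE(theta,P,xV(1,:),xV(2,:)); % validation MSE for this fold
    end
    avgValMSE(iP) = mean(mseFold);   % average validation MSE for P perceptrons
end
[~,best] = min(avgValMSE); Pbest = Plist(best);

function mse = mlpMSE(theta,P,x1,x2)
% mean-squared error of a 1-input, P-hidden-unit softplus MLP with linear output
mse = mean((x2 - mlpForward(theta,P,x1)).^2);
end

function y = mlpForward(theta,P,x1)
W1 = theta(1:P); b1 = theta(P+1:2*P); w2 = theta(2*P+1:3*P); b2 = theta(end);
z = W1*x1 + b1*ones(1,length(x1));   % P x N hidden-layer pre-activations
a = log(1 + exp(z));                 % softplus(z) = ln(1 + e^z)
y = w2.'*a + b2;                     % linear output unit
end

Once Pbest is chosen, the same training routine (or an equivalent package-based one) is rerun on the full 1000-sample training set, and the resulting MLP's MSE on the 10000-sample test set serves as the test-performance estimate.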
Question 2 (35%)
For this question use the same multiring data generation script from the third assignment. Gen-
erate a two-class training set with 1000 samples and a testing set with 10000 samples. The class
priors should be equal. Train and evaluate a support vector machine classifier with a Gaussian
kernel (radial-basis function (RBF) kernel) on these datasets.
Specifically, use a spherically symmetric Gaussian/RBF kernel. Using 10-fold cross-validation,
select the best box constraint hyperparameter C and the Gaussian kernel width parameter σ (nota-
tion based on previously covered math in class).
Train a final SVM using the best combination of hyperparameters with the entire training set.
Classify the testing dataset samples with this trained SVM to assess performance.
You may use existing software packages for all aspects of this solution. Make sure to clearly
demonstrate that you are using the packages properly.
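As a sketch of one possible way to organize this search (assuming the Statistics and Machine Learning Toolbox; the variable names xTrain, lTrain, xTest, lTest with one sample per row, and the candidate grids, are illustrative assumptions): note that fitcsvm's KernelScale parameter is related to, but not defined identically to, the sigma in exp(-||u-v||^2/(2*sigma^2)), so check the convention of whatever package you use before reporting sigma.

Clist = 10.^(-1:3); sigmaList = 10.^(-1:1);      % candidate hyperparameter grids (assumed)
cvErr = zeros(length(Clist),length(sigmaList));
for i = 1:length(Clist)
    for j = 1:length(sigmaList)
        cvSVM = fitcsvm(xTrain,lTrain,'KernelFunction','gaussian', ...
            'BoxConstraint',Clist(i),'KernelScale',sigmaList(j), ...
            'KFold',10);                         % 10-fold cross-validated SVM
        cvErr(i,j) = kfoldLoss(cvSVM);           % average validation error rate
    end
end
[~,idx] = min(cvErr(:)); [iBest,jBest] = ind2sub(size(cvErr),idx);
finalSVM = fitcsvm(xTrain,lTrain,'KernelFunction','gaussian', ...
    'BoxConstraint',Clist(iBest),'KernelScale',sigmaList(jBest)); % retrain on all training data
testError = mean(predict(finalSVM,xTest) ~= lTest);               % empirical probability of error

In practice, a first coarse grid pass followed by a refined grid around the best cell tends to be more economical than a single fine grid.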
Question 3 (35%)
In this question, you will use GMM-based clustering to segment the color images 3096 color.jpg
and 42049 color.jpg from the Berkeley Image Segmentation Dataset. We will refer to these images
as the airplane and bird images, respectively.
As preprocessing, for each pixel, generate a 5-dimensional feature vector as follows: (1) ap-
pend row index, column index, red value, green value, blue value for each pixel into a raw feature
vector; (2) normalize each feature entry individually to the interval [0,1], so that all of the fea-
ture vectors representing every pixel in an image fit into the 5-dimensional unit-hypercube. All
segmentation algorithms should operate on these normalized feature vectors.
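A minimal sketch of this preprocessing, using only base MATLAB (the file name, written here as 3096_color.jpg, is an assumption and should match your local copy of the dataset):

img = double(imread('3096_color.jpg'));            % R x C x 3 image
[R,C,~] = size(img);
[cols,rows] = meshgrid(1:C,1:R);                   % pixel coordinates
features = [rows(:), cols(:), reshape(img,[],3)];  % N x 5 raw feature vectors (N = R*C)
% normalize each feature dimension individually to the interval [0,1]
features = (features - min(features)) ./ (max(features) - min(features));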
For each image do the following: (1) Using maximum likelihood parameter estimation, fit a
GMM with 2 components, and use this GMM to segment the image into two parts; (2) Using 10-fold
cross-validation, with maximum average validation-log-likelihood as the objective, identify the best
number of clusters, then fit a new GMM with this best number of components and use this GMM
to segment the image into as many parts as there are Gaussian components.
For GMM-based clustering, use the GMM components as class/cluster-conditional pdfs and
assign cluster labels using the MAP-classification rule.
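The sketch below continues from the features matrix above and shows one possible way to carry out both parts: it selects the number of components by 10-fold cross-validated average validation log-likelihood with fitgmdist (Statistics and Machine Learning Toolbox), then fits the final GMM on all pixels and assigns MAP cluster labels with cluster. The candidate range 1:10, the regularization value, and the replicate counts are assumptions; part (1) is the special case of fitting 2 components directly, without cross-validation.

K = 10; N = size(features,1);
foldId = mod(randperm(N),K) + 1;                    % random fold assignment
Mlist = 1:10; avgValLL = zeros(size(Mlist));        % candidate numbers of components (assumed)
for iM = 1:length(Mlist)
    ll = zeros(1,K);
    for k = 1:K
        gmmK = fitgmdist(features(foldId~=k,:),Mlist(iM), ...
            'RegularizationValue',1e-6,'Replicates',3);     % ML fit on training folds
        ll(k) = mean(log(pdf(gmmK,features(foldId==k,:)))); % avg validation log-likelihood
    end
    avgValLL(iM) = mean(ll);
end
[~,best] = max(avgValLL); Mbest = Mlist(best);
finalGMM = fitgmdist(features,Mbest,'RegularizationValue',1e-6,'Replicates',5);
labels = cluster(finalGMM,features);                % MAP assignment to GMM components
figure, imagesc(reshape(labels,R,C)), axis image,
title(sprintf('GMM segmentation with %d components',Mbest));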
MATLAB code for reference/ClusteringWithMeanShiftThenSpectral.m
function [Y,C,labels] = ClusteringWithMeanShiftThenSpectral(N,M)
% Sample call:
% [Y,C,labels] = ClusteringWithMeanShiftThenSpectral(100,4);
% This demo generates N random 2-dim samples from
% an order M mixture of Gaussians with arbitrary
% weights-means-covariances. Uses a fixed-width
% spherically symmetric Gaussian kernel
% based KDE to improve cluster spread using
% mean shift, then uses a basic spectral
% clustering method to assign cluster labels
% selecting number of clusters automatically
% based on eigenvalue sequence in latter.
% Generate iid samples from a random M-GMM
[Y,C] = randGMM(N,M); % Y data, C component labels
% Perform a fixed number of mean shift iterations
T = 6; % Number of mean-shift iterations to be performed
X = meanshift(Y,T);
% Perform basic spectral clustering on X
labels = spectralclustering(X);
nClusters = length(unique(labels));
figure(1), clf,
for k = 1:nClusters
plot(Y(1,find(labels==k)),Y(2,find(labels==k)),'.','Color',hsv2rgb([rand,1,1]));
title(strcat({'Number of Clusters = '},num2str(nClusters))),
axis equal, hold on, drawnow,
end,
%%%%%%
function labels = spectralclustering(Y)
[D,N] = size(Y);
covY = cov(Y');
sigma2 = (1/D)*trace(covY)*(4/((2*D+1)*N))^(2/(D+4));
pd = pdist2(Y',Y'); S = exp(-0.5*pd.^2/sigma2);
%Deg = diag(sum(S,2));%^(-1/2);
%Lsym = Deg^(-1/2)*(Deg-S)*Deg^(-1/2); %Graph Laplacian
%[V,D] = eig(eye(N)-Lsym); [d,ind]=sort(diag(D),'descend');
[V,D] = eig(S); [d,ind]=sort(diag(D),'descend');
D = diag(d); V = V(:,ind);
%figure(2), clf, stem(d,'r.'), hold on, stem(diff(d),'b.'); drawnow,
dGYY = sort(abs(eig(S)),'descend');
nClusters = min(find(cumsum(dGYY)>=0.99*N)); % the sum of evals for GYY will be N
% i.e., the smallest number of clusters whose eigenvalues sum to almost N
%[~,nClusters]=min(diff(d)); % crude method for deciding on number of clusters
X = V(:,1:nClusters)'; % map data to relevant subspace of eigenfunction injection
% the mapped data X will be clustered by angular separation
[~,labels] = max(abs(X),[],1); % crude final label assignment method
%figure(3), clf,
%if nClusters == 2
% plot(X(1,find(labels==1)),X(2,find(labels==1)),'r.'); hold on,
% plot(X(1,find(labels==2)),X(2,find(labels==2)),'g.');
% drawnow,
%elseif nClusters == 3
% plot3(X(1,find(labels==1)),X(2,find(labels==1)),X(3,find(labels==1)),'r.'); hold on,
% plot3(X(1,find(labels==2)),X(2,find(labels==2)),X(3,find(labels==2)),'g.');
% plot3(X(1,find(labels==3)),X(2,find(labels==3)),X(3,find(labels==3)),'b.');
% drawnow,
%end,
%%%%%%
function X = meanshift(Y,T)
[D,N] = size(Y); covY = cov(Y');
sigma2 = (1/D)*trace(covY)*(4/((2*D+1)*N))^(2/(D+4));
% sigma2*identity is the covariance used for KDE Gaussians
X = Y; % initialize mean shift iterations to data points
for t = 1:T % you can substitute this with a better stopping criterion
figure(1), clf, plot(Y(1,:),Y(2,:),'r.');drawnow,
axis equal, hold on, plot(X(1,:),X(2,:),'b.'),
title(strcat({'Iteration '},num2str(t),{' of '},num2str(T)));
%pause(3),
pd = pdist2(X',Y');
GXY = exp(-0.5*pd.^2/sigma2)/((2*pi)^(D/2)*sqrt(sigma2)^D);
W = inv(diag(sum(GXY,2)))*GXY;
X = Y*W';
end
figure(1), clf, plot(Y(1,:),Y(2,:),'r.');
axis equal, hold on, plot(X(1,:),X(2,:),'b.'),
title(strcat({'Iteration '},num2str(t),{' of '},num2str(T))),
%keyboard,
%%%%%%
function [Y,C] = randGMM(N,M)
% Generates N 2-dim samples
if 0 % random GMM parameters
temp = rand(1,M); a = temp/sum(temp); % probability for each Gaussian component
mu = 10*(rand(2,M)-0.5); % means for each Gaussian component
S = randn(2,2,M); % scale matrices for each Gaussian component
end
if 1 % fixed GMM parameters
a = ones(1,M)/M;
mu = [linspace(-M,M,M);zeros(1,M)];
tempS = 0.25*M/(M-1)*eye(2,2);
S = repmat(tempS,1,1,M);
end
Z = randn(2,N); % 0-mean I-scale Gaussian samples
Nc = zeros(1,M); temp = rand(1,N); ca = [0,cumsum(a)];
for m = 1:M,
Nc(1,m) = length(find(ca(m)<=temp & temp<ca(m+1))); % samples assigned to component m
end
cNc = [0,cumsum(Nc)]; C = zeros(1,N); Y = zeros(2,N);
for m = 1:M,
C(cNc(m)+1:cNc(m+1)) = m*ones(1,Nc(m));
Y(:,cNc(m)+1:cNc(m+1)) = S(:,:,m)*Z(:,cNc(m)+1:cNc(m+1))+mu(:,m)*ones(1,Nc(m));
end
MATLAB code for reference/EMforGMM.m
function EMforGMM(N)
% Generates N samples from a specified GMM,
% then uses EM algorithm to estimate the parameters
% of a GMM that has the same number of components
% as the true GMM that generates the samples.
close all,
delta = 1e-2; % tolerance for EM stopping criterion
regWeight = 1e-10; % regularization parameter for covariance estimates
% Generate samples from a 3-component GMM
alpha_true = [0.2,0.3,0.5];
mu_true = [ XXXXXXXXXX;0 0 0];
Sigma_true(:,:,1) = [3 1;1 20];
Sigma_true(:,:,2) =