
Note:
1. Report the values (output) in the text box when submitting the assignment.
2. Attach one (consolidated) code/notebook which is used to generate the values.
Q1. In this problem we will verify that ensembles indeed perform better than individual classifiers.
Consider a set of K binary classifiers. For simplicity, we simulate each classifier's accuracy with a probability value p. To form the ensemble, we repeat the following N times:
· Generate K random binary values in {0, 1}, with probability p of drawing a 1. (Look at the function numpy.random.choice.)
· Take a majority vote to predict the class.
The accuracy is then simply the percentage of times (out of N) that we predict 1.
REASONING: Essentially, we assume that an individual classifier is correct with probability p. So, generating K random binary values amounts to randomly guessing whether each of the K classifiers is correct or wrong. If the majority vote is 1, the ensemble has predicted the correct class; otherwise it has predicted the wrong class. Hence the accuracy is the percentage of times (out of N) the majority vote is 1.
Report the accuracy when we substitute the following values:
• p=0.49, K=1000, N=1000
• p=0.51, K=1000, N=1000
• p=0.51, K=10, N=1000
• p=0.51, K=1000, N=10
• p=0.51, K=100, N=10000
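The simulation described above can be sketched as follows. This is a minimal illustration, not a prescribed solution: the function name `ensemble_accuracy` and the fixed seed are my own choices, added so the run is reproducible.

```python
import numpy as np


def ensemble_accuracy(p, K, N, seed=0):
    """Simulate N majority votes over K classifiers, each correct with probability p.

    Returns the fraction of the N trials in which the majority vote is 1,
    i.e. the simulated ensemble accuracy.
    """
    rng = np.random.default_rng(seed)
    correct = 0
    for _ in range(N):
        # K independent "classifier outcomes": 1 = correct, 0 = wrong.
        votes = rng.choice([0, 1], size=K, p=[1 - p, p])
        # Strict majority of 1s means the ensemble predicted the correct class.
        if votes.sum() > K / 2:
            correct += 1
    return correct / N


if __name__ == "__main__":
    for p, K, N in [(0.49, 1000, 1000), (0.51, 1000, 1000),
                    (0.51, 10, 1000), (0.51, 1000, 10), (0.51, 100, 10000)]:
        print(f"p={p}, K={K}, N={N}: accuracy = {ensemble_accuracy(p, K, N)}")
```

With a large K, the ensemble accuracy should land well above p when p > 0.5 and well below it when p < 0.5, which is the effect the question asks you to verify.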
Q2. In this problem we shall look at a decision tree and fine-tune it.
Do the following:
· Use the dataset from sklearn.datasets.make_moons with the parameter values random_state=42, n_samples=1000, noise=0.4.
· Use the train_test_split function with test_size=0.2 and random_state=42 to split the dataset into train and test sets.
We only change two variables for this problem: max_leaf_nodes and min_samples_split. If nothing is specified, take the default values. Report the following values:
· Accuracy when max_leaf_nodes = 2
· Accuracy when max_leaf_nodes = 4
· Accuracy when min_samples_split = 30
· Using GridSearchCV, identify the best hyperparameters among the combinations max_leaf_nodes in {2, 3, ..., 99} and min_samples_split in {2, 3, 4}. Report the best parameter combination and the accuracy corresponding to it.
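The steps above can be sketched as follows. Note one assumption: the question does not specify a random_state for the tree itself, so I pass random_state=42 there purely for reproducibility; the GridSearchCV scoring and CV folds are sklearn defaults.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Dataset and split exactly as specified in the question.
X, y = make_moons(n_samples=1000, noise=0.4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Individual settings, defaults everywhere else.
for params in ({"max_leaf_nodes": 2}, {"max_leaf_nodes": 4},
               {"min_samples_split": 30}):
    tree = DecisionTreeClassifier(random_state=42, **params)
    tree.fit(X_train, y_train)
    print(params, "test accuracy:", tree.score(X_test, y_test))

# Grid search over the requested combinations.
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_leaf_nodes": list(range(2, 100)),
                "min_samples_split": [2, 3, 4]},
)
grid.fit(X_train, y_train)
print("best params:", grid.best_params_)
print("test accuracy of best estimator:",
      grid.best_estimator_.score(X_test, y_test))
```

GridSearchCV refits the best estimator on the full training set by default, so scoring `grid.best_estimator_` on the held-out test set gives the accuracy the question asks for.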
Q3. Use LinearSVC to classify the following data points:
    SNO   X1   X2   Y
     1     3    4   0
     2     2    2   0
     3     4    4   0
     4     1    4   0
     5     2    1   1
     6     4    3   1
     7     4    1   1
Report the following:
• Value of .coef_
• Value of .intercept_
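Fitting the table above with LinearSVC might look like the sketch below. The random_state and the raised max_iter are assumptions on my part (the question fixes neither); they only affect reproducibility and convergence, not the task itself.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Data points (X1, X2) and labels Y, copied from the table above.
X = np.array([[3, 4], [2, 2], [4, 4], [1, 4],
              [2, 1], [4, 3], [4, 1]])
y = np.array([0, 0, 0, 0, 1, 1, 1])

# random_state and max_iter are not specified in the question; chosen here
# for reproducibility and to avoid convergence warnings on such a tiny set.
svc = LinearSVC(random_state=42, max_iter=10000)
svc.fit(X, y)

print("coef_:", svc.coef_)          # weight vector of the separating line
print("intercept_:", svc.intercept_)  # bias term
```

`coef_` has shape (1, 2) for binary classification with two features, and `intercept_` has shape (1,); those are the two values to report.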

Solution

Mohd answered on Nov 17 2022