Consider a data set with four binary attributes X1, X2, X3 and X4. The attribute
Question
Consider a data set with four binary attributes X1, X2, X3 and X4. The attribute X4 takes exactly the same value as X3 for each record, i.e. X4 is equal to X3. In each of the following three scenarios, find whether the decision boundary learnt by the two models would be similar, otherwise find which of the two models would perform better. Provide a brief justification for each.
(i) We build two decision trees:
a. T1, that is learnt using all the four attributes
b. T2, that is learnt using only three attributes X1, X2, and X3.
(ii) We build two Naïve Bayes models:
a. B1 that is learnt using all the four attributes.
b. B2 that is learnt using the three attributes X1, X2, and X3.
(iii) We build two KNN models:
a. M1 that is learnt using all the four attributes.
b. M2 that is learnt using the three attributes X1, X2, and X3.
Explanation / Answer
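(i) Decision trees: T1 and T2 would learn essentially the same decision boundary and perform similarly.
A decision tree chooses the split attribute at each node greedily, using a measure such as information gain. Because X4 is an exact copy of X3, the two attributes partition the records identically and always have the same gain.
Once the tree has split on one of them, the other is constant within each branch and has zero gain, so it is never used further down. T1 may split on X4 where T2 splits on X3, but the resulting partitions and predictions are the same.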
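The redundancy of X4 for a decision tree can be checked with a small calculation. The sketch below is only illustrative: the toy data set (all eight combinations of X1, X2, X3 with the label y = X1 AND X3) and the helper names `entropy` and `info_gain` are assumptions, not part of the question.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def info_gain(rows, attr):
    """Information gain of splitting rows on the binary attribute at index attr.

    Each row is a tuple (x1, x2, x3, x4, y).
    """
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in (0, 1):
        subset = [r[-1] for r in rows if r[attr] == value]
        if subset:  # skip empty partitions
            remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


# Assumed toy data: X4 is a copy of X3, and y = X1 AND X3.
rows = [(x1, x2, x3, x3, x1 & x3) for x1, x2, x3 in product((0, 1), repeat=3)]

# At the root, X3 and X4 induce the same partition, so their gains match.
gain_x3 = info_gain(rows, 2)
gain_x4 = info_gain(rows, 3)

# After splitting on X3, X4 is constant inside each branch: zero gain.
gains_x4_after_split = [
    info_gain([r for r in rows if r[2] == v], 3) for v in (0, 1)
]
```

Under these assumptions `gain_x3` and `gain_x4` are equal, and X4 contributes nothing once X3 has been used, which is why the extra attribute does not change the tree's predictions.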
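(ii) Naive Bayes: the decision boundaries learnt by B1 and B2 would generally not be the same, and B2 would be expected to perform better.
Naive Bayes assumes the attributes are conditionally independent given the class and multiplies their per-attribute likelihoods. X3 and X4 are perfectly correlated, which violates this assumption. B1 multiplies in the likelihood of X3 twice, so the evidence from X3 is double counted and the posterior is skewed towards whatever X3 suggests. This can flip the predicted class for records where X1 and X2 point one way and X3 the other. B2 does not have this problem.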
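The double-counting effect in Naive Bayes can be illustrated with a single record. This is a minimal sketch with made-up numbers: the likelihood values, the equal class prior, and the function name `posterior_class1` are all hypothetical and chosen only so that X1 and X2 favour class 1 while X3 favours class 0.

```python
def posterior_class1(prior1, lik1, lik0):
    """Posterior P(y=1 | x) under the naive independence assumption.

    lik1[i] = P(x_i | y=1) and lik0[i] = P(x_i | y=0) for the observed values.
    """
    score1, score0 = prior1, 1.0 - prior1
    for p1, p0 in zip(lik1, lik0):
        score1 *= p1
        score0 *= p0
    return score1 / (score1 + score0)


# Hypothetical per-attribute likelihoods for one record (X1, X2, X3).
lik1 = [0.7, 0.6, 0.2]  # P(x_i | y=1)
lik0 = [0.3, 0.4, 0.6]  # P(x_i | y=0)

# B2: uses X1, X2, X3 only.
p_b2 = posterior_class1(0.5, lik1, lik0)

# B1: X4 is a copy of X3, so its likelihood terms are those of X3 again.
p_b1 = posterior_class1(0.5, lik1 + [lik1[2]], lik0 + [lik0[2]])

pred_b2 = int(p_b2 > 0.5)
pred_b1 = int(p_b1 > 0.5)
```

With these assumed numbers the three-attribute model puts the record slightly above 0.5 for class 1, while the four-attribute model falls well below 0.5, because the X3 term enters the product twice.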
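(iii) KNN: K-nearest neighbours is a supervised, instance-based classifier, not a clustering method. It labels a record by a vote among the K training records closest to it under a distance measure.
In M1 the copied attribute X4 enters the distance alongside X3, so a mismatch on X3 is counted twice while mismatches on X1 and X2 are counted once. X3 therefore gets extra weight in deciding which records are nearest.
In M2 all three attributes contribute equally to the distance.
So M1 and M2 need not choose the same neighbours, and their decision boundaries can differ. M2 is generally the safer choice because it does not overweight X3. M1 would only do better if X3 happens to be more informative than X1 and X2.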
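How the copied attribute changes KNN distances can be seen on a tiny example. The sketch below assumes Hamming distance, K = 1, and two made-up training records. It also assumes ties go to the earliest training record, which is an arbitrary rule chosen for illustration.

```python
def hamming(a, b):
    """Number of positions at which two binary tuples differ."""
    return sum(u != v for u, v in zip(a, b))


def with_x4(x):
    """Append X4 as a copy of X3 to a record (x1, x2, x3)."""
    return x + (x[2],)


def nearest_label(train, query):
    """1-NN prediction; min() keeps the first minimal item, so ties go to
    the earliest training record (an assumed tie-breaking rule)."""
    return min(train, key=lambda rec: hamming(rec[0], query))[1]


# Hypothetical training records ((x1, x2, x3), label) and a query.
train3 = [((1, 0, 1), "B"), ((1, 1, 0), "A")]
query3 = (0, 0, 0)

# The same data as M1 sees it, with X4 = X3 appended.
train4 = [(with_x4(x), y) for x, y in train3]
query4 = with_x4(query3)

d3 = [hamming(x, query3) for x, _ in train3]  # distances seen by M2
d4 = [hamming(x, query4) for x, _ in train4]  # distances seen by M1

pred_m2 = nearest_label(train3, query3)
pred_m1 = nearest_label(train4, query4)
```

Here the two training records are equidistant from the query when only X1, X2, X3 are used, but the record that disagrees on X3 moves one step further away once X4 is added, so the nearest neighbour (and the predicted label) changes.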