
Question

3. Suppose we run an online news aggregation platform which observes a person's habits of reading online news articles in order to recommend other articles the user might want to read. Suppose that we have characterised each online article by whether it mentions a celebrity, whether it features sports, whether it features politics, and whether it features technology. Suppose we are given the examples below about whether a specific user reads various online articles. We want to use this data set to train a model that predicts which articles this person would like to read based on the mentioned features.

(a) Suppose you would like to use decision tree learning for this task. Apply the TDIDT algorithm to learn a decision tree. Stop extending a branch of the decision tree when all remaining training examples have the same target feature. Demonstrate how you compute the information gain for the features at each node. Draw the resulting decision tree. (7 marks)

(b) Suppose you would like to use naive Bayesian learning for this task. Apply the naive Bayesian learning algorithm to approximate P(Reads = true), P(X | Reads = true), and P(X | Reads = false).

Explanation / Answer

To calculate information gain, first compute the entropy of the class distribution at a node:

$\mathrm{Entropy} = -p \log_2 p - q \log_2 q$

where $p$ and $q$ are the proportions of positive and negative examples at the node. The information gain of a feature is the parent entropy minus the weighted average entropy of its child nodes:

$\mathrm{Gain} = \mathrm{Entropy(parent)} - \sum_i \frac{n_i}{n}\,\mathrm{Entropy(child}_i)$

Steps to calculate entropy for a split:

Entropy of the parent node (13 training examples, split 6/7 between the two target values):

$-\frac{6}{13}\log_2\frac{6}{13} - \frac{7}{13}\log_2\frac{7}{13} = 0.995727$
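A minimal Python sketch of this step, assuming the 6/7 class split from the working above (the helper name `entropy` is my own):

```python
import math

def entropy(pos: int, neg: int) -> float:
    """Entropy of a node holding `pos` positive and `neg` negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:  # treat 0 * log2(0) as 0
            p = count / total
            h -= p * math.log2(p)
    return h

# Parent node: 13 training examples, split 6 / 7 between the two classes.
print(entropy(6, 7))  # ~0.995727
```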

Entropy of each branch, the weighted child entropy, and the resulting information gain for each candidate feature:

Feature       Entropy (TRUE)   Entropy (FALSE)   Weighted entropy   Information gain
Celebrity     0.863121         0.918296          0.888586           0.107141
Sports        1.000000         0.985228          0.992046           0.003681
Politics      0.918296         0.863121          0.888586           0.107141
Technology    0.970951         1.000000          0.988827           0.006900

Celebrity and Politics tie for the highest information gain (0.107141), so TDIDT may choose either (say, Celebrity) as the root, then recurse on each branch in the same way, stopping when all remaining examples share the same target value.
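To double-check the arithmetic, a short snippet that recomputes each gain from the parent entropy and the weighted child entropies in the table above:

```python
# Information gain = parent entropy minus weighted child entropy.
parent_entropy = 0.995727

weighted_child_entropy = {
    "Celebrity":  0.888586,
    "Sports":     0.992046,
    "Politics":   0.888586,
    "Technology": 0.988827,
}

for feature, h in weighted_child_entropy.items():
    print(f"{feature}: gain = {parent_entropy - h:.6f}")
# Celebrity and Politics tie at ~0.107141, so either can be the root.
```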
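For part (b), naive Bayesian learning estimates each probability as a relative frequency in the training data. A minimal sketch, assuming the examples are supplied as hypothetical (feature-dict, label) pairs, since the question's data table is not reproduced here:

```python
from collections import Counter, defaultdict

def naive_bayes_fit(examples):
    """Estimate P(Reads) and P(X | Reads) by relative frequency.

    `examples` is a hypothetical list of (features, reads) pairs: `features`
    maps each feature name (Celebrity, Sports, Politics, Technology) to a
    bool, and `reads` is the boolean target. The 13 training rows from the
    question's data table would be supplied here.
    """
    n = len(examples)
    class_counts = Counter(reads for _, reads in examples)
    # Priors: P(Reads = true) and P(Reads = false).
    prior = {r: class_counts[r] / n for r in class_counts}

    # Conditionals: cond[(feature, value, reads)] = P(feature = value | Reads = reads).
    joint = defaultdict(int)
    for features, reads in examples:
        for feature, value in features.items():
            joint[(feature, value, reads)] += 1
    cond = {key: count / class_counts[key[2]] for key, count in joint.items()}
    return prior, cond
```

With the class counts used above (6 examples of one target value, 7 of the other), the priors come out as 6/13 and 7/13, and each conditional is a simple count ratio within its class.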