Decision tree splitting rules: entropy or Gini
Question
This is for a Python data mining class.
Decision tree splitting rules: entropy or Gini. Using the Gini impurity measure and entropy, assess the two possible splits below and identify the split that gives the greater reduction in impurity (the higher information gain). The data are 15 cases from the Titanic dataset, where D stands for 'Died' and S for 'Survived'.

Split 1: Age > 42
Split 2: Ticket > 18

[Per-case outcome listing not recoverable from the source; the 15 outcomes total 6 S and 9 D, partitioned by each split as in the answer below.]

Explanation / Answer
Part a)
Gini index for a given node t: Gini(t) = 1 − Σ_j [p(j | t)]², where p(j | t) is the relative frequency of class j at node t.
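This formula can be sketched as a small Python function (a minimal sketch, assuming the node is summarized by a list of per-class counts such as [survived, died]):

```python
def gini(counts):
    """Gini index for a node: 1 - sum_j p(j|t)^2, given per-class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # empty node: treat as pure
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Root node of the 15-case Titanic sample: 6 survived, 9 died
print(round(gini([6, 9]), 2))  # 0.48
```

A pure node (all cases in one class) gives a Gini index of 0, and a 50/50 two-class node gives the maximum of 0.5.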
Part b)
Entropy for a given node t: Entropy(t) = − Σ_j p(j | t) log₂ p(j | t), with the convention 0 · log₂ 0 = 0.
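The entropy measure can be sketched the same way (again assuming a list of per-class counts, and skipping zero-count classes to respect the 0 · log₂ 0 = 0 convention):

```python
import math

def entropy(counts):
    """Entropy of a node: -sum_j p(j|t) * log2 p(j|t), given per-class counts."""
    total = sum(counts)
    result = 0.0
    for c in counts:
        if c > 0:  # 0 * log2(0) is taken as 0
            p = c / total
            result -= p * math.log2(p)
    return result

# Root node: 6 survived, 9 died
print(round(entropy([6, 9]), 3))  # 0.971
```

For two classes, entropy ranges from 0 (pure node) to 1 bit (a 50/50 split), so the root's 0.971 is close to maximally impure.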
Split 1:

             #Survived   #Dead   Gini
Root         6           9       1 − (6/15)² − (9/15)² = 0.48
Left node    3           7       1 − (3/10)² − (7/10)² = 0.42
Right node   3           2       1 − (3/5)² − (2/5)² = 0.48

Weighted Gini of Split 1 = (10/15)(0.42) + (5/15)(0.48) = 0.44, so the reduction in impurity for Split 1 is 0.48 − 0.44 = 0.04.
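The Split 1 numbers can be reproduced with a short script (node counts taken from the table; the [survived, died] list convention is an assumption of this sketch):

```python
def gini(counts):
    """Gini index from per-class counts, e.g. [survived, died]."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

# Counts for Split 1 (Age > 42): root and its two child nodes
root, left, right = [6, 9], [3, 7], [3, 2]
n = sum(root)

# Weighted Gini of the split: child impurities weighted by node size
weighted = sum(gini(node) * sum(node) / n for node in (left, right))
gain = gini(root) - weighted

print(round(gini(root), 2))  # 0.48
print(round(weighted, 2))    # 0.44
print(round(gain, 2))        # 0.04
```

The same loop with an entropy function in place of `gini` would yield the information gain, which is how the two splits would be compared under the entropy criterion.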