Question

This is a data science question: below is a decision tree I made for classifying tweets based on relevance to the stock market.

I am supposed to describe a few rules from the decision tree and I'm not really sure what rules are. I'm assuming a rule would be like if the string doesn't have the word "stock" in it, then you move to the right. If it does then you move to the left. Can you give some examples of rules from the tree? Thanks!

Also, can someone please explain what the "_____ <= 0.5" means at the top of each node? Ex: trading <= 0.5

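[Figure: decision tree, top nodes as recovered from the image]
Root: stock <= 0.5, entropy = 0.982, samples = 584, value = [246, 338]
True branch: trading <= 0.5, entropy = 0.969, samples = 346, value = [209, 137]
False branch: afinlitfuture <= 0.5, entropy = 0.623, samples = 238, value = [37, 201]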

Explanation / Answer

So this is a kind of text mining, where you want to classify the tweets based on stock market relevance.

In a decision tree, each node picks one attribute and uses it to split the samples into two or more parts. The attribute is usually chosen with either the Gini index or information gain, and information gain is computed from entropy. Since every node in your tree reports an entropy value, we can say the tree was built with the information gain method.
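As a rough check on the entropy numbers, here is a minimal sketch that recomputes them from the class counts in each node. It assumes the root node's value reads [246, 338] and that its two children are the nodes with 346 and 238 samples; the function names are made up for illustration.

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the sample-weighted entropy of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

# Class counts read from the tree (assumed reading of the figure)
root = [246, 338]    # 584 samples
left = [209, 137]    # 346 samples
right = [37, 201]    # 238 samples

print(round(entropy(root), 3))   # 0.982, as shown in the root node
print(round(entropy(left), 3))   # 0.969
print(round(entropy(right), 3))  # 0.623
print(information_gain(root, [left, right]))  # entropy reduction from the first split
```

The split with the largest information gain is the one the tree prefers at each node.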

In text mining, there is a well-known measure called term frequency (TF): the number of times a word occurs in a document, often divided by the total number of words in that document.

Consider a document containing 100 words in which the word apple appears 3 times. The term frequency (TF) for apple is then 3 / 100 = 0.03.
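The same arithmetic as a small sketch, assuming words are separated by whitespace and case is ignored (the helper name and the filler document are hypothetical):

```python
def term_frequency(word, document):
    """Occurrences of `word` divided by the total number of words."""
    words = document.lower().split()
    if not words:
        return 0.0
    return words.count(word.lower()) / len(words)

# A made-up 100-word document in which "apple" appears 3 times
doc = " ".join(["apple"] * 3 + ["filler"] * 97)
print(term_frequency("apple", doc))  # 0.03
```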

In the above tree, your attributes are words, and the decision at each node is made on the basis of a word's term frequency.

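So a statement like stock <= 0.5 is the condition that divides the samples at that node into two parts: one where the condition is true (left branch) and one where it is false (right branch). This is done at every node. If the feature is a raw word count (0, 1, 2, ...), then stock <= 0.5 is true exactly when the word does not appear in the tweet, so 0.5 is just the midpoint between "absent" and "appears at least once". A rule is then the chain of conditions along one path, for example: if stock <= 0.5 is true and trading <= 0.5 is false, the tweet does not contain "stock" but does contain "trading".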
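To make the idea of a rule concrete, here is a minimal sketch that walks a tweet through the first two conditions of the tree. It assumes the features are raw word counts, which the post does not confirm, and it uses a naive whitespace tokenizer; all function names are hypothetical.

```python
from collections import Counter

def word_counts(tweet):
    """Naive bag of words: lowercase, split on whitespace."""
    return Counter(tweet.lower().split())

def goes_left(counts, word, threshold=0.5):
    """True when the node condition `word <= threshold` holds."""
    return counts[word] <= threshold

def describe_path(tweet):
    """Follow the top two levels of the tree and report the rule that fired."""
    counts = word_counts(tweet)
    if goes_left(counts, "stock"):        # stock <= 0.5 is True: go left
        if goes_left(counts, "trading"):  # trading <= 0.5 is True: go left
            return "stock absent, trading absent"
        return "stock absent, trading present"
    return "stock present"                # stock <= 0.5 is False: go right

print(describe_path("the stock market is up today"))  # stock present
print(describe_path("day trading tips"))              # stock absent, trading present
print(describe_path("nice weather today"))            # stock absent, trading absent
```

Each returned string corresponds to one rule: the conjunction of the conditions met on the way down from the root.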
