Question
Hello, please give me a complete explanation with full answers to the questions given below.
Sub: Data Mining
4.) Consider a document-term matrix, where tf_ij is the frequency of the ith word (term) in the jth document and m is the number of documents. Consider the variable transformation that is defined by
$$tf'_{ij} = tf_{ij} \cdot \log\frac{m}{df_i}$$
where df_i is the number of documents in which the ith term appears and is known as the document frequency of the term. This transformation is known as the inverse document frequency transformation.
(a) What is the effect of this transformation if a term occurs in one document? In every document?
(b) What might be the purpose of this transformation?
5.) A data analyst wants to use association analysis to analyze test results. The test consists of 1000 questions with four possible answers each.
(a) Can you convert this data into a form suitable for association analysis? If so how?
(b) In particular, what type of attributes would you have and how many of them are there?
6.) Discuss whether or not each of the following activities is a data mining task and why.
(a) Dividing the customers of a company according to their gender.
(b) Dividing the customers of a company according to their profitability.
(c) Computing the total sales of a company.
(d) Sorting a student database based on student identification numbers.
(e) Predicting the outcomes of tossing a (fair) pair of dice.
(f) Predicting the future stock price of a company using historical records.
(g) Monitoring the heart rate of a patient for abnormalities.
(h) Monitoring seismic waves for earthquake activities.
(i) Extracting the frequencies of a sound wave.
Explanation / Answer
This question has many sub-parts, so I have answered the longest one (question 6). Please post the remaining questions separately.
6)
a) No. This is a simple database query.
b) No. This is an accounting calculation, followed by the application of a threshold. However, predicting the profitability of a new customer would be data mining.
c) No. Again, this is simple accounting.
d) No. Again, this is a simple database query.
e) No. Since the dice are fair, this is a probability calculation. If the dice were not fair and we needed to estimate the probability of each outcome from data, this would be more like the problems considered by data mining. In this specific case, however, mathematicians solved the problem a long time ago, so we would not consider it data mining.
f) Yes. We would attempt to create a model that can predict the continuous value of the stock price. This is an example of the area of data mining known as predictive modeling. We could use regression for this modeling, although researchers in many fields have developed a wide variety of techniques for predicting time series.
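g) Yes. We would build a model of the normal behavior of the heart rate and raise an alarm when unusual behavior occurs. This is the area of data mining known as anomaly detection. It could also be treated as classification if we had examples of both normal and abnormal heart behavior.
h) Yes. We would build a model of the types of seismic wave behavior associated with earthquake activity and raise an alarm when one of these types is observed. This is an example of classification.
i) No. This is signal processing.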
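To illustrate part (e): the outcome probabilities for a fair pair of dice can be derived by plain enumeration, with no data and no learned model. The sketch below assumes two six-sided dice and that the quantity of interest is their sum; the function name `two_dice_sum_distribution` is only illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_sum_distribution(sides=6):
    """Return the exact probability of each sum for two fair dice."""
    # Every ordered pair (a, b) is equally likely because the dice are fair.
    outcomes = list(product(range(1, sides + 1), repeat=2))
    counts = Counter(a + b for a, b in outcomes)
    total = len(outcomes)
    return {s: Fraction(c, total) for s, c in sorted(counts.items())}


dist = two_dice_sum_distribution()
for total_pips, prob in dist.items():
    print(total_pips, prob)
```

Because the whole distribution follows from the fairness assumption, nothing has to be estimated from observed tosses, which is why this is a probability calculation rather than data mining.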
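To illustrate part (f): a minimal sketch of the regression idea, assuming the simplest possible model, a straight line fitted to price against day index by ordinary least squares. The prices are made-up toy numbers, not real market data, and `fit_line` is a hypothetical helper. Real time-series forecasting would usually need a richer model than this.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical closing prices for days 0..4 (toy values for illustration only).
days = [0, 1, 2, 3, 4]
prices = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(days, prices)
next_day_forecast = slope * 5 + intercept  # extrapolate to day 5
print(next_day_forecast)
```

Unlike the dice case, the model parameters here are learned from historical records, which is what makes the task predictive modeling.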