Clusters of documents can be summarized by finding the top terms (words) for the
Question
Clusters of documents can be summarized by finding the top terms (words) for the documents in the cluster, e.g., by taking the most frequent k terms, where k is constant, say 10, or by taking all terms that occur more frequently than a specified threshold. Suppose that K-means is used to find clusters of both documents and words for a document data set.
How might a set of term clusters defined by the terms in a document cluster differ from the word clusters found by clustering the terms with K-means?
How could term clustering be used to define clusters of documents?
Cite the sources you use to make your response.
Explanation / Answer
“How a set of term clusters defined by the terms in a document cluster differs from the word clusters found by clustering the terms with K-means”
The word "cluster" can also denote a group of computers that work together to perform a similar function. Unlike grid computing, a computer cluster is controlled by a single software program that manages all the computers, or "nodes", within the cluster. The nodes work together to complete a single task.
Document clustering, by contrast, is the application of cluster analysis to textual documents. It has applications in automatic document organization, topic extraction, and fast information retrieval or filtering. Document clustering involves the use of descriptors and descriptor extraction. Descriptors are sets of words that describe the contents of a cluster. Document clustering is generally considered a centralized process. Examples include web document clustering for search users.
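The descriptors mentioned above can be extracted in the two ways the question describes: the k most frequent terms, or all terms above a frequency threshold. The following is a minimal sketch, assuming whitespace tokenization and raw term counts; the function names `top_k_terms` and `terms_above_threshold` and the toy cluster are hypothetical, chosen only for illustration.

```python
from collections import Counter


def term_counts(docs):
    """Count how often each lowercased, whitespace-separated term occurs in a cluster."""
    return Counter(term for doc in docs for term in doc.lower().split())


def top_k_terms(docs, k=10):
    """Return the k most frequent terms of one document cluster."""
    counts = term_counts(docs)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def terms_above_threshold(docs, threshold):
    """Return every term that occurs more than `threshold` times in the cluster."""
    counts = term_counts(docs)
    return sorted(term for term, n in counts.items() if n > threshold)


# Toy document cluster (illustrative data, not from the question).
cluster = ["apple banana apple", "banana apple cherry", "apple date"]
print(top_k_terms(cluster, k=2))
print(terms_above_threshold(cluster, threshold=1))
```

Note that descriptor sets built this way for different document clusters may share terms, whereas K-means applied to the terms assigns each term to exactly one cluster.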
One approach first finds word clusters that capture most of the mutual information about the set of documents, and then finds document clusters that preserve the information about the word clusters (Slonim, 2000).
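The two-stage idea (cluster the words first, then cluster the documents in terms of those word clusters) can be sketched with plain K-means, since that is the algorithm the question names. This is an illustrative sketch only and does not reproduce the procedure of the cited work. The helper names, the farthest-first initialization, the raw-count vectors, and the toy documents are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=10):
    """Lloyd's K-means with a deterministic farthest-first initialization (an assumption)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point farthest from all centroids chosen so far.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_docs_via_terms(docs, n_term_clusters, n_doc_clusters):
    """Cluster terms by their per-document counts, then cluster documents by term-cluster counts."""
    tokenized = [d.lower().split() for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    # Stage 1: each term is a vector of its counts across the documents.
    term_vectors = [[toks.count(t) for toks in tokenized] for t in vocab]
    term_label = dict(zip(vocab, kmeans(term_vectors, n_term_clusters)))
    # Stage 2: each document is a vector of counts per term cluster.
    doc_vectors = []
    for toks in tokenized:
        vec = [0] * n_term_clusters
        for t in toks:
            vec[term_label[t]] += 1
        doc_vectors.append(vec)
    return term_label, kmeans(doc_vectors, n_doc_clusters)


# Toy corpus (illustrative data): two documents about pets, two about finance.
docs = ["cat dog cat", "dog cat pet", "stock market stock", "market stock trade"]
term_label, doc_labels = cluster_docs_via_terms(docs, 2, 2)
print(term_label)
print(doc_labels)
```

Representing each document by its counts over a few term clusters, rather than over the full vocabulary, gives a much shorter and less sparse vector, which is what lets the term clusters define the document clusters.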
Clustering in the computing sense is applied for parallel processing, load balancing, and fault tolerance; this is a different sense of the word from the clustering of terms used to define clusters of documents.
Clustering is a popular strategy for implementing parallel processing applications because it lets companies leverage the investments already made in PCs and workstations. In addition, it is relatively easy to add new CPUs simply by adding a new PC to the network.