Clustering (a) [6] Use one sentence to distinguish each of the following pairs o
ID: 3770449 • Letter: C
Question
Clustering
(a) [6] Use one sentence to distinguish each of the following pairs of methods:
(1) k-means vs. KNN
Answer:
(2) STING vs. CLIQUE
Answer:
(3) BIRCH vs. CHAMELEON.
Answer:
(b) [6] Outline the best clustering method for the following tasks (and briefly explain why you make such a design):
(i) finding oil spills along a coast line
Answer:
(ii) clustering employees in a company based on their salaries and years of working experience
Answer:
(c) [6] Why is subspace clustering a good choice for high-dimensional data? Outline one efficient and effective subspace clustering method that can cluster a very high dimensional (e.g., thousands of dimensions) data set.
Answer:
(d) [6] Cross-validation can be useful in both classification and clustering.
What are the differences in these two cases?
Answer:
Explanation / Answer
1. k-means vs. KNN:
k-means is unsupervised because the points have no external class labels.
KNN is supervised because it classifies a point based on the known classes of other points.
k-means is a clustering algorithm; KNN is a classification algorithm.
k-means partitions a set of points into k clusters so that the points in each cluster tend to be near each other.
KNN determines the class of a point by combining the classes of its K nearest points.
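The contrast can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy points, the labels "low" and "high", the explicit starting centroids, and the function names `kmeans` and `knn_classify` are all made up for the example and do not come from the question.

```python
import math
from collections import Counter

def kmeans(points, init, iters=10):
    # Unsupervised: only the points are used, never any labels.
    # `init` holds hypothetical starting centroids chosen for this example.
    centroids = list(init)
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to its nearest centroid (math.dist needs Python 3.8+).
            j = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[j].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            tuple(sum(v) / len(c) for v in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

def knn_classify(train, query, k=3):
    # Supervised: `train` is a list of (point, label) pairs with known labels.
    nearest = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    # Majority vote among the k nearest labeled points.
    return Counter(label for _, label in nearest).most_common(1)[0][0]

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# k-means discovers two groups without being told what they are.
centroids, clusters = kmeans(points, init=[points[0], points[3]])

# KNN needs labels up front and predicts the label of a new point.
labeled = [(p, "low") for p in points[:3]] + [(p, "high") for p in points[3:]]
prediction = knn_classify(labeled, (4.5, 4.7), k=3)
```

Note that `kmeans` returns groups with no names attached, while `knn_classify` can only return a label that already exists in the training data.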
2. STING vs. CLIQUE
STING explores the statistical information stored in grid cells.
CLIQUE is a grid- and density-based approach to subspace clustering in high-dimensional data spaces.
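To make the idea of "statistical information stored in grid cells" concrete, here is a minimal single-level sketch. It is a simplification: STING itself uses a hierarchy of grid levels, and the cell size, the chosen statistics, and the names `grid_statistics` and `dense_cells` are assumptions made for illustration.

```python
import math
from collections import defaultdict

def grid_statistics(points, cell_size):
    # Bin 2-D points into square cells and keep summary statistics per cell.
    cells = defaultdict(list)
    for x, y in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y))
    stats = {}
    for key, members in cells.items():
        xs = [p[0] for p in members]
        stats[key] = {
            "count": len(members),
            "mean_x": sum(xs) / len(xs),
            "min_x": min(xs),
            "max_x": max(xs),
        }
    return stats

def dense_cells(stats, min_count):
    # Answer a density query from the stored statistics alone,
    # without going back to the raw points.
    return sorted(k for k, s in stats.items() if s["count"] >= min_count)

points = [(0.1, 0.2), (0.4, 0.3), (0.3, 0.8), (2.5, 2.5), (2.6, 2.7), (4.9, 0.1)]
stats = grid_statistics(points, cell_size=1.0)
dense = dense_cells(stats, min_count=2)
```

Once the statistics are computed, queries such as `dense_cells` touch only the cells, which is what makes the grid approach fast.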
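3. BIRCH vs. CHAMELEON
BIRCH scales well to large datasets because it summarizes the data incrementally in a CF-tree, but it favors roughly spherical clusters.
CHAMELEON merges clusters on a k-nearest-neighbor graph and finds high-quality clusters of arbitrary shape, but its higher cost makes it better suited to smaller datasets.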
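Part of why BIRCH copes with large datasets is that it keeps a compact summary of each subcluster, the clustering feature (N, LS, SS), instead of the points themselves. The sketch below shows only that summary and how two summaries merge; it does not build the tree, and the function names are hypothetical.

```python
import math

def cf_of(point):
    # Clustering feature of a single point: (count, linear sum, sum of squares).
    return (1, tuple(point), sum(v * v for v in point))

def cf_merge(a, b):
    # Clustering features are additive, so merging needs no access to raw points.
    return (a[0] + b[0], tuple(x + y for x, y in zip(a[1], b[1])), a[2] + b[2])

def cf_centroid(cf):
    n, ls, _ = cf
    return tuple(v / n for v in ls)

def cf_radius(cf):
    # Root-mean-square distance of the summarized points from their centroid.
    n, ls, ss = cf
    centroid_sq = sum((v / n) ** 2 for v in ls)
    return math.sqrt(max(ss / n - centroid_sq, 0.0))

merged = cf_merge(cf_of((0.0, 0.0)), cf_of((2.0, 0.0)))
center = cf_centroid(merged)
radius = cf_radius(merged)
```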
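6b. Outline the best clustering method for the following tasks.
(i) Finding oil spills along a coast line is closer to a classification-type task; if it is treated as clustering, a density-based method such as DBSCAN is a natural fit, because spills form dense regions of arbitrary shape.
(ii) Clustering employees by salary and years of working experience is a clustering-type task; with two numeric attributes, a partitioning method such as k-means on normalized values is a reasonable choice.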
6c. Why is subspace clustering a good choice for high-dimensional data?
Outline one efficient and effective subspace clustering method that can
cluster a very high dimensional (e.g., thousands of dimensions) data set.
In high-dimensional data many dimensions are irrelevant, and they can mask existing clusters in noisy data, so subspace clustering searches for clusters in different subspaces of the dataset.
CLIQUE can be used to find such clusters efficiently in large, high-dimensional data sets.
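The bottom-up idea behind CLIQUE can be sketched as follows: find dense intervals in each single dimension, then test only combinations of those dense intervals in two dimensions. This is a rough illustration, not the full algorithm; the bin count, the density threshold, the toy data, and all function names are assumptions.

```python
from collections import Counter
from itertools import combinations

def bin_of(value, lo, width, n_bins):
    # Index of the equal-width interval that contains `value`.
    return min(int((value - lo) / width), n_bins - 1)

def dense_units_1d(points, lo, hi, n_bins, min_count):
    # A unit is (dimension, bin); it is dense if enough points fall into it.
    width = (hi - lo) / n_bins
    counts = Counter()
    for p in points:
        for d, v in enumerate(p):
            counts[(d, bin_of(v, lo, width, n_bins))] += 1
    return {u for u, c in counts.items() if c >= min_count}

def dense_units_2d(points, dense_1d, lo, hi, n_bins, min_count):
    # Only pairs of dense 1-D units are candidates, which prunes the search.
    width = (hi - lo) / n_bins
    counts = Counter()
    for p in points:
        units = [(d, bin_of(v, lo, width, n_bins)) for d, v in enumerate(p)]
        dense_here = [u for u in units if u in dense_1d]
        for pair in combinations(dense_here, 2):
            counts[pair] += 1
    return {u for u, c in counts.items() if c >= min_count}

# Dimensions 0 and 1 hold a cluster; dimension 2 is spread out (irrelevant).
points = [
    (0.10, 0.12, 0.05), (0.12, 0.15, 0.35), (0.15, 0.11, 0.65),
    (0.18, 0.18, 0.95), (0.85, 0.55, 0.25), (0.45, 0.85, 0.75),
]
d1 = dense_units_1d(points, lo=0.0, hi=1.0, n_bins=5, min_count=3)
d2 = dense_units_2d(points, d1, lo=0.0, hi=1.0, n_bins=5, min_count=3)
```

In this toy data the cluster shows up in the subspace of dimensions 0 and 1, while no interval of dimension 2 is dense, so that dimension is never combined with the others.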
6d. Cross-validation can be useful in both classification and clustering.
What are the differences in these two cases?
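Difference between classification and clustering:
Clustering groups the data into some number of groups based on some variables; the groups are not known in advance.
Classification starts from known groups, and the variables are related to those groups.
So in classification, cross-validation measures how well held-out points are assigned to their known classes; in clustering there are no known classes to compare against, so it is used instead to check how well the clusters found on one part of the data fit the rest, for example when choosing the number of clusters.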
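One way to see how cross-validation differs between the two settings is to run the same fold split under both. In the sketch below, classification is scored by accuracy against held-out labels, while clustering has no labels and is scored by how close held-out points lie to centroids fitted on the remaining folds. The 1-nearest-neighbor classifier, the plain k-means, the toy data, and the function names are all assumptions chosen for illustration.

```python
import math

def folds(n, k):
    # Round-robin split of the indices 0..n-1 into k folds.
    return [list(range(i, n, k)) for i in range(k)]

def fit_centroids(points, init, iters=10):
    # Plain k-means from explicit (hypothetical) starting centroids.
    centroids = list(init)
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            groups[j].append(p)
        centroids = [
            tuple(sum(v) / len(g) for v in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def cv_classification(points, labels, k):
    # Score: fraction of held-out points whose predicted label matches the true label.
    correct = 0
    for test_idx in folds(len(points), k):
        train = [(points[i], labels[i]) for i in range(len(points)) if i not in test_idx]
        for i in test_idx:
            _, predicted = min(train, key=lambda pl: math.dist(points[i], pl[0]))
            correct += predicted == labels[i]
    return correct / len(points)

def cv_clustering(points, init, k):
    # Score: mean squared distance from held-out points to the nearest fitted centroid.
    total = 0.0
    for test_idx in folds(len(points), k):
        train = [points[i] for i in range(len(points)) if i not in test_idx]
        centroids = fit_centroids(train, init)
        total += sum(min(math.dist(points[i], c) for c in centroids) ** 2 for i in test_idx)
    return total / len(points)

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]

accuracy = cv_classification(points, labels, k=3)
error_two = cv_clustering(points, init=[(0.0, 0.0), (6.0, 6.0)], k=3)
error_one = cv_clustering(points, init=[(0.0, 0.0)], k=3)
```

Comparing `error_two` with `error_one` is the clustering analogue of comparing accuracies: a lower held-out error suggests that two clusters describe this toy data better than one.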