Question

Clustering

(a) [6] Use one sentence to distinguish each of the following pairs of methods:

(1) k-means vs. KNN

Answer:

(2) STING vs. CLIQUE

Answer:

(3) BIRCH vs. CHAMELEON.

Answer:   

(b) [6] Outline the best clustering method for the following tasks (and briefly reason on why you make such a design):

(i) finding oil spills along a coast line

Answer:

(ii) clustering employees in a company based on their salaries and years of working experience

Answer:

(c) [6] Why is subspace clustering a good choice for high-dimensional data? Outline one efficient and effective subspace clustering method that can cluster a very high-dimensional (e.g., thousands of dimensions) data set.

Answer:

(d) [6] Cross-validation can be useful in both classification and clustering.

What are the differences in these two cases?

Answer:

Explanation / Answer

1. k-means vs. KNN:

k-means is an unsupervised clustering algorithm: the points carry no external class labels.
KNN is a supervised classification algorithm: it classifies a point based on the known classes of other points.
k-means partitions a set of points into k clusters so that the points in each cluster lie close to one another (each point is assigned to its nearest cluster centroid).
KNN determines the class of a point by combining the classes of its k nearest labeled points, typically by majority vote.
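
The contrast can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the sample points, the labels "low" and "high", and the naive choice of the first k points as initial centroids are all assumptions made for the example.

```python
import math
from collections import Counter


def kmeans(points, k, iters=20):
    """Unsupervised: group unlabeled points into k clusters (Lloyd's algorithm)."""
    centroids = list(points[:k])  # naive deterministic start (assumption for the sketch)
    clusters = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters


def knn_predict(labeled, query, k=3):
    """Supervised: majority vote among the k nearest labeled points."""
    nearest = sorted(labeled, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical data: two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]

# k-means sees only the coordinates.
centroids, clusters = kmeans(points, k=2)

# KNN needs a label for every training point.
labeled = [(p, "low") for p in points[:3]] + [(p, "high") for p in points[3:]]
prediction = knn_predict(labeled, (7.5, 7.5), k=3)
```

Note that `kmeans` never receives labels, while `knn_predict` cannot run without them; that is the whole distinction.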

2. STING vs. CLIQUE

STING is a grid-based method that clusters in the full data space using statistical information (such as count, mean, and min/max) stored in a hierarchy of grid cells.
CLIQUE is a grid- and density-based approach to subspace clustering in high-dimensional data spaces.
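
The idea of storing statistics in grid cells can be sketched as follows. This is a rough illustration of a STING-style hierarchy under simplifying assumptions: the records, the 1x1 bottom-level cells, and the 2x2 grouping into parent cells are hypothetical, and only count, mean, min, and max are kept.

```python
from collections import defaultdict


def cell_stats(values):
    """Summary statistics kept per grid cell (for one attribute)."""
    return {"count": len(values), "mean": sum(values) / len(values),
            "min": min(values), "max": max(values)}


def merge_stats(children):
    """Derive a parent cell's statistics from its children without rescanning the data."""
    count = sum(c["count"] for c in children)
    mean = sum(c["mean"] * c["count"] for c in children) / count
    return {"count": count, "mean": mean,
            "min": min(c["min"] for c in children),
            "max": max(c["max"] for c in children)}


# Hypothetical records: (x, y, attribute value) on a 4x4 area.
records = [(0.2, 0.3, 10.0), (0.7, 0.6, 12.0), (1.5, 0.4, 11.0),
           (3.2, 3.1, 50.0), (3.8, 3.6, 54.0)]

# Bottom level: 1x1 cells keyed by integer grid coordinates.
bottom = defaultdict(list)
for x, y, value in records:
    bottom[(int(x), int(y))].append(value)
bottom_stats = {cell: cell_stats(vals) for cell, vals in bottom.items()}

# Next level up: each 2x2 block of bottom cells forms one parent cell.
groups = defaultdict(list)
for (cx, cy), stats in bottom_stats.items():
    groups[(cx // 2, cy // 2)].append(stats)
parent_stats = {cell: merge_stats(children) for cell, children in groups.items()}
```

Queries can then start at the coarse level and descend only into cells whose statistics look relevant, which is what makes the grid approach fast.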

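3. BIRCH vs. CHAMELEON

BIRCH summarizes the data incrementally in a CF-tree of clustering features, so it scales to large data sets but tends to favor roughly spherical clusters.
CHAMELEON merges sub-clusters of a k-nearest-neighbor graph based on their relative interconnectivity and closeness, so it finds clusters of arbitrary shape but scales poorly and is better suited to smaller data sets.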
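
BIRCH scales to large data sets because each sub-cluster is summarized by a clustering feature (point count, linear sum, sum of squares) that can be merged without revisiting the raw points. A minimal sketch of that additive summary, with the class and method names chosen here for illustration only:

```python
import math
from dataclasses import dataclass


@dataclass
class ClusteringFeature:
    """BIRCH-style clustering feature: point count, linear sum, sum of squared norms."""
    n: int
    ls: tuple
    ss: float

    @classmethod
    def from_point(cls, p):
        return cls(1, tuple(p), sum(x * x for x in p))

    def merge(self, other):
        # CFs are additive, so two sub-clusters merge without touching raw points.
        return ClusteringFeature(
            self.n + other.n,
            tuple(a + b for a, b in zip(self.ls, other.ls)),
            self.ss + other.ss,
        )

    def centroid(self):
        return tuple(x / self.n for x in self.ls)

    def radius(self):
        # Root-mean-square distance of the members from the centroid.
        c2 = sum(x * x for x in self.centroid())
        return math.sqrt(max(self.ss / self.n - c2, 0.0))


cf = ClusteringFeature.from_point((1.0, 2.0)).merge(ClusteringFeature.from_point((3.0, 4.0)))
```

The radius computed from such a summary is what biases BIRCH toward compact, roughly spherical clusters; CHAMELEON has no comparable summary and works on the neighbor graph instead.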

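6b. Outline the best clustering method for each task
(i) Finding oil spills along a coast line: a density-based method such as DBSCAN fits, because spills form dense regions of arbitrary shape, their number is not known in advance, and isolated readings can be treated as noise.
(ii) Clustering employees by salary and years of working experience: the data are low-dimensional and numeric, so a partitioning method such as k-means works after normalizing both attributes (k-medoids if salary outliers are a concern).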
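
For a task like locating oil spills, one density-based option is DBSCAN. The sketch below is a minimal pure-Python version under stated assumptions: the coordinates are hypothetical 2-D detections, and `eps` and `min_pts` are illustrative values, not tuned parameters.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # noise reclaimed as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels


# Hypothetical detections: an elongated slick, a compact patch, and one stray reading.
slick = [(i * 0.5, 0.1 * (i % 2)) for i in range(10)]
patch = [(20.0, 5.0), (20.3, 5.2), (20.1, 4.7), (19.8, 5.1)]
stray = [(10.0, 10.0)]
labels = dbscan(slick + patch + stray, eps=0.8, min_pts=3)
```

The elongated slick comes out as a single cluster even though its two ends are far apart, which a centroid-based method such as k-means would not guarantee.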

6c. Why is subspace clustering a good choice for high-dimensional data?
Outline one efficient and effective subspace clustering method that can
cluster a very high-dimensional (e.g., thousands of dimensions) data set.

In high-dimensional data many dimensions are irrelevant, and they can mask existing clusters
with noise. Subspace clustering therefore searches for clusters within different subspaces of the data set.
CLIQUE can find such clusters efficiently in large, high-dimensional data sets: it partitions each dimension into intervals, finds dense units, and builds higher-dimensional candidates only from dense lower-dimensional ones. With thousands of dimensions the number of candidate subspaces can still grow quickly.
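
The bottom-up search used by CLIQUE can be sketched as below. This is a simplified illustration, not the full algorithm: it prunes at the level of whole subspaces rather than individual units, skips the step that joins adjacent dense units into clusters, and uses hypothetical data scaled to [0, 1] with illustrative `bins` and `threshold` values.

```python
from collections import Counter
from itertools import combinations


def dense_units(points, dims, bins, lo, hi, threshold):
    """Count points per grid unit in the subspace `dims`; keep units at or above threshold."""
    width = (hi - lo) / bins
    counts = Counter(
        tuple(min(int((p[d] - lo) / width), bins - 1) for d in dims) for p in points
    )
    return {unit for unit, n in counts.items() if n >= threshold}


def clique_subspaces(points, bins=4, lo=0.0, hi=1.0, threshold=3):
    """Bottom-up search for subspaces that contain dense units (Apriori-style pruning)."""
    n_dims = len(points[0])
    # Level 1: single dimensions that contain at least one dense interval.
    current = {}
    for d in range(n_dims):
        units = dense_units(points, (d,), bins, lo, hi, threshold)
        if units:
            current[(d,)] = units
    result = dict(current)
    while current:
        size = len(next(iter(current))) + 1
        candidates = {
            tuple(sorted(set(a) | set(b)))
            for a, b in combinations(current, 2)
            if len(set(a) | set(b)) == size
        }
        nxt = {}
        for dims in candidates:
            # Prune: every lower-dimensional projection must itself hold dense units.
            if all(sub in current for sub in combinations(dims, size - 1)):
                units = dense_units(points, dims, bins, lo, hi, threshold)
                if units:
                    nxt[dims] = units
        result.update(nxt)
        current = nxt
    return result


# Hypothetical 4-D data: tight in dimensions 0 and 1, spread out in dimensions 2 and 3.
points = [
    (0.10, 0.12, 0.05, 0.90),
    (0.12, 0.15, 0.30, 0.60),
    (0.08, 0.11, 0.55, 0.35),
    (0.15, 0.09, 0.80, 0.10),
    (0.11, 0.14, 0.20, 0.70),
    (0.09, 0.10, 0.65, 0.40),
]
result = clique_subspaces(points)
```

Dimensions 2 and 3 never produce a dense interval, so no subspace containing them is ever examined; that pruning is what keeps the search tractable.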


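6d. Cross-validation can be useful in both classification and clustering.

What are the differences in these two cases?

Clustering groups data by similarity without predefined classes; classification assigns data to predefined classes. Cross-validation differs accordingly.

In classification the held-out fold has known class labels, so cross-validation directly estimates predictive accuracy (or error) on unseen data.
In clustering there are no labels, so cross-validation is used to assess cluster quality or to choose the number of clusters: the clusters are fitted on the training folds, and the held-out points are scored by, for example, their squared distance to the nearest fitted centroid.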
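
The two uses of cross-validation can be put side by side in a short sketch. Everything here is illustrative: the data and labels are hypothetical, KNN and k-means stand in for "a classifier" and "a clustering method", and the held-out squared distance is only one possible clustering score.

```python
import math
from collections import Counter


def folds(n, k):
    """Yield (train_idx, test_idx) index lists for k interleaved folds."""
    for f in range(k):
        test = [i for i in range(n) if i % k == f]
        train = [i for i in range(n) if i % k != f]
        yield train, test


def knn_predict(train, query, k=3):
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def kmeans_centroids(points, k, iters=20):
    centroids = list(points[:k])  # naive deterministic start (assumption for the sketch)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: math.dist(p, centroids[i]))].append(p)
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def cv_accuracy(labeled, n_folds=3):
    """Classification: held-out labels give a direct accuracy measure."""
    correct = 0
    for train, test in folds(len(labeled), n_folds):
        train_set = [labeled[i] for i in train]
        correct += sum(knn_predict(train_set, labeled[i][0]) == labeled[i][1] for i in test)
    return correct / len(labeled)


def cv_heldout_sse(points, k, n_folds=3):
    """Clustering: no labels, so score held-out points by distance to fitted centroids."""
    total = 0.0
    for train, test in folds(len(points), n_folds):
        centroids = kmeans_centroids([points[i] for i in train], k)
        total += sum(min(math.dist(points[i], c) for c in centroids) ** 2 for i in test)
    return total / len(points)


# Hypothetical data: two groups, interleaved so every fold contains both.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1), (7.8, 8.1)]
labeled = [(p, "low" if p[0] < 5 else "high") for p in points]

accuracy = cv_accuracy(labeled)
sse_k1 = cv_heldout_sse(points, k=1)
sse_k2 = cv_heldout_sse(points, k=2)
```

`cv_accuracy` compares predictions with true labels, while `cv_heldout_sse` has nothing to compare against and can only report how well the held-out points fit the clusters; a sharp drop from `sse_k1` to `sse_k2` suggests two clusters fit better than one.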
