Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Clustering Methods (80 points) 1. Write an R function to implement the k-means c

ID: 3598562 • Letter: C

Question

Clustering Methods (80 points) 1. Write an R function to implement the k-means clustering algorithm. Your function should allow users to give a data set, k and initial centers as the input parameters. Your function should output cluster label for each state and the converged centers. (30 points) 2. Write an R function to implement an agglomerative hierarchical clustering algorithm. Your function should allow users to give data set and linkage method (way to calculate distances between clusters) as input arguments. Your function must produce the following outputs: the merging process and the height of the dendrogram corresponds to each merging. (30 points) Hints: you can implement a function that outputs clustering memberships at any stage of merging. Also see below for examples: merge an n 1 by 2 matrix. n is the number of states. Row i of merge describes the merging of clusters at step i of the clustering. If an element j in the row is negative, then observation j was merged at this stage. If j is positive then the merge was with the cluster formed at the (earlier) stage j of the algorithm. Thus negative entries in merge indicate agglomerations of singletons, and positive entries indicate agglomerations of non-singletons. height a set of n1 real values. The clustering height: that is, the value of the criterion associated with the clustering method for the particular agglomeration. Use your functions to cluster cancer patients from 21 cancer types. (20 points) Each cancer data is stored in a separate csv file. Each row corresponds to a patient. The first column means survival time and the second column is an indicator of death versus live. You should use ALL the other columns to do clustering patients with different cancer types, which means you will need to combine all the files into one. BONUS: You may use 21 as a reference number but please find an optimal number of clusters (k) to partition your data and justify your choice. (10 points)

Explanation / Answer

package com;

public class Term {

Double coefficient;

int degree;

public Term() {

this(null, 0);

}

public boolean isPositive() {

return coefficient > 0;

}

public Term(Double coefficient, int degree) {

this.coefficient = coefficient;

this.degree = degree;

}

public Double getCoefficient() {

return coefficient;

}

public void setCoefficient(Double coefficient) {

this.coefficient = coefficient;

}

public int getDegree() {

return degree;

}

public void setDegree(int degree) {

this.degree = degree;

}

public String toString() {

String ans = "";

if (coefficient.doubleValue() == 0)

return "";

if (degree == 0)

return coefficient.toString();

if (coefficient != 1) {

if (coefficient == -1)

ans += "- ";

else

ans += coefficient + " ";

}

ans = ans + "X";

if (degree == 1)

return ans;

return ans + "^" + degree;

}

}

Hire Me For All Your Tutoring Needs
Integrity-first tutoring: clear explanations, guidance, and feedback.
Drop an Email at
drjack9650@gmail.com
Chat Now And Get Quote