# 2) Suppose a multivariate data set has sample covariance matrix S = [36 5; 5 4]

Question

2) Suppose a multivariate data set has sample covariance matrix

$$S = \begin{bmatrix} 36 & 5 \\ 5 & 4 \end{bmatrix}$$

a) Determine both principal components for such a data set, using a PCA based on S. Remember that a principal component is a linear combination of the original variables, so your answer should be in the form of linear combinations of X1 and X2.

b) Determine the correlation matrix R that corresponds to the covariance matrix S.

c) Determine both principal components for such a data set, using a PCA based on R.

Are the PCA results different from those in part (a)? If so, try to explain why they are different.

HELPFUL NOTE 2: We can do a PCA where the correlation matrix is input rather than the data matrix using code such as:

my.pc <- princomp(covmat=my.R); summary(my.pc, loadings=T)

where my.R is the correlation matrix.

Explanation / Answer

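a) The principal components based on S come from the eigen-decomposition of S:

> S <- matrix(c(36, 5, 5, 4), nrow = 2, byrow = TRUE)
> eigen(S)

The eigenvalues are (40 ± sqrt(1124))/2, about 36.763 and 3.237. The unit eigenvectors are (0.9886, 0.1509) and (-0.1509, 0.9886); their signs are arbitrary. So

PC1 = 0.9886 X1 + 0.1509 X2 (about 91.9% of the total variance)

PC2 = -0.1509 X1 + 0.9886 X2 (about 8.1%)

b) The standard deviations are sqrt(36) = 6 and sqrt(4) = 2, so r12 = 5/(6 × 2) = 0.4167:

> R <- cov2cor(S)

R = [1 0.4167; 0.4167 1]

c) Using the code from the helpful note:

> my.pc <- princomp(covmat = R); summary(my.pc, loadings = TRUE)

The eigenvalues of R are 1 ± 0.4167, that is 1.4167 and 0.5833, with unit eigenvectors (0.7071, 0.7071) and (0.7071, -0.7071), again up to sign. Writing Z1 = X1/6 and Z2 = X2/2 for the standardized (centered) variables,

PC1 = 0.7071 Z1 + 0.7071 Z2 (about 70.8% of the total variance)

PC2 = 0.7071 Z1 - 0.7071 Z2 (about 29.2%)

Yes, the results differ from part (a). PCA is not scale invariant: under S, X1 has variance 36 against 4 for X2, so PC1 is dominated by X1. Under R both variables are standardized to variance 1, so they receive equal weight.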
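
The calculations for parts (a) to (c) can be checked with a short script. This is a minimal sketch in base R, assuming the 2×2 matrix S from the question. The object names (`eig_S`, `eig_R`, `pc_R`) are illustrative only, and the signs of the eigenvector columns may differ between R builds.

```r
# Covariance matrix from the question
S <- matrix(c(36, 5,
               5, 4), nrow = 2, byrow = TRUE,
            dimnames = list(c("X1", "X2"), c("X1", "X2")))

# (a) PCA on S: eigenvalues are the PC variances,
#     eigenvector columns are the PC coefficients (loadings)
eig_S <- eigen(S, symmetric = TRUE)
print(round(eig_S$values, 4))   # about 36.7631 and 3.2369
print(round(eig_S$vectors, 4))  # columns (0.9886, 0.1509) and (-0.1509, 0.9886), up to sign

# (b) Correlation matrix: each covariance divided by the product
#     of the two standard deviations (6 and 2 here)
R <- cov2cor(S)
print(round(R, 4))              # off-diagonal entry 5/12, about 0.4167

# (c) PCA on R
eig_R <- eigen(R, symmetric = TRUE)
print(round(eig_R$values, 4))   # about 1.4167 and 0.5833
print(round(eig_R$vectors, 4))  # entries of size 0.7071, up to sign

# Cross-check with the princomp(covmat = ...) call from the helpful note
pc_R <- princomp(covmat = R)
print(summary(pc_R, loadings = TRUE))
```

The squared standard deviations reported by `princomp` should match the eigenvalues of R, which is a quick way to confirm that both routes describe the same decomposition.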