Question

2013, Q3 Question 3 (2 + 24 marks)

(a) In conducting a principal component analysis, R produces the following summary output.

                        PC1    PC2    PC3    PC4    PC5    PC6    PC7
Standard deviation      3.923  3.069  2.867  2.579  2.038  1.887  1.125
Proportion of Variance  0.316  0.194  0.169  0.137  0.085  0.073  0.026
Cumulative Proportion   *      0.510  0.679  *      0.901  0.974  1.000

Compute the missing entries marked *.

(b) Compute the binary metric between the following two tweets.

Tweet 1: assault assistance disadvantaged university students begins
Tweet 2: believe more students doing university better

Explanation / Answer

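a)

The first * is the cumulative proportion at PC1. Nothing has been accumulated before PC1, so it equals PC1's proportion of variance: 0.316.

The second * is the cumulative proportion at PC4: [cumulative proportion at PC3 + proportion of variance at PC4] = 0.679 + 0.137 = 0.816. As a check, adding PC5's proportion gives 0.816 + 0.085 = 0.901, which matches the next entry in the output.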
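The arithmetic above can be reproduced with a short script. This is only a sketch in Python rather than R, and the variable names are illustrative. It takes the running total of the rounded proportions from the question, and also checks that each proportion is the squared standard deviation divided by the total variance.

```python
from itertools import accumulate

# Values copied from the R summary output in the question.
std_dev = [3.923, 3.069, 2.867, 2.579, 2.038, 1.887, 1.125]
prop_var = [0.316, 0.194, 0.169, 0.137, 0.085, 0.073, 0.026]

# Cumulative proportion: running total of the proportions of variance,
# rounded to the three decimals shown in the output.
cumulative = [round(c, 3) for c in accumulate(prop_var)]

# Consistency check: each proportion is that component's variance
# (standard deviation squared) divided by the total variance.
variances = [s ** 2 for s in std_dev]
total_variance = sum(variances)
recomputed = [round(v / total_variance, 3) for v in variances]

for i, (p, c) in enumerate(zip(prop_var, cumulative), start=1):
    print(f"PC{i}: proportion={p:.3f}  cumulative={c:.3f}")
```

The entries for PC1 and PC4 in cumulative are the two missing values. Because the script sums the rounded proportions, as the hand calculation does, a value computed directly from the unrounded variances may differ in the last decimal place.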

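b)

Binary metric

Each tweet is coded as a binary vector over the 10 distinct terms (1 = term present, 0 = term absent). The last column is 1 where the two tweets differ.

Term           Tweet 1  Tweet 2  Differs
assault        1        0        1
assistance     1        0        1
disadvantaged  1        0        1
university     1        1        0
students       1        1        0
begins         1        0        1
believe        0        1        1
more           0        1        1
doing          0        1        1
better         0        1        1
Total                            8

Counting the terms that appear in exactly one of the two tweets, the binary metric between the tweets is 8. If the binary metric is instead defined as in R's dist(..., method = "binary"), that is, the proportion of mismatches among the terms present in at least one tweet, the value is 8/10 = 0.8.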
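The term-by-term comparison can also be sketched in code. The Python snippet below is illustrative, with assumed variable names. It treats each tweet as a set of terms and counts the terms found in exactly one tweet. It also reports that count as a share of all distinct terms, which is the normalised form that R's dist uses with method = "binary". Which of the two the question intends depends on the course's definition.

```python
# Tweets copied from the question.
tweet_1 = "assault assistance disadvantaged university students begins"
tweet_2 = "believe more students doing university better"

# Binary coding: a term is either present in a tweet or not,
# so each tweet reduces to its set of distinct terms.
terms_1 = set(tweet_1.split())
terms_2 = set(tweet_2.split())

only_one = terms_1 ^ terms_2   # terms in exactly one tweet
either = terms_1 | terms_2     # terms in at least one tweet

# Unnormalised count of differing terms.
mismatch_count = len(only_one)

# Normalised variant: mismatches as a share of all distinct terms.
mismatch_share = mismatch_count / len(either)

print(f"terms in exactly one tweet: {mismatch_count}")
print(f"distinct terms overall:     {len(either)}")
print(f"share of mismatches:        {mismatch_share}")
```

The two shared terms, university and students, contribute nothing to the count. The other eight terms each contribute one.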