Question

• Unless otherwise stated, use a 5% level (α = 0.05) in all tests.

2. Using the seatpos data, fit the linear regression of hipcenter on all of the other variables.

(a) Produce a summary of the regression results.

(b) Do any variables appear to be significant based on the individual t-tests for their coefficients? What about based on the overall F-test (for all of the variables together)?

(c) Compute the variance inflation factors (VIFs) for the variables. Using the threshold of 10 to determine if a VIF indicates a problem of collinearity, which variables have a VIF indicating a possible problem?

(d) Reduce the model by removing all variables that had VIFs you identified as problematic in the previous part. Produce a summary of the regression results.

(e) For the model of the previous part, do any variables appear to be significant based on the individual t-tests for their coefficients? What about based on the overall F-test (for all of the variables together)? Compute the VIFs for the reduced set of variables. (Have they changed?) Again using the threshold of 10, which variables have a VIF indicating a possible problem?

(f)

Explanation / Answer

# Library in which dataset is stored

library(faraway)

# top six rows of the dataset

head(seatpos)

  Age Weight HtShoes    Ht Seated  Arm Thigh  Leg hipcenter
1  46    180   187.2 184.9   95.2 36.1  45.3 41.3  -206.300
2  31    175   167.5 165.5   83.8 32.9  36.5 35.9  -178.210
3  23    100   153.6 152.2   82.9 26.0  36.6 31.0   -71.673
4  19    185   190.3 187.4   97.3 37.4  44.1 41.0  -257.720
5  23    159   178.0 174.1   93.9 29.5  40.1 36.9  -173.230
6  47    170   178.7 177.0   92.4 36.0  43.2 37.4  -185.150

The dataset contains the following variables:

Age - Age in years

Weight - Weight in lbs

HtShoes - Height in shoes in cm

Ht - Height barefoot in cm

Seated - Seated height in cm

Arm - Lower arm length in cm

Thigh - Thigh length in cm

Leg - Lower leg length in cm

hipcenter - Horizontal distance of the midpoint of the hips from a fixed location in the car in mm

# Fit the linear regression of hipcenter on all other variables using lm()

seatpos_lm = lm(hipcenter ~ Age + Weight + HtShoes + Ht + Seated + Arm + Thigh + Leg, data=seatpos)

a)

#summary of our regression results

summary(seatpos_lm)

Call:

b)

At α = 0.05:

None of the individual coefficient p-values is below 0.05, so no variable is significant on its own t-test.

However, the model as a whole is statistically significant: the p-value of the overall F-test is less than 0.05. A significant overall fit with no individually significant coefficient is a common symptom of collinearity among the predictors.
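The two checks above can also be read off the fitted model programmatically rather than by eye. The following is a minimal sketch; the helper names `coef_pvalues` and `overall_f_pvalue` are hypothetical (they are not part of faraway or base R), and the code assumes the model was fitted with lm() and includes an intercept.

```r
# Hypothetical helpers, not part of any package.

# p-values of the individual t-tests, one per coefficient (including the intercept)
coef_pvalues <- function(model) {
  summary(model)$coefficients[, "Pr(>|t|)"]
}

# p-value of the overall F-test, recomputed from the F statistic
# and its degrees of freedom as stored by summary.lm()
overall_f_pvalue <- function(model) {
  f <- summary(model)$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
}

# Usage, assuming seatpos_lm has been fitted as above:
# coef_pvalues(seatpos_lm) < 0.05      # which coefficients are individually significant
# overall_f_pvalue(seatpos_lm) < 0.05  # is the model significant as a whole
```

summary() prints the overall p-value but does not store it, which is why the sketch recomputes it with pf().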

c)

#VIFs of all the variables

vif(seatpos_lm)

       Age     Weight    HtShoes         Ht     Seated        Arm      Thigh        Leg
  1.997931   3.647030 307.429378 333.137832   8.951054   4.496368   2.762886   6.694291

With a threshold of 10, two variables have problematic VIFs: HtShoes (about 307) and Ht (about 333). Both exceed 300, which indicates strong collinearity, as expected since height with and without shoes measures nearly the same thing. All other VIFs are below 10.
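To see where these numbers come from: the VIF of predictor j is 1 / (1 - R_j^2), where R_j^2 is the R-squared from regressing predictor j on all the other predictors. The sketch below computes this by hand; `manual_vif` is a hypothetical helper name, and the code assumes a model fitted with lm() whose first design-matrix column is the intercept.

```r
# Hypothetical helper, not part of faraway: VIF_j = 1 / (1 - R_j^2)
manual_vif <- function(model) {
  X <- model.matrix(model)[, -1, drop = FALSE]  # design matrix without the intercept
  sapply(colnames(X), function(v) {
    others <- X[, colnames(X) != v, drop = FALSE]
    # R-squared from regressing predictor v on the remaining predictors
    r2 <- summary(lm(X[, v] ~ others))$r.squared
    1 / (1 - r2)
  })
}

# Usage, assuming seatpos_lm has been fitted as above:
# manual_vif(seatpos_lm)   # should agree with vif(seatpos_lm)
```

A VIF of 10 corresponds to R_j^2 = 0.9, so the threshold flags any predictor that is at least 90% explained by the others.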

d)

# new Regression model after removing the problematic variables

seatpos_lm1 = lm(hipcenter ~ Age + Weight + Seated + Arm + Thigh + Leg, data=seatpos)

summary(seatpos_lm1)

Call:

e)

Now, only the variable Leg appears to be significant on the individual t-tests.

The overall model is still significant on the F-test, with a slight decrease in R². The overall p-value is also about one-tenth of that of the previous model.
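The comparison between the two models can be put side by side instead of reading two summaries. This is only a sketch: `fit_stats` is a hypothetical helper name, and it assumes every model passed in was fitted with lm() and includes an intercept.

```r
# Hypothetical helper, not part of any package:
# one row of fit statistics per model passed in.
fit_stats <- function(...) {
  models <- list(...)
  t(sapply(models, function(m) {
    s <- summary(m)
    f <- s$fstatistic
    c(r.squared     = s$r.squared,
      adj.r.squared = s$adj.r.squared,
      # overall F-test p-value, recomputed from the stored F statistic
      f.pvalue      = unname(pf(f["value"], f["numdf"], f["dendf"],
                                lower.tail = FALSE)))
  }))
}

# Usage, assuming both models have been fitted as above:
# fit_stats(full = seatpos_lm, reduced = seatpos_lm1)
```

Because the reduced model is nested in the full one, its R² can never be higher; the adjusted R² and the F-test p-value are the fairer points of comparison.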

f)

vif(seatpos_lm1)

None of the predictors now has a VIF above the threshold of 10.
