
Is there anyone that could help me with this R-related question? --- In R, look


Question

Is there anyone that could help me with this R-related question?

---

In R, look at the cars data (i.e. the data frame cars with two columns, cars$speed and cars$dist) included in the standard distribution. Fit a 4th-order polynomial regression of the form

Do any of the regressors have significant t-statistics? Does the regression have a significant F-statistic? Use the command step( model ) to run a backward variable selection to minimize AIC. Which terms remain in the model? What is the final AIC value? Now run a forward selection. Is the resulting model the same as in the backward case?

---

Here AIC is the Akaike Information Criterion. The R code for the model is just lm(dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4), data = cars), and the cars data set is standard inside R. Can anyone help confirm the analysis? Thanks in advance.

$$\text{dist} = \beta_0 + \sum_{i=1}^{4} \beta_i \,(\text{speed})^i$$

Explanation / Answer

Please see the R code

data("cars")
cars

## fit the model

fit <- lm(dist~ speed+I(speed^2)+I(speed^3)+I(speed^4),data=cars)

## use the summary function to see the results

summary(fit)

The results are

summary(fit)

Call:
lm(formula = dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4),
data = cars)

Residuals:
    Min      1Q  Median      3Q     Max
-23.701  -8.766  -2.861   7.158  42.186

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  45.845412  60.849115   0.753    0.455
speed       -18.962244  22.296088  -0.850    0.400
I(speed^2)    2.892190   2.719103   1.064    0.293
I(speed^3)   -0.151951   0.134225  -1.132    0.264
I(speed^4)    0.002799   0.002308   1.213    0.232

None of the p-values is below 0.05, so no individual regressor has a statistically significant t-statistic at the 5% level.

Residual standard error: 15.13 on 45 degrees of freedom
Multiple R-squared: 0.6835,   Adjusted R-squared: 0.6554
F-statistic: 24.3 on 4 and 45 DF, p-value: 9.375e-11

Yes, this is the p-value for the F-statistic. Because it is far below 0.05, the model as a whole is statistically significant. This is a typical symptom of multicollinearity: the powers of speed are highly correlated with one another, so none of the individual terms is significant even though the model as a whole is.
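One way to check the collinearity explanation is to look at the correlations among the powers of speed and to refit the same quartic with orthogonal polynomial terms. This is only a minimal sketch and not part of the original answer; the names powers, fit_raw and fit_orth are illustrative.

```r
# Raw powers of speed, as used in the model above
powers <- with(cars, cbind(speed, speed^2, speed^3, speed^4))
colnames(powers) <- c("speed", "speed2", "speed3", "speed4")

# Pairwise correlations between the four regressors
print(round(cor(powers), 3))

# The same quartic fit, once with raw powers and once with orthogonal polynomials
fit_raw  <- lm(dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4), data = cars)
fit_orth <- lm(dist ~ poly(speed, 4), data = cars)

# Both parameterisations span the same column space, so the fitted values agree
print(all.equal(unname(fitted(fit_raw)), unname(fitted(fit_orth)), tolerance = 1e-6))

# Coefficient table for the orthogonal terms, whose columns are mutually uncorrelated
print(summary(fit_orth)$coefficients)
```

Since poly() produces uncorrelated columns, the t-tests in the last table are not distorted by correlation among the terms, which makes it easier to judge which polynomial degrees matter.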

## perform the backward stepwise regression
step(fit,direction="backward")

the result is

Start: AIC=276.38
dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4)

             Df Sum of Sq   RSS    AIC
- speed       1    165.52 10463 275.18
- I(speed^2)  1    258.90 10557 275.62
- I(speed^3)  1    293.28 10591 275.79
- I(speed^4)  1    336.55 10634 275.99
<none>                    10298 276.38

Step: AIC=275.18
dist ~ I(speed^2) + I(speed^3) + I(speed^4)

             Df Sum of Sq   RSS    AIC
- I(speed^4)  1    402.20 10866 275.07
- I(speed^3)  1    407.78 10871 275.09
<none>                    10463 275.18
- I(speed^2)  1    650.31 11114 276.19

Step: AIC=275.07
dist ~ I(speed^2) + I(speed^3)

             Df Sum of Sq   RSS    AIC
- I(speed^3)  1      5.60 10871 273.09
<none>                    10866 275.07
- I(speed^2)  1    609.51 11475 275.80

Step: AIC=273.09
dist ~ I(speed^2)

             Df Sum of Sq   RSS    AIC
<none>                    10871 273.09
- I(speed^2)  1     21668 32539 325.91

Call:
lm(formula = dist ~ I(speed^2), data = cars)

Coefficients:
(Intercept)   I(speed^2)
      8.860        0.129

The final AIC value is 273.09, and only I(speed^2) remains in the final model.
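The AIC values printed by step() come from extractAIC(), which for a linear model computes n * log(RSS / n) + 2 * edf. This drops additive constants, so it differs from AIC(fit) by an amount that is the same for every model fitted to the same data. The sketch below reproduces the starting value by hand; the names n, rss, edf, aic_manual, aic_step and final are illustrative.

```r
fit <- lm(dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4), data = cars)

n   <- nrow(cars)             # number of observations
rss <- sum(residuals(fit)^2)  # residual sum of squares
edf <- n - df.residual(fit)   # number of estimated coefficients

# AIC as used by step(): n * log(RSS / n) + 2 * edf
aic_manual <- n * log(rss / n) + 2 * edf
aic_step   <- extractAIC(fit)[2]
print(c(manual = aic_manual, extractAIC = aic_step))

# Backward selection without the step-by-step trace
final <- step(fit, direction = "backward", trace = 0)
print(formula(final))
print(extractAIC(final)[2])
```

Setting trace = 0 only suppresses the printed steps; the selected model is the same as in the output above.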

> step(fit,direction="forward")
Start: AIC=276.38
dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4)


Call:
lm(formula = dist ~ speed + I(speed^2) + I(speed^3) + I(speed^4),
data = cars)

Coefficients:
(Intercept)        speed   I(speed^2)   I(speed^3)   I(speed^4)
  45.845412   -18.962244     2.892190    -0.151951     0.002799

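The forward call returns the full model unchanged: step() was started from the full model, so there are no terms left to add. The two results therefore differ. Backward selection keeps only I(speed^2), while this forward call keeps all four terms. A forward search in the usual sense would start from the intercept-only model, with a scope argument listing the candidate terms.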
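For comparison, a forward selection that starts from the intercept-only model could be set up as sketched below. This is not part of the original answer, and the names null_fit, upper and forward_fit are illustrative. No result is quoted here, so the selected terms and AIC should be read from the output when the code is run.

```r
# Intercept-only starting model
null_fit <- lm(dist ~ 1, data = cars)

# Largest model the search is allowed to reach (one-sided formula)
upper <- ~ speed + I(speed^2) + I(speed^3) + I(speed^4)

# Forward selection: add one term at a time while AIC decreases
forward_fit <- step(null_fit,
                    scope = list(lower = ~ 1, upper = upper),
                    direction = "forward")

print(formula(forward_fit))
print(extractAIC(forward_fit)[2])
```

Comparing formula(forward_fit) with the backward result shows whether the two searches agree when each starts from its natural end of the model range.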
