Question

2. For question 2, you will use a subset of the mtcars data. (Run "?mtcars" for detailed information about the data.) Run the following R codes and use the mtcars2 dataset to answer the questions (a)-(e). Please include your R codes and output for all the following questions.

R codes:

set.seed(100)
sub_index = sample(nrow(mtcars), 20, replace=FALSE)
mtcars2 = mtcars[sub_index,]

Now, the mtcars2 dataset contains 11 attributes for 20 automobiles. Our interest is to construct a multiple linear regression model that predicts the fuel consumption (Y = mpg) from number of cylinders (X1 = cyl) and horsepower (X2 = hp).

(a) Fit the regression model: $Y_i = \beta_0 + \beta_{cyl} X_{i1} + \beta_{hp} X_{i2} + \epsilon_i$. What percentage of the variation in fuel consumption is explained by your fitted model? (1 pt)

(b) Using the fitted model, predict the fuel consumption of a six-cylinder car with horsepower 150. (1 pt)

(c) Test $H_0: \beta_{hp} = 0$ vs $H_a: \beta_{hp} \neq 0$ at $\alpha = 0.05$. (1 pt)

(d) Fit another regression model: $Y_i = \beta_0 + \beta_{hp} X_{i2} + \epsilon_i$ using the hp (horsepower) predictor only. Test [...]

(e) Were your conclusions from (c) and (d) consistent? If not, how can the contradictory results be explained? (1 pt)

Explanation / Answer

The R code and output for parts (a) to (d) are provided below, followed by the written answers. Comment lines in the code mark which part of the question each section belongs to.

> # SETTING UP THE DATASET
> set.seed(100)
> sub_index=sample(nrow(mtcars),20,replace=F)
> mtcars2=mtcars[sub_index,]
>
> # PART A
> mod1 = lm(mpg ~ cyl + hp,mtcars2)
> summary(mod1)

Call:
lm(formula = mpg ~ cyl + hp, data = mtcars2)

Residuals:
    Min      1Q  Median      3Q     Max
-5.1775 -1.8221  0.3529  0.9954  6.8980

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.009142   2.541303  15.350 2.15e-11 ***
cyl         -2.845021   0.652236  -4.362 0.000424 ***
hp          -0.009647   0.016575  -0.582 0.568181
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.048 on 17 degrees of freedom
Multiple R-squared: 0.8019, Adjusted R-squared: 0.7786
F-statistic: 34.41 on 2 and 17 DF, p-value: 1.056e-06

>
> # PART B
> coef(mod1)[1] + (6 * coef(mod1)[2]) + (150 * coef(mod1)[3])
(Intercept)
20.49195
>
> # PART C
> summary(mod1)

Call:
lm(formula = mpg ~ cyl + hp, data = mtcars2)

Residuals:
    Min      1Q  Median      3Q     Max
-5.1775 -1.8221  0.3529  0.9954  6.8980

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 39.009142   2.541303  15.350 2.15e-11 ***
cyl         -2.845021   0.652236  -4.362 0.000424 ***
hp          -0.009647   0.016575  -0.582 0.568181
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 3.048 on 17 degrees of freedom
Multiple R-squared: 0.8019, Adjusted R-squared: 0.7786
F-statistic: 34.41 on 2 and 17 DF, p-value: 1.056e-06

>
> # PART D
> mod2 = lm(mpg ~ hp,mtcars2)
> summary(mod2)

Call:
lm(formula = mpg ~ hp, data = mtcars2)

Residuals:
   Min     1Q Median     3Q    Max
-5.677 -2.506 -1.202  1.828  8.258

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 30.08342    2.13199  14.110 3.57e-11 ***
hp          -0.06832    0.01370  -4.988 9.54e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 4.313 on 18 degrees of freedom
Multiple R-squared: 0.5802, Adjusted R-squared: 0.5569
F-statistic: 24.88 on 1 and 18 DF, p-value: 9.539e-05
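
A note on reproducing the transcript above: from R 3.6.0 onward, the default algorithm behind sample() changed, so set.seed(100) may select a different set of 20 rows than an older R installation would, and the fitted coefficients would then differ from those printed here. Which R version produced the transcript is not stated, so treating it as pre-3.6.0 output is only an assumption. A minimal sketch for comparing the two samplers (requires R 3.6.0 or later; the names idx_default and idx_rounding are illustrative):

```r
# Draw the 20-row subset under the current default sampler.
set.seed(100)
idx_default <- sample(nrow(mtcars), 20, replace = FALSE)

# Draw it again under the pre-3.6.0 sampler ("Rounding").
# suppressWarnings() hides the warning R emits about this sampler being non-uniform.
suppressWarnings(RNGkind(sample.kind = "Rounding"))
set.seed(100)
idx_rounding <- sample(nrow(mtcars), 20, replace = FALSE)

# Restore the default sampler so later code is unaffected.
RNGkind(sample.kind = "default")

# TRUE only if both samplers happened to pick the same rows in the same order.
identical(idx_default, idx_rounding)

# Subsets to feed into lm(); compare each fit against the transcript.
mtcars2_default  <- mtcars[idx_default, ]
mtcars2_rounding <- mtcars[idx_rounding, ]
```

If the coefficients from one of these subsets match the printed summary, that subset is the one to use when checking the answers below.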


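(a) From the output of PART A, we see that the fitted regression model is: mpg = 39.009142 - (2.845021 * cyl) - (0.009647 * hp).
Here, Multiple R-squared = 0.8019. Hence, 80.19% of the variation in fuel consumption is explained by our fitted model. (The value 0.7786 is the adjusted R-squared, which penalizes for the number of predictors and is not the percentage of variation explained.)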
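
The summary output reports two R-squared figures. They are linked by the standard formula adjusted R-squared = 1 - (1 - R-squared) * (n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. Multiple R-squared is the proportion of variation explained; the adjusted figure is a penalized version of it. A short check using only the numbers printed in the PART A output:

```r
# Values copied from the summary(mod1) output in the transcript.
r2 <- 0.8019   # Multiple R-squared
n  <- 20       # observations in mtcars2
p  <- 2        # predictors: cyl and hp

# Adjusted R-squared penalizes R-squared for the number of predictors.
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)   # 0.7786, the "Adjusted R-squared" line of the output
```

On a fitted model, the same two quantities are available as summary(mod1)$r.squared and summary(mod1)$adj.r.squared.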

(b) From the output of PART B, we see that the predicted fuel consumption is 20.49 miles/gallon.
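
The hand-written expression in PART B can also be obtained with predict(), which avoids indexing coefficients by position. The sketch below rebuilds the subset and the model so that it runs on its own; the numeric result depends on which rows sample() selects in your R version, so it may not equal 20.49. The names new_car, pred_manual and pred_fn are illustrative.

```r
# Rebuild the subset and the two-predictor model.
set.seed(100)
sub_index <- sample(nrow(mtcars), 20, replace = FALSE)
mtcars2 <- mtcars[sub_index, ]
mod1 <- lm(mpg ~ cyl + hp, data = mtcars2)

# The car to predict for: six cylinders, 150 horsepower.
new_car <- data.frame(cyl = 6, hp = 150)

# Manual calculation, as in PART B (unname() drops the "(Intercept)" label).
pred_manual <- unname(coef(mod1)[1] + 6 * coef(mod1)[2] + 150 * coef(mod1)[3])

# Same prediction through predict().
pred_fn <- unname(predict(mod1, newdata = new_car))
all.equal(pred_manual, pred_fn)

# Optionally, a 95% prediction interval around the point prediction.
predict(mod1, newdata = new_car, interval = "prediction", level = 0.95)
```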

(c) From the output of PART C, we see that the p-value for "hp" is 0.568, which is greater than 0.05, so we fail to reject H0. At the 5% level of significance, there is no evidence that "hp" contributes to the model once "cyl" is already included.

(d) From the output of PART D, we see that the fitted regression model is: mpg = 30.08342 - (0.06832 * hp). The p-value for "hp" is 9.54e-05, which is less than 0.05, so we reject H0. The "hp" variable is significant in this model at the 5% level of significance.
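
The t values and p-values used in (c) and (d) can be recomputed from the printed estimates and standard errors: t = estimate / standard error, compared against a t distribution with the residual degrees of freedom. The helper coef_t_test below is a hypothetical name introduced for illustration, and its inputs are the rounded numbers from the transcript, so results agree with the output only up to rounding.

```r
# Two-sided t test for one regression coefficient,
# given its estimate, standard error and residual degrees of freedom.
coef_t_test <- function(estimate, std_error, df, alpha = 0.05) {
  t_stat  <- estimate / std_error
  p_value <- 2 * pt(-abs(t_stat), df)
  list(t = t_stat, p = p_value, reject = p_value < alpha)
}

# hp in mpg ~ cyl + hp: 17 residual df (PART C output).
test_c <- coef_t_test(-0.009647, 0.016575, df = 17)

# hp in mpg ~ hp: 18 residual df (PART D output).
test_d <- coef_t_test(-0.06832, 0.01370, df = 18)

test_c$reject   # FALSE: fail to reject H0 in the two-predictor model
test_d$reject   # TRUE: reject H0 in the hp-only model
```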

(e) The conclusions from parts (c) and (d) were not consistent. The contradiction arises because the two tests answer different questions. In the first model, the t-test for "hp" asks whether "hp" explains anything about mpg beyond what "cyl" already explains. Cars with more cylinders tend to have more horsepower, so the two predictors carry overlapping information (multicollinearity). The "cyl" variable accounts for most of that shared information, so it is significant while "hp" becomes insignificant.
In the second model, "hp" is the only predictor, so it picks up that shared information on its own. Hence, it becomes significant.
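
One way to examine the overlap between "cyl" and "hp" is to look at their correlation and the resulting variance inflation factor, and to confirm that the t-test for "hp" in the two-predictor model is the same as a partial F test of adding "hp" to a model that already contains "cyl". This is a sketch only: the numbers depend on the rows sample() selects in your R version, and mod_cyl is an extra model fitted here for illustration, not part of the original answer.

```r
# Rebuild the subset so this block runs on its own.
set.seed(100)
sub_index <- sample(nrow(mtcars), 20, replace = FALSE)
mtcars2 <- mtcars[sub_index, ]

# Correlation between the two predictors in the subset.
r <- cor(mtcars2$cyl, mtcars2$hp)

# With exactly two predictors, both share the same variance inflation factor.
vif <- 1 / (1 - r^2)

# Partial F test: does hp improve on a model that already has cyl?
mod_cyl <- lm(mpg ~ cyl, data = mtcars2)
mod1    <- lm(mpg ~ cyl + hp, data = mtcars2)
f_hp <- anova(mod_cyl, mod1)$F[2]

# The partial F statistic equals the squared t value of hp in mod1.
t_hp <- coef(summary(mod1))["hp", "t value"]
all.equal(f_hp, t_hp^2)

c(correlation = r, vif = vif)
```

A correlation close to 1 in absolute value (and a correspondingly large inflation factor) would support the multicollinearity explanation given in (e).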
