
Question

URL for data: https://raw.githubusercontent.com/gweon/stat420/master/hw6-data-1.csv

This is an R programming question.

1. For question 1, import the data in R from https://raw.githubusercontent.com/gweon/stat420/master/hw6-data-1.csv. The data set contains 100 observations with 4 variables: y (response), x1, x2 and x3. Consider the MLR.

(a) Obtain the fitted model. Make a prediction at x1 = 3, x2 = 70 and x3 = 10. (1 pt)

(e) Check the equal variance and the normality assumptions using appropriate statistical tests (α = 0.05). (1 pt)

(f) Was the three-way interaction term needed? Why/why not? (1 pt)

(g) Test the null hypothesis: -As = -Pr = 0 at α = 0.05. (1 pt)

Explanation / Answer

rm(list=ls())

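# Install olsrr only if it is not already available
if (!requireNamespace("olsrr", quietly = TRUE)) install.packages("olsrr")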

library(olsrr)

## Import Data

data<-read.csv("https://raw.githubusercontent.com/gweon/stat420/master/hw6-data-1.csv")

# Create the interaction variables (products of x1, x2 and x3)

data$x4<-data$x1*data$x2

data$x5<-data$x1*data$x3

data$x6<-data$x2*data$x3

data$x7<-data$x1*data$x2*data$x3

################ (a) Fitted equation and its prediction for given values of x1, x2, x3

### Fitting MLR

fit<-lm(y~.,data=data)

summary(fit)

# All estimates and summary of fitted regression

# Call:

# lm(formula = y ~ ., data = data)

#

# Residuals:

# Min 1Q Median 3Q Max

# -6.034 -2.224 -0.081 2.121 7.264

#

# Coefficients:

# Estimate Std. Error t value Pr(>|t|)

# (Intercept) 7.327393 3.559242 2.059 0.0424 *

# x1 1.709184 1.251519 1.366 0.1754

# x2 -0.166497 0.059186 -2.813 0.0060 **

# x3 0.561826 0.312254 1.799 0.0753 .

# x4 0.038134 0.020579 1.853 0.0671 .

# x5 0.121700 0.110824 1.098 0.2750

# x6 -0.003239 0.005007 -0.647 0.5193

# x7 -0.001350 0.001735 -0.778 0.4385

# ---

# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

#

# Residual standard error: 3.336 on 92 degrees of freedom

# Multiple R-squared: 0.8574, Adjusted R-squared: 0.8466

# F-statistic: 79.04 on 7 and 92 DF, p-value: < 2.2e-16
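The same model can be written without building x4 to x7 by hand: in an R formula, `x1 * x2 * x3` expands to the three main effects, the three two-way interactions and the three-way interaction. The sketch below uses simulated data so that it runs without the course file; the data frame `sim` and its coefficients are made up for illustration and are not the homework data.

```r
# Simulated stand-in for the course data (hypothetical values, illustration only)
set.seed(1)
n <- 100
sim <- data.frame(x1 = runif(n, 0, 5), x2 = runif(n, 40, 90), x3 = runif(n, 0, 20))
sim$y <- 5 + 2 * sim$x1 - 0.1 * sim$x2 + 0.5 * sim$x3 +
  0.03 * sim$x1 * sim$x2 + rnorm(n, sd = 3)

# Hand-built product columns, as in the answer above
manual <- transform(sim, x4 = x1 * x2, x5 = x1 * x3, x6 = x2 * x3, x7 = x1 * x2 * x3)
fit_manual <- lm(y ~ ., data = manual)

# Formula shorthand: R builds the interaction columns itself
fit_formula <- lm(y ~ x1 * x2 * x3, data = sim)

# Both specifications should give the same fitted values
same_fit <- isTRUE(all.equal(unname(fitted(fit_manual)), unname(fitted(fit_formula))))
print(same_fit)

# With the formula version, a prediction only needs x1, x2 and x3
new_obs <- data.frame(x1 = 3, x2 = 70, x3 = 10)
pred_formula <- predict(fit_formula, newdata = new_obs)
pred_manual <- predict(fit_manual,
                       newdata = transform(new_obs, x4 = x1 * x2, x5 = x1 * x3,
                                           x6 = x2 * x3, x7 = x1 * x2 * x3))
print(c(pred_formula, pred_manual))
```

The formula version avoids the risk of typing a wrong product (for example 700 for x2*x3) when predicting at a new point.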

### Prediction for given values of x1=3, x2=70 and x3=10

pred<-predict.lm(fit,newdata=data.frame("x1"=3,"x2"=70,"x3"=10,"x4"=210,"x5"=30,"x6"=700,"x7"=2100))

# pred= 12.97488
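As a cross-check, the prediction can be recomputed by hand from the coefficient estimates printed in the summary above. The sketch below copies those rounded estimates into a vector (the names `b`, `x_new` and `pred_by_hand` are illustrative); because the printed estimates are rounded, the result agrees with `pred` only to about three decimal places.

```r
# Estimates copied from the summary(fit) output: intercept, x1, ..., x7
b <- c(7.327393, 1.709184, -0.166497, 0.561826,
       0.038134, 0.121700, -0.003239, -0.001350)

x1 <- 3; x2 <- 70; x3 <- 10

# Regressor vector in the same order: 1, x1, x2, x3, x1*x2, x1*x3, x2*x3, x1*x2*x3
x_new <- c(1, x1, x2, x3, x1 * x2, x1 * x3, x2 * x3, x1 * x2 * x3)

pred_by_hand <- sum(b * x_new)
print(round(pred_by_hand, 3))  # 12.975, close to the reported 12.97488
```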

################ (b) Normality assumption testing

## Testing normality assumptions

norm_test<-ols_test_normality(fit) # named ols_norm_test() in older olsrr releases

# norm_test

# -----------------------------------------------

# Test Statistic pvalue  

# -----------------------------------------------

# Shapiro-Wilk 0.9844 0.2875

# Kolmogorov-Smirnov 0.0449 0.9877

# Cramer-von Mises 6.899 0.0000

# Anderson-Darling 0.243 0.7612

# -----------------------------------------------

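# Result: The Shapiro-Wilk, Kolmogorov-Smirnov and Anderson-Darling p-values (0.2875, 0.9877, 0.7612) all exceed 0.05, so normality of the residuals is not rejected. The Cramer-von Mises row (p = 0.0000) disagrees with the other three and should be verified separately before it is relied on.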
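The question also asks for an equal variance check, which the answer does not show. One common choice is the Breusch-Pagan test; the sketch below computes its studentized (Koenker) form in base R by regressing the squared residuals on the model's regressors, and runs Shapiro-Wilk on the residuals for normality. It uses simulated data (an assumption, so that it runs offline), so the numbers it prints say nothing about the homework data. Packaged versions such as `lmtest::bptest()` or `olsrr::ols_test_breusch_pagan()` could be used instead.

```r
# Simulated stand-in for the course data (hypothetical values, illustration only)
set.seed(42)
n <- 100
sim <- data.frame(x1 = runif(n, 0, 5), x2 = runif(n, 40, 90), x3 = runif(n, 0, 20))
sim$y <- 5 + 2 * sim$x1 - 0.1 * sim$x2 + 0.5 * sim$x3 + rnorm(n, sd = 3)
fit_sim <- lm(y ~ x1 * x2 * x3, data = sim)

# Studentized Breusch-Pagan test
# H0: constant error variance; statistic = n * R^2 of the auxiliary regression
X <- model.matrix(fit_sim)                 # design matrix, including the intercept
e2 <- residuals(fit_sim)^2                 # squared residuals
aux <- lm.fit(X, e2)                       # auxiliary regression of e^2 on the regressors
r2_aux <- 1 - sum(aux$residuals^2) / sum((e2 - mean(e2))^2)
bp_stat <- n * r2_aux
bp_df <- ncol(X) - 1                       # number of regressors, excluding the intercept
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
cat("Breusch-Pagan: stat =", bp_stat, "df =", bp_df, "p =", bp_p, "\n")

# Shapiro-Wilk test on the residuals
# H0: residuals are normally distributed
sw <- shapiro.test(residuals(fit_sim))
cat("Shapiro-Wilk: W =", unname(sw$statistic), "p =", sw$p.value, "\n")
```

In both tests, a p-value above 0.05 means the assumption is not rejected at the 5% level.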

############# (c) Was the three-way interaction term needed?

# Result: In the summary table of the fitted model, the p-value for the three-way interaction term x7 = x1*x2*x3 is 0.4385 > 0.05, so we fail to reject H0: its coefficient is 0. Given the main effects and two-way interactions already in the model, the three-way interaction term is not needed.
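The t-test on a single coefficient is equivalent to a nested-model F-test that compares the full model with the model that drops that one term: the F statistic equals the squared t statistic and the p-values match. The sketch below illustrates this on simulated data (hypothetical values, not the homework data).

```r
# Simulated stand-in for the course data (hypothetical values, illustration only)
set.seed(7)
n <- 100
sim <- data.frame(x1 = runif(n, 0, 5), x2 = runif(n, 40, 90), x3 = runif(n, 0, 20))
sim$y <- 5 + 2 * sim$x1 - 0.1 * sim$x2 + 0.5 * sim$x3 +
  0.03 * sim$x1 * sim$x2 + rnorm(n, sd = 3)

# Full model: main effects, two-way interactions and the three-way interaction
full <- lm(y ~ x1 * x2 * x3, data = sim)

# Reduced model: (x1 + x2 + x3)^2 keeps main effects and two-way interactions only
reduced <- lm(y ~ (x1 + x2 + x3)^2, data = sim)

# Nested F-test for H0: coefficient of x1:x2:x3 is 0
cmp <- anova(reduced, full)
f_stat <- cmp$F[2]
f_p <- cmp$`Pr(>F)`[2]

# t-test for the same coefficient, taken from the full model's summary
t_row <- summary(full)$coefficients["x1:x2:x3", ]
t_stat <- unname(t_row["t value"])
t_p <- unname(t_row["Pr(>|t|)"])

cat("F =", f_stat, " t^2 =", t_stat^2, "\n")
cat("p (F-test) =", f_p, " p (t-test) =", t_p, "\n")
```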

############ (d) Checking significance of x4, x5, x6, x7 (each fitted alone in a simple regression on y)

fitx4<-lm(y~x4,data=data)

summary.lm(fitx4)

# Call:

# lm(formula = y ~ x4, data = data)

#

# Residuals:

# Min 1Q Median 3Q Max

# -15.0801 -6.2853 -0.1483 6.9386 20.6762

#

# Coefficients:

# Estimate Std. Error t value Pr(>|t|)   

# (Intercept) 10.378217 1.254125 8.275 6.54e-13 ***

# x4 0.023761 0.008363 2.841 0.00547 **

# ---

# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

#

# Residual standard error: 8.227 on 98 degrees of freedom

# Multiple R-squared: 0.0761, Adjusted R-squared: 0.06667

# F-statistic: 8.072 on 1 and 98 DF, p-value: 0.005468

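# Result: In a simple regression of y on x4 = x1*x2 alone, the slope is significant (p = 0.00547 < 0.05). This is a marginal association only; in the full model the same term has p = 0.0671.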

fitx5<-lm(y~x5,data=data)

summary.lm(fitx5)

# Call:

# lm(formula = y ~ x5, data = data)

#

# Residuals:

# Min 1Q Median 3Q Max

# -11.153 -4.765 1.312 4.093 11.435

#

# Coefficients:

# Estimate Std. Error t value Pr(>|t|)   

# (Intercept) 6.69489 0.89196 7.506 2.83e-11 ***

# x5 0.26815 0.02742 9.778 3.67e-16 ***

# ---

# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

#

# Residual standard error: 6.09 on 98 degrees of freedom

# Multiple R-squared: 0.4938, Adjusted R-squared: 0.4887

# F-statistic: 95.61 on 1 and 98 DF, p-value: 3.67e-16

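# Result: In a simple regression of y on x5 = x1*x3 alone, the slope is significant (p = 3.67e-16 < 0.05). This is a marginal association only; in the full model the same term has p = 0.2750.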

fitx6<-lm(y~x6,data=data)

summary.lm(fitx6)

# Call:

# lm(formula = y ~ x6, data = data)

#

# Residuals:

# Min 1Q Median 3Q Max

# -16.0915 -5.6810 -0.6042 7.4371 18.0240

#

# Coefficients:

# Estimate Std. Error t value Pr(>|t|)   

# (Intercept) 14.449562 1.268078 11.395 <2e-16 ***

# x6 -0.002772 0.001893 -1.464 0.146   

# ---

# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

#

# Residual standard error: 8.467 on 98 degrees of freedom

# Multiple R-squared: 0.02141, Adjusted R-squared: 0.01142

# F-statistic: 2.144 on 1 and 98 DF, p-value: 0.1464

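# Result: In a simple regression of y on x6 = x2*x3 alone, the slope is not significant (p = 0.146 > 0.05). In the full model the same term is also not significant (p = 0.5193).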

fitx7<-lm(y~x7,data=data)

summary.lm(fitx7)

# Call:

# lm(formula = y ~ x7, data = data)

#

# Residuals:

# Min 1Q Median 3Q Max

# -14.3581 -6.0096 -0.0381 6.8715 20.0486

#

# Coefficients:

# Estimate Std. Error t value Pr(>|t|)   

# (Intercept) 1.082e+01 1.019e+00 10.620 <2e-16 ***

# x7 1.959e-03 5.439e-04 3.601 5e-04 ***

# ---

# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

#

# Residual standard error: 8.044 on 98 degrees of freedom

# Multiple R-squared: 0.1168, Adjusted R-squared: 0.1078

# F-statistic: 12.96 on 1 and 98 DF, p-value: 0.0005003

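# Result: In a simple regression of y on x7 = x1*x2*x3 alone, the slope is significant (p = 5e-04 < 0.05). This is a marginal association only; in the full model the same term is not significant (p = 0.4385).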
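Separate simple regressions do not test whether several coefficients are zero at the same time in the full model. If the null hypothesis in the last part sets a group of coefficients to zero jointly (the exact hypothesis is garbled in the question, so testing all four interaction terms here is an assumption), the usual tool is a partial F-test comparing the reduced and full models. The sketch below runs it with `anova()` and repeats the calculation from the residual sums of squares, on simulated data (hypothetical values, not the homework data).

```r
# Simulated stand-in for the course data (hypothetical values, illustration only)
set.seed(11)
n <- 100
sim <- data.frame(x1 = runif(n, 0, 5), x2 = runif(n, 40, 90), x3 = runif(n, 0, 20))
sim$y <- 5 + 2 * sim$x1 - 0.1 * sim$x2 + 0.5 * sim$x3 +
  0.03 * sim$x1 * sim$x2 + rnorm(n, sd = 3)

full <- lm(y ~ x1 * x2 * x3, data = sim)      # all main effects and interactions
reduced <- lm(y ~ x1 + x2 + x3, data = sim)   # main effects only

# Partial F-test for H0: all four interaction coefficients are 0
cmp <- anova(reduced, full)
f_anova <- cmp$F[2]
p_anova <- cmp$`Pr(>F)`[2]

# Same statistic from the residual sums of squares
rss_r <- deviance(reduced)
rss_f <- deviance(full)
q <- df.residual(reduced) - df.residual(full)  # number of coefficients tested
df_f <- df.residual(full)
f_hand <- ((rss_r - rss_f) / q) / (rss_f / df_f)
p_hand <- pf(f_hand, q, df_f, lower.tail = FALSE)

cat("F (anova) =", f_anova, " F (by hand) =", f_hand, "\n")
cat("p (anova) =", p_anova, " p (by hand) =", p_hand, "\n")
```

The null hypothesis is rejected at the 5% level when the p-value is below 0.05; a different set of coefficients can be tested by changing which terms the reduced model leaves out.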