By using R Language, A chemical engineer studied the effect of the amount of sur

Question

Using R: A chemical engineer studied the effect of the amount of surfactant and time on clathrate formation. Clathrates are used as cool storage media. The dataset is Clathrate.txt. Here are the variables:

y: Clathrate formation (mass %)

x1: Amount of surfactant (mass %)

x2: Time (minutes)

(a) Fit a multiple linear regression model relating clathrate formation to these regressors.

(b) Test if the model you fit in part (a) is significant. What conclusions can you draw?

(c) Use t tests to assess the contribution of each regressor to the model. Discuss your findings.

(d) What is the R^2 for the model? Compare it to the R^2 for the simple linear regression model relating clathrate formation to time. Discuss your results.

(e) Find a 95% confidence interval for the regression coefficient for time for both models in part (d). Discuss any differences.

(f) Construct a normality plot of the residuals from the full model. Does there seem to be any problem with the normality assumption?

(g) Construct and interpret a plot of the residuals versus the predicted response.

(h) Find the values of the Cook's distance and construct the Cook's Distance plot.

Explanation / Answer

a) Fitting the multiple regression model:

> mydata=read.table(file.choose(),header=TRUE)

Output:

> fit=lm(y~x1+x2,data=mydata)
> fit

Call:
lm(formula = y ~ x1 + x2, data = mydata)

Coefficients:
(Intercept)           x1           x2
 -0.0157231   -0.0002281    0.0020058

> summary(fit)

Call:
lm(formula = y ~ x1 + x2, data = mydata)

Residuals:
      Min        1Q    Median        3Q       Max
-0.016558 -0.007619 -0.002437  0.008203  0.028162

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.572e-02  5.460e-03  -2.880  0.00694 **
x1          -2.281e-04  3.248e-05  -7.021 4.95e-08 ***
x2           2.006e-03  2.273e-04   8.823 3.38e-10 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01145 on 33 degrees of freedom
Multiple R-squared: 0.7071, Adjusted R-squared: 0.6894
F-statistic: 39.84 on 2 and 33 DF, p-value: 1.586e-09
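
As a quick sanity check on the fitted equation, the printed coefficients can be plugged in by hand. The sketch below is only an illustration: the helper name yhat_full is made up here, and the coefficients are the rounded values from the printout, so the result should agree with R's own predict() only to within rounding.

```r
# Rounded coefficients copied from the lm() printout above.
b0 <- -0.0157231   # intercept
b1 <- -0.0002281   # x1: amount of surfactant (mass %)
b2 <-  0.0020058   # x2: time (minutes)

# Hypothetical helper (not part of the original answer):
# fitted mean clathrate formation for given x1 and x2.
yhat_full <- function(x1, x2) b0 + b1 * x1 + b2 * x2

# Evaluate at x1 = 36, x2 = 36, the point used in part e).
yhat_full(36, 36)
```

The value should land very close to the fit of 0.04827713 that predict() reports for the same point in part e).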

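b) Test for significance of regression:

H0: beta1 = beta2 = 0 (neither regressor is linearly related to y)

H1: at least one of beta1, beta2 is not zero

Decision rule: reject H0 if the p-value of the overall F test is less than the level of significance alpha = 0.05.

From the summary output above, F = 39.84 on 2 and 33 degrees of freedom, with p-value 1.586e-09, which is far below 0.05, so we reject H0.

Therefore the model is significant: there is a linear relationship between y and at least one of x1 and x2.

c) t tests for the individual regressors:

From the same output, x1 has t = -7.021 (p-value 4.95e-08) and x2 has t = 8.823 (p-value 3.38e-10). Both p-values are far below 0.05, so each regressor contributes significantly to the model given that the other is already in it. The estimated coefficient of x1 is negative and that of x2 is positive.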
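
The overall significance test in part b) rests on the F statistic in the last line of summary(fit). A minimal sketch, using only the numbers printed there, of how that p-value can be recomputed and compared with alpha:

```r
# Values copied from the last line of summary(fit) above.
f_stat <- 39.84
df1 <- 2    # number of regressors (x1, x2)
df2 <- 33   # residual degrees of freedom

# Upper-tail area of the F distribution = p-value of the overall test.
p_value <- pf(f_stat, df1, df2, lower.tail = FALSE)
p_value

# Decision at the 5% level of significance.
alpha <- 0.05
reject_h0 <- p_value < alpha
reject_h0
```

Because the F statistic is rounded to two decimals in the printout, the recomputed p-value should match the reported 1.586e-09 only approximately.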

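d) R-squared for the full model (y on x1 and x2):

Multiple R-squared: 0.7071

R-squared for the simple linear regression of y on x2 (time) alone:

Multiple R-squared: 0.2696

Adding x1 raises R-squared from 0.2696 to 0.7071: time alone explains about 27% of the variation in clathrate formation, while the two regressors together explain about 71%.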

Output:

> f1=lm(y~x2,data=mydata)
> summary(f1)

Call:
lm(formula = y ~ x2, data = mydata)

Residuals:
      Min        1Q    Median        3Q       Max
-0.025451 -0.017085  0.003063  0.012700  0.040549

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0097496  0.0083909  -1.162  0.25335
x2           0.0009143  0.0002581   3.543  0.00117 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.01781 on 34 degrees of freedom
Multiple R-squared: 0.2696, Adjusted R-squared: 0.2481
F-statistic: 12.55 on 1 and 34 DF, p-value: 0.001174
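
One way to judge whether the jump in R-squared between the two models in part d) is more than noise is a partial F test built from the two R-squared values. This is an extra check that the original answer does not perform; the variable names are made up for this sketch, and the inputs are the rounded values from the two summaries.

```r
# R-squared values copied from the two summaries above.
r2_full   <- 0.7071   # y ~ x1 + x2
r2_simple <- 0.2696   # y ~ x2
df_resid_full <- 33   # residual df of the full model
extra_terms   <- 1    # x1 is the only term added

# Gain in explained variation from adding x1.
gain <- r2_full - r2_simple

# Partial F statistic for adding x1 to the model that already has x2.
partial_f <- (gain / extra_terms) / ((1 - r2_full) / df_resid_full)
p_partial <- pf(partial_f, extra_terms, df_resid_full, lower.tail = FALSE)

partial_f
p_partial
```

For a single added term, the partial F statistic equals the square of that term's t statistic, so partial_f should be close to (-7.021)^2 from the x1 row of the full-model summary, up to rounding.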

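e) 95% confidence intervals for the coefficient of time (x2):

The interval for a regression coefficient is estimate +/- t(0.975, residual df) x standard error, which confint(fit, "x2") and confint(f1, "x2") report directly. Using the estimates and standard errors in the two summaries above, this gives approximately (0.00154, 0.00247) for the full model (33 df) and (0.00039, 0.00144) for the simple model (34 df). The two intervals do not overlap: once x1 is in the model, the estimated effect of time is roughly twice as large and its standard error is slightly smaller.

The predict() calls below answer a different question. They give 95% confidence intervals for the mean response at x1 = 36, x2 = 36 (x1 is ignored by the simple model), not for the time coefficient.

Output:

> new=data.frame(x2=36,x1=36)
> predict(fit,new,interval="confidence")
         fit       lwr        upr
1 0.04827713 0.0398171 0.05673715

> predict(f1,new,interval="confidence")
         fit        lwr      upr
1 0.02316554 0.01645907 0.029872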
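
Part (e) asks for an interval on the coefficient of time rather than on a predicted mean. A hedged sketch of how to get it from the printed summaries alone follows; ci_from_summary is a name invented for this illustration, and because the inputs are rounded, the result may differ slightly in the last digits from what confint() would give on the actual data.

```r
# Hypothetical helper (not from the original answer): t-based confidence
# interval for a coefficient, given its estimate, standard error and
# the residual degrees of freedom of the model.
ci_from_summary <- function(estimate, se, df, level = 0.95) {
  tcrit <- qt(1 - (1 - level) / 2, df)
  c(lower = estimate - tcrit * se, upper = estimate + tcrit * se)
}

# x2 row of summary(fit): full model, 33 residual df.
ci_full <- ci_from_summary(2.006e-03, 2.273e-04, df = 33)

# x2 row of summary(f1): simple model, 34 residual df.
ci_simple <- ci_from_summary(0.0009143, 0.0002581, df = 34)

ci_full
ci_simple
```

With the data loaded, confint(fit, "x2") and confint(f1, "x2") are the direct route and avoid the rounding in the printed estimates.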