

Question

1. The dataset prostate (in R package "faraway") is from a study on 97 men with prostate cancer who were due to receive a radical prostatectomy. Fit a model with lpsa (y) as the response variable and lcavol (x) as the predictor and answer the following questions:
• Construct the analysis of variance (ANOVA) table and test for significance of regression;

• Test the hypotheses H0 : β1 = 0.5 vs H1 : β1 > 0.5;

• Calculate the coefficient of determination (R²);

• Find a 95% confidence interval on the slope;

• Find a 95% confidence interval on the mean lpsa when lcavol = 1.35;

• Calculate and plot the 95% confidence and prediction bands.

Explanation / Answer

Please see the R snippet below.

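library("faraway")

# load the prostate data shipped with the faraway package
data <- prostate

# fit the simple linear regression with lpsa as the response and lcavol as the predictor
fit <- lm(lpsa ~ lcavol, data = data)

# model summary: coefficient estimates, standard errors, R-squared and overall F test
summary(fit)

# 95% confidence interval for the slope
confint(fit, 'lcavol', level = 0.95)

# analysis of variance table for the regression
fit.anova <- aov(lpsa ~ lcavol, data = data)
summary(fit.anova)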

The results are as follows

> summary(fit)

Call:
lm(formula = lpsa ~ lcavol, data = data)

Residuals:
     Min       1Q   Median       3Q      Max
-1.67625 -0.41648  0.09859  0.50709  1.89673

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  1.50730    0.12194   12.36   <2e-16 ***
lcavol       0.71932    0.06819   10.55   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7875 on 95 degrees of freedom
Multiple R-squared: 0.5394,   Adjusted R-squared: 0.5346 # answer to question 3
F-statistic: 111.3 on 1 and 95 DF, p-value: < 2.2e-16

# answer to question 2

H0 : B1 = 0.5

H1 : B1 > 0.5

The t value and p value printed for lcavol test B1 = 0, not B1 = 0.5, so they cannot be used directly for this test. The appropriate statistic is t = (0.71932 - 0.5) / 0.06819 = 3.22 on 95 degrees of freedom. The one-sided 5% critical value is about 1.66, so we reject the null hypothesis in favor of the alternative and conclude B1 > 0.5.
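
The test of H0 : B1 = 0.5 against H1 : B1 > 0.5 is not printed by summary(), which tests each coefficient against zero. A minimal sketch of computing it from the fitted model is given here; the object names (coefs, t_stat, p_value, t_crit) are illustrative and not part of the original answer.

```r
library(faraway)  # provides the prostate data

fit <- lm(lpsa ~ lcavol, data = prostate)

# estimate and standard error of the slope from the coefficient table
coefs <- summary(fit)$coefficients
b1    <- coefs["lcavol", "Estimate"]
se_b1 <- coefs["lcavol", "Std. Error"]

# t statistic for H0: B1 = 0.5 (not the default H0: B1 = 0)
t_stat <- (b1 - 0.5) / se_b1

# one-sided (upper tail) p value and 5% critical value on the residual df
p_value <- pt(t_stat, df = df.residual(fit), lower.tail = FALSE)
t_crit  <- qt(0.95, df = df.residual(fit))

print(c(t = t_stat, critical = t_crit, p = p_value))
```

Reject H0 at the 5% level when t_stat exceeds t_crit, or equivalently when p_value is below 0.05.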

>
> # the confidence interval of the slope is
>
> confint(fit, 'lcavol', level=0.95) # answer to question 4
           2.5 %    97.5 %
lcavol 0.5839402 0.8547001
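
As a check on the confint() result, the same interval can be rebuilt by hand as estimate ± t(0.975, 95) × standard error. The sketch below simply plugs in the rounded values from the coefficient table, so it should agree with confint() only up to rounding; the variable names are illustrative.

```r
# rounded values copied from the summary(fit) coefficient table
b1       <- 0.71932  # slope estimate for lcavol
se_b1    <- 0.06819  # its standard error
df_resid <- 95       # residual degrees of freedom

# two-sided 95% interval uses the 0.975 quantile of the t distribution
t_crit   <- qt(0.975, df = df_resid)
slope_ci <- b1 + c(-1, 1) * t_crit * se_b1

print(round(slope_ci, 4))
```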

>
> fit.anova<- aov(lpsa~lcavol, data = data)
> summary(fit.anova) ## answer to question 1
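            Df Sum Sq Mean Sq F value Pr(>F)
lcavol       1  69.00   69.00   111.3 <2e-16 ***
Residuals   95  58.91    0.62
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Since F = 111.3 on 1 and 95 degrees of freedom has a p value below 2e-16, the regression is significant.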
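
The ANOVA table and the R-squared reported by summary(fit) are linked: R-squared is the regression sum of squares divided by the total sum of squares, and F is the ratio of the two mean squares. A short sketch of that arithmetic, using anova() on the same model (the names tab, ss_reg, ss_res, r_squared and f_stat are illustrative):

```r
library(faraway)  # provides the prostate data

fit <- lm(lpsa ~ lcavol, data = prostate)
tab <- anova(fit)  # same decomposition as summary(aov(...))

ss_reg <- tab["lcavol", "Sum Sq"]     # regression sum of squares
ss_res <- tab["Residuals", "Sum Sq"]  # residual sum of squares

# coefficient of determination: share of total variation explained
r_squared <- ss_reg / (ss_reg + ss_res)

# F statistic: regression mean square over residual mean square
f_stat <- (ss_reg / tab["lcavol", "Df"]) / (ss_res / tab["Residuals", "Df"])

print(c(R2 = r_squared, F = f_stat))
```

With the rounded table entries this is 69.00 / (69.00 + 58.91), about 0.539, in line with the Multiple R-squared shown in the model summary.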

Hope this helps !!

Please note that we can answer only 4 parts of a question at a time, as per the answering guidelines.
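
For the two remaining parts (the interval for the mean lpsa at lcavol = 1.35, and the confidence and prediction bands), one possible approach is predict() on the fitted model. This is a sketch rather than part of the original answer: the names new_obs, grid, conf_band and pred_band are illustrative, and the bands drawn are pointwise 95% intervals, not simultaneous ones.

```r
library(faraway)  # provides the prostate data

fit <- lm(lpsa ~ lcavol, data = prostate)

# Part 5: 95% confidence interval for the mean lpsa at lcavol = 1.35
new_obs <- data.frame(lcavol = 1.35)
mean_ci <- predict(fit, newdata = new_obs, interval = "confidence", level = 0.95)
print(mean_ci)

# Part 6: pointwise 95% bands over the observed range of lcavol
grid <- data.frame(lcavol = seq(min(prostate$lcavol), max(prostate$lcavol),
                                length.out = 100))
conf_band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)
pred_band <- predict(fit, newdata = grid, interval = "prediction", level = 0.95)

# scatter plot with the fitted line and both bands
plot(lpsa ~ lcavol, data = prostate, xlab = "lcavol", ylab = "lpsa")
lines(grid$lcavol, conf_band[, "fit"])
matlines(grid$lcavol, conf_band[, c("lwr", "upr")], lty = 2, col = "blue")
matlines(grid$lcavol, pred_band[, c("lwr", "upr")], lty = 3, col = "red")
legend("topleft",
       legend = c("fitted line", "95% confidence band", "95% prediction band"),
       lty = c(1, 2, 3), col = c("black", "blue", "red"))
```

The prediction band is wider than the confidence band at every lcavol value because it also accounts for the variability of an individual observation around the mean.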