In R console; require(faraway) summary(prostate) Can you stratify the responses
Question
In R console;
require(faraway)
summary(prostate)
Can you stratify the responses as a, b, and c to match the questions? Thank you for your time.
This is an R coding question from Julian J. Faraway's book Linear Models with R, using the prostate dataset in R: > require(faraway) > summary(prostate)
For the prostate data, fit a model with lpsa as the response and the other variables as predictors:
a) Compute and display a 95% joint confidence region for the parameters associated with age and lbph. Plot the origin on this display. The location of the origin on the display tells us the outcome of a certain hypothesis test. State that test and its outcome.
b) In the text, we made a permutation test corresponding to the F-test for the significance of all the predictors. Execute the permutation test corresponding to the t-test for age in this model. (Hint: summary(g)$coef[4,3] gets you the t-statistic you need if the model is called g.)
c) Remove all the predictors that are not significant at the 5% level. Test this model against the original model. Which model is preferred?
Explanation / Answer
> library(faraway)
> data(prostate)
> attach(prostate)
If x = lcavol and y=lpsa:
> fit1 = lm(lpsa ~ lcavol)
> fit2 = lm(lcavol ~ lpsa)
> plot(lcavol, lpsa)
> abline(fit1$coeff[1], fit1$coeff[2])
> abline(-fit2$coeff[1]/ fit2$coeff[2], 1/ fit2$coeff[2])
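The least-squares regression line always passes through the point of averages (mean of x, mean of y).
This holds for the regression of y on x and for the regression of x on y, so the two regression lines intersect at the point of averages.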
> mean(lcavol)
[1] 1.350
> mean(lpsa)
[1] 2.478
If x=lpsa and y= lcavol:
> fit1 = lm(lcavol ~ lpsa)
> fit2 = lm(lpsa ~ lcavol)
> plot(lpsa, lcavol)
> abline(fit1$coeff[1], fit1$coeff[2])
> abline(-fit2$coeff[1]/ fit2$coeff[2], 1/ fit2$coeff[2])
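The claim that both fitted lines pass through the point of averages can be checked numerically. The following is a minimal sketch, assuming the faraway package is installed; the object names (fit_y_on_x, fit_x_on_y, pred_y, pred_x) are illustrative and not part of the original answer.

```r
library(faraway)
data(prostate)

# Regress lpsa on lcavol, and lcavol on lpsa.
fit_y_on_x <- lm(lpsa ~ lcavol, data = prostate)
fit_x_on_y <- lm(lcavol ~ lpsa, data = prostate)

x_bar <- mean(prostate$lcavol)
y_bar <- mean(prostate$lpsa)

# Each fitted line, evaluated at the mean of its predictor,
# should return the mean of its response.
pred_y <- unname(predict(fit_y_on_x, newdata = data.frame(lcavol = x_bar)))
pred_x <- unname(predict(fit_x_on_y, newdata = data.frame(lpsa = y_bar)))

isTRUE(all.equal(pred_y, y_bar))
isTRUE(all.equal(pred_x, x_bar))
```

Passing data = prostate to lm() avoids relying on attach(), so the check works in a fresh session.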
> library(faraway)
> data(prostate)
> attach(prostate)
> fit1 = lm(lpsa~lcavol+lweight+age+lbph+svi+lcp+gleason+pgg45)
> summary(fit1)
Call:
lm(formula = lpsa ~ lcavol + lweight + age + lbph + svi + lcp + gleason + pgg45)
Residuals:
    Min      1Q  Median      3Q     Max
-1.7331 -0.3713 -0.0170  0.4141  1.6381
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.669337   1.296387   0.516  0.60693
lcavol       0.587022   0.087920   6.677 2.11e-09 ***
lweight      0.454467   0.170012   2.673  0.00896 **
age         -0.019637   0.011173  -1.758  0.08229 .
lbph         0.107054   0.058449   1.832  0.07040 .
svi          0.766157   0.244309   3.136  0.00233 **
lcp         -0.105474   0.091013  -1.159  0.24964
gleason      0.045142   0.157465   0.287  0.77503
pgg45        0.004525   0.004421   1.024  0.30886
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.7084 on 88 degrees of freedom
Multiple R-Squared: 0.6548, Adjusted R-squared: 0.6234
F-statistic: 20.86 on 8 and 88 DF, p-value: < 2.2e-16
The model has 8 predictors (lcavol, lweight, age, lbph, svi, lcp, gleason and pgg45), with lpsa as the response.
We can test two kinds of hypothesis:
i) Overall significance:
Here we test
H0: Bj = 0 for all j = 1, ..., 8 vs. H1: Bj not= 0 for at least one j
where Bj is the population slope for the jth predictor.
Take alpha = level of significance = 0.05.
Under H0 the test statistic follows an F-distribution with 8 and 88 degrees of freedom.
F = 20.86
P-value < 2.2e-16
P-value < alpha
Reject H0 at the 5% level of significance.
Conclusion: At least one of the slopes differs from 0.
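The overall F-statistic can also be obtained by comparing the full model against an intercept-only model with anova(). This is a minimal sketch, assuming the faraway package is installed and that prostate contains only lpsa and the 8 predictors listed above; the names full_model, null_model and overall_test are illustrative.

```r
library(faraway)
data(prostate)

# Full model: lpsa on all other columns of prostate.
full_model <- lm(lpsa ~ ., data = prostate)
# Null model: intercept only, i.e. all slopes equal to 0.
null_model <- lm(lpsa ~ 1, data = prostate)

# F-test of the null model against the full model.
overall_test <- anova(null_model, full_model)
overall_test

f_stat  <- overall_test$F[2]
p_value <- overall_test[["Pr(>F)"]][2]
```

The value of f_stat should agree with the F-statistic line of the summary() output.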
ii) Individual significance:
For each predictor we test
H0: B = 0 vs. H1: B not= 0
where B is the population slope for that predictor, given the other predictors in the model.
Under H0 the test statistic follows a t-distribution with 88 degrees of freedom.
From the output above, age, lbph, lcp, gleason and pgg45 are not significant at the 5% level, since each of their p-values is greater than 0.05.
These are the variables to drop from the model in part c).
lcavol, lweight and svi are significant, since their p-values are less than 0.05.
These are the variables to keep in the reduced model.
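Part c) also asks for the reduced model to be tested against the original one. A minimal sketch of that comparison, assuming the faraway package is installed (reduced_model and comparison are illustrative names):

```r
library(faraway)
data(prostate)

# Original model with all 8 predictors.
full_model <- lm(lpsa ~ ., data = prostate)
# Reduced model: only the predictors significant at the 5% level.
reduced_model <- lm(lpsa ~ lcavol + lweight + svi, data = prostate)

# F-test of H0: the 5 dropped coefficients are all 0.
comparison <- anova(reduced_model, full_model)
comparison
```

If the p-value in the last column exceeds 0.05, H0 is not rejected and the smaller model is preferred; otherwise the original model is preferred.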
> confint(fit1,"age",level=0.95)
          2.5 %      97.5 %
age -0.04184062 0.002566267
> confint(fit1,"age",level=0.90)
           5 %         95 %
age -0.0382102 -0.001064151
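Consider H0: B_age = 0 vs. H1: B_age not= 0.
The 95% CI for age covers 0, so do NOT reject H0 at alpha = 0.05.
The 90% CI for age does NOT cover 0, so reject H0 at alpha = 0.10.
Hence 0.05 < p-value < 0.10, which agrees with the p-value of 0.08229 for age in the summary output.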
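For part b), the t-test for age can be replaced by a permutation test. The sketch below permutes the age column, refits the model, and records the t-statistic for age each time. It assumes the faraway package is installed; the number of permutations (n_perm) and the seed are arbitrary choices, and the model is called g as in the hint.

```r
library(faraway)
data(prostate)

g <- lm(lpsa ~ ., data = prostate)
# Observed t-statistic for age; same entry as summary(g)$coef[4, 3].
t_obs <- summary(g)$coef["age", "t value"]

set.seed(1)      # arbitrary seed, for reproducibility only
n_perm <- 2000   # arbitrary number of permutations
t_perm <- numeric(n_perm)

for (i in seq_len(n_perm)) {
  shuffled <- prostate
  # Shuffling age breaks any link between age and lpsa.
  shuffled$age <- sample(prostate$age)
  g_perm <- lm(lpsa ~ ., data = shuffled)
  t_perm[i] <- summary(g_perm)$coef["age", "t value"]
}

# Two-sided permutation p-value: share of permuted |t| at least as large as observed.
p_perm <- mean(abs(t_perm) >= abs(t_obs))
p_perm
```

The resulting p_perm can be compared with the normal-theory p-value for age in the summary output.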
a) Compute and display a 95% joint confidence region for the parameters associated with age and lbph. Plot the origin on this display. The location of the origin on the display tells us the outcome of a certain hypothesis test. State that test and its outcome.
> library(ellipse)
> plot(ellipse(fit1,c(4,5),level=0.95),type="l")
> points(fit1$coeff[4],fit1$coeff[5])
> points(0,0,pch=3)
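The origin lies inside the 95% joint confidence ellipse exactly when the F-test of H0: B_age = B_lbph = 0 is not rejected at the 5% level. The sketch below checks this numerically instead of reading it off the plot. It assumes the faraway package is installed; all object names are illustrative.

```r
library(faraway)
data(prostate)

full_model <- lm(lpsa ~ ., data = prostate)

keep <- c("age", "lbph")
b <- coef(full_model)[keep]        # estimates for age and lbph
V <- vcov(full_model)[keep, keep]  # their estimated covariance matrix

# F-statistic for the origin: scaled squared distance from the estimates to (0, 0).
f_origin <- drop(t(b) %*% solve(V) %*% b) / 2
f_crit   <- qf(0.95, df1 = 2, df2 = df.residual(full_model))

# TRUE means the origin is inside the 95% ellipse (H0 not rejected).
origin_inside <- f_origin <= f_crit
origin_inside

# Cross-check: the same F-statistic from comparing nested models.
no_age_lbph <- lm(lpsa ~ . - age - lbph, data = prostate)
joint_test  <- anova(no_age_lbph, full_model)
joint_test
```

If the origin falls outside the default plotting range, passing xlim and ylim values that include 0 to plot() makes it visible.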