Use the R programming software to answer the question. The prostate dataset is f
Question
Use the R programming software to answer the question. The prostate dataset is found in the downloadable 'faraway' package for R. You must download the package to access the data. Then please answer the question and please include the code used to find the answers! Please do not respond saying 'data not found'; I stated that it is found in the downloadable faraway package for R.
Thanks in advance!
3. Use the prostate data with lpsa as the response and the other variables as predictors. Implement the following variable selection criteria to determine the "best" model using a forward selection procedure: AIC, BIC, adjusted R², Mallows' Cp.
Explanation / Answer
Note: since an R script can't be uploaded, I have copy-pasted the code and the respective outputs.
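#subset selection using Mallows' Cp (note: leaps() searches over all subsets and reports the best ones per size; it is not a stepwise forward search)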
library(faraway)
## Warning: package 'faraway' was built under R version 3.2.5
library(leaps)
## Warning: package 'leaps' was built under R version 3.2.5
leaps( x=prostate[,1:8], y=prostate[,9], names=names(prostate)[1:8], method="Cp")
## $which
## lcavol lweight age lbph svi lcp gleason pgg45
## 1 TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 1 FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
## 1 FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
## 1 FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## 1 FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
## 1 FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## 1 FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
## 1 FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
## 2 TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## 2 TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
## 2 TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
## 2 TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## 2 TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
## 2 TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
## 2 TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
## 2 FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE
## 2 FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
## 2 FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
## 3 TRUE TRUE FALSE FALSE TRUE FALSE FALSE FALSE
## 3 TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
## 3 TRUE TRUE FALSE FALSE FALSE FALSE FALSE TRUE
## 3 TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
## 3 TRUE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
## 3 TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
## 3 TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
## 3 TRUE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
## 3 TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
## 3 TRUE FALSE FALSE FALSE TRUE TRUE FALSE FALSE
## 4 TRUE TRUE FALSE TRUE TRUE FALSE FALSE FALSE
## 4 TRUE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
## 4 TRUE TRUE TRUE FALSE TRUE FALSE FALSE FALSE
## 4 TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE
## 4 TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE
## 4 TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE
## 4 TRUE FALSE FALSE TRUE TRUE TRUE FALSE FALSE
## 4 TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE
## 4 TRUE FALSE FALSE TRUE TRUE FALSE TRUE FALSE
## 4 TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE
## 5 TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE
## 5 TRUE TRUE FALSE TRUE TRUE FALSE FALSE TRUE
## 5 TRUE TRUE FALSE TRUE TRUE FALSE TRUE FALSE
## 5 TRUE TRUE FALSE TRUE TRUE TRUE FALSE FALSE
## 5 TRUE TRUE TRUE FALSE TRUE FALSE FALSE TRUE
## 5 TRUE TRUE TRUE FALSE TRUE FALSE TRUE FALSE
## 5 TRUE TRUE FALSE FALSE TRUE TRUE FALSE TRUE
## 5 TRUE TRUE FALSE FALSE TRUE TRUE TRUE FALSE
## 5 TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE
## 5 TRUE TRUE TRUE FALSE TRUE TRUE FALSE FALSE
## 6 TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE
## 6 TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE
## 6 TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE
## 6 TRUE TRUE FALSE TRUE TRUE TRUE FALSE TRUE
## 6 TRUE TRUE TRUE FALSE TRUE TRUE FALSE TRUE
## 6 TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE
## 6 TRUE TRUE FALSE TRUE TRUE FALSE TRUE TRUE
## 6 TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE
## 6 TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE
## 6 TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE
## 7 TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE
## 7 TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
## 7 TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE
## 7 TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE
## 7 TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
## 7 TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE
## 7 TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE
## 7 FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## 8 TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
##
## $label
## [1] "(Intercept)" "lcavol" "lweight" "age" "lbph"
## [6] "svi" "lcp" "gleason" "pgg45"
##
## $size
## [1] 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5
## [36] 5 5 5 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9
##
## $Cp
## [1] 24.394559 80.172023 85.118725 116.430859 127.187098 129.926898
## [7] 153.649788 154.559673 14.541475 15.958255 19.887125 23.011181
## [13] 25.087007 25.566479 26.389504 60.098921 64.083099 68.774715
## [19] 6.216935 9.208478 12.658030 14.873508 14.985154 15.196243
## [25] 15.704613 17.098045 17.664662 17.671625 5.626422 7.074224
## [31] 7.414893 7.441786 8.089176 10.141420 10.822624 10.859249
## [37] 11.128887 12.730876 5.715016 6.922392 7.202880 7.422583
## [43] 7.682832 8.157323 8.235184 8.987779 9.038131 9.271914
## [49] 6.401965 6.806372 7.457430 8.104089 8.480237 8.724817
## [55] 8.912378 9.519557 9.585948 10.190270 7.082184 8.047624
## [61] 8.343017 10.089156 10.354661 14.145690 16.834595 51.578982
## [67] 9.000000
The first part of the output, denoted $which, lists 67 candidate sub-models, one per row (up to 10 per model size, the default for leaps()). The first column indicates the number of predictors in the sub-model for each row. The variables in each sub-model are those designated TRUE in each row.
The next two parts, $label and $size, give the variable names and the number of parameters p (including the intercept) for each sub-model. The last part, designated $Cp, gives the value of the Mallows' Cp criterion for each sub-model, in the same order as the rows of $which. A good sub-model has a small Cp that is close to (or below) p. For the full model, we always have Cp = p (here 9). The idea is to find a suitable reduced model, if possible. The smallest model with a low Cp is the 19th entry (lcavol, lweight, svi), for which Cp = 6.216935 and p = 4. The minimum Cp overall is 5.626422, at the 29th entry (lcavol, lweight, lbph, svi) with p = 5, so judged by the smallest Cp that four-predictor model is the best reduced model.
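For reference, this is the usual textbook definition of Mallows' Cp. It is given here as an assumption, since the answer does not state the formula; n is the number of observations, p the number of parameters including the intercept, and the error variance is estimated from the full model. It also shows why the full model always gives Cp = p:

```latex
C_p = \frac{\mathrm{RSS}_p}{\hat\sigma^2} - n + 2p,
\qquad
\hat\sigma^2 = \frac{\mathrm{RSS}_{\mathrm{full}}}{n - p_{\mathrm{full}}}

% Full model: RSS_full / \hat\sigma^2 = n - p_full, hence
C_{p_{\mathrm{full}}} = (n - p_{\mathrm{full}}) - n + 2\,p_{\mathrm{full}} = p_{\mathrm{full}}
```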
#subset selection using adjusted R square
leaps( x=prostate[,1:8], y=prostate[,9], names=names(prostate)[1:8], method="adjr2")
## $which
## lcavol lweight age lbph svi lcp gleason pgg45
## 1 TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## 1 FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
## 1 FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
## 1 FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## 1 FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
## 1 FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## 1 FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
## 1 FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
## 2 TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
## 2 TRUE FALSE FALSE FALSE TRUE FALSE FALSE FALSE
## 2 TRUE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
## 2 TRUE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
## 2 TRUE FALSE FALSE FALSE FALSE TRUE FALSE FALSE
## 2 TRUE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
## 2 TRUE FALSE TRUE FALSE FALSE FALSE FALSE FALSE
## 2 FALSE TRUE FALSE FALSE TRUE FALSE FALSE FALSE
## 2 FALSE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
## 2 FALSE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
## 3 TRUE TRUE FALSE FALSE TRUE FALSE FALSE FALSE
## 3 TRUE FALSE FALSE TRUE TRUE FALSE FALSE FALSE
## 3 TRUE TRUE FALSE FALSE FALSE FALSE FALSE TRUE
## 3 TRUE TRUE FALSE FALSE FALSE TRUE FALSE FALSE
## 3 TRUE TRUE FALSE FALSE FALSE FALSE TRUE FALSE
## 3 TRUE TRUE FALSE TRUE FALSE FALSE FALSE FALSE
## 3 TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
## 3 TRUE FALSE FALSE FALSE TRUE FALSE FALSE TRUE
## 3 TRUE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
## 3 TRUE FALSE FALSE FALSE TRUE TRUE FALSE FALSE
## 4 TRUE TRUE FALSE TRUE TRUE FALSE FALSE FALSE
## 4 TRUE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
## 4 TRUE TRUE TRUE FALSE TRUE FALSE FALSE FALSE
## 4 TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE
## 4 TRUE TRUE FALSE FALSE TRUE TRUE FALSE FALSE
## 4 TRUE FALSE TRUE TRUE TRUE FALSE FALSE FALSE
## 4 TRUE FALSE FALSE TRUE TRUE TRUE FALSE FALSE
## 4 TRUE FALSE FALSE TRUE TRUE FALSE FALSE TRUE
## 4 TRUE FALSE FALSE TRUE TRUE FALSE TRUE FALSE
## 4 TRUE TRUE TRUE FALSE FALSE FALSE FALSE TRUE
## 5 TRUE TRUE TRUE TRUE TRUE FALSE FALSE FALSE
## 5 TRUE TRUE FALSE TRUE TRUE FALSE FALSE TRUE
## 5 TRUE TRUE FALSE TRUE TRUE FALSE TRUE FALSE
## 5 TRUE TRUE FALSE TRUE TRUE TRUE FALSE FALSE
## 5 TRUE TRUE TRUE FALSE TRUE FALSE FALSE TRUE
## 5 TRUE TRUE TRUE FALSE TRUE FALSE TRUE FALSE
## 5 TRUE TRUE FALSE FALSE TRUE TRUE FALSE TRUE
## 5 TRUE TRUE FALSE FALSE TRUE TRUE TRUE FALSE
## 5 TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE
## 5 TRUE TRUE TRUE FALSE TRUE TRUE FALSE FALSE
## 6 TRUE TRUE TRUE TRUE TRUE FALSE FALSE TRUE
## 6 TRUE TRUE TRUE TRUE TRUE FALSE TRUE FALSE
## 6 TRUE TRUE TRUE TRUE TRUE TRUE FALSE FALSE
## 6 TRUE TRUE FALSE TRUE TRUE TRUE FALSE TRUE
## 6 TRUE TRUE TRUE FALSE TRUE TRUE FALSE TRUE
## 6 TRUE TRUE FALSE TRUE TRUE TRUE TRUE FALSE
## 6 TRUE TRUE FALSE TRUE TRUE FALSE TRUE TRUE
## 6 TRUE TRUE TRUE FALSE TRUE TRUE TRUE FALSE
## 6 TRUE TRUE TRUE FALSE TRUE FALSE TRUE TRUE
## 6 TRUE TRUE FALSE FALSE TRUE TRUE TRUE TRUE
## 7 TRUE TRUE TRUE TRUE TRUE TRUE FALSE TRUE
## 7 TRUE TRUE TRUE TRUE TRUE TRUE TRUE FALSE
## 7 TRUE TRUE TRUE TRUE TRUE FALSE TRUE TRUE
## 7 TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE
## 7 TRUE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
## 7 TRUE FALSE TRUE TRUE TRUE TRUE TRUE TRUE
## 7 TRUE TRUE TRUE TRUE FALSE TRUE TRUE TRUE
## 7 FALSE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## 8 TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
##
## $label
## [1] "(Intercept)" "lcavol" "lweight" "age" "lbph"
## [6] "svi" "lcp" "gleason" "pgg45"
##
## $size
## [1] 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5
## [36] 5 5 5 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 8 8 9
##
## $adjr2
## [1] 0.53458382 0.31345148 0.29384005 0.16970165 0.12705803 0.11619597
## [7] 0.02214547 0.01853819 0.57712461 0.57144796 0.55570607 0.54318885
## [13] 0.53487159 0.53295048 0.52965284 0.39458857 0.37862507 0.35982707
## [19] 0.61438994 0.60227477 0.58830476 0.57933250 0.57888035 0.57802549
## [25] 0.57596669 0.57032356 0.56802887 0.56800067 0.62080363 0.61487658
## [31] 0.61348194 0.61337185 0.61072155 0.60232002 0.59953129 0.59938135
## [37] 0.59827750 0.59171924 0.62454759 0.61955049 0.61838960 0.61748029
## [43] 0.61640317 0.61443934 0.61411708 0.61100224 0.61079384 0.60982626
## [49] 0.62587075 0.62417839 0.62145384 0.61874770 0.61717360 0.61615008
## [55] 0.61536518 0.61282425 0.61254642 0.61001745 0.62725212 0.62316656
## [61] 0.62191651 0.61452716 0.61340359 0.59736065 0.58598169 0.43894972
## [67] 0.62336809
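For adjusted R², the highest value indicates the best sub-model.
The maximum here is adjr2 = 0.62725212, the 59th entry, with p = 8 (all predictors except gleason). The value 0.62587075 is the 49th entry, the best sub-model with p = 7 (lcavol, lweight, age, lbph, svi, pgg45).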
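The adjusted R² values can be cross-checked by hand. This is a minimal sketch in base R, not part of the original answer. It assumes n = 97 observations in the prostate data and takes the RSS values from the step() output in the AIC part of this answer. The names `adj_r2`, `rss` and `tss` are illustrative only.

```r
# Adjusted R^2 = 1 - (RSS / (n - p)) / (TSS / (n - 1)),
# where p counts parameters including the intercept.
n   <- 97        # assumed number of rows in prostate
tss <- 127.918   # RSS of the intercept-only model lpsa ~ 1

adj_r2 <- function(rss, p) 1 - (rss / (n - p)) / (tss / (n - 1))

# RSS values as printed by step() for the nested forward-selection models,
# named by the variable added at each step
rss <- c(lcavol = 58.915, lweight = 52.966, svi = 47.785,
         lbph = 46.485, age = 45.526)
p <- 2:6   # parameters including the intercept

print(round(adj_r2(rss, p), 4))
```

Up to rounding of the printed RSS, these should agree with the $adjr2 entries for the same models (entries 1, 9, 19, 29 and 39).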
#forward selection using AIC (step() uses the AIC penalty k = 2 by default; BIC corresponds to k = log(n))
nothing<-lm(lpsa~1,data=prostate)
fullmode<-lm(lpsa~.,data=prostate)
forwards = step(nothing,scope=list(lower=formula(nothing),upper=formula(fullmode)),direction="forward")
## Start: AIC=28.84
## lpsa ~ 1
##
## Df Sum of Sq RSS AIC
## + lcavol 1 69.003 58.915 -44.366
## + svi 1 41.011 86.907 -6.658
## + lcp 1 38.528 89.389 -3.926
## + pgg45 1 22.814 105.103 11.783
## + gleason 1 17.416 110.501 16.641
## + lweight 1 16.041 111.876 17.840
## + lbph 1 4.136 123.782 27.650
## + age 1 3.679 124.238 28.007
## <none> 127.918 28.837
##
## Step: AIC=-44.37
## lpsa ~ lcavol
##
## Df Sum of Sq RSS AIC
## + lweight 1 5.9485 52.966 -52.690
## + svi 1 5.2375 53.677 -51.397
## + lbph 1 3.2658 55.649 -47.898
## + pgg45 1 1.6980 57.217 -45.203
## <none> 58.915 -44.366
## + lcp 1 0.6562 58.259 -43.453
## + gleason 1 0.4156 58.499 -43.053
## + age 1 0.0025 58.912 -42.370
##
## Step: AIC=-52.69
## lpsa ~ lcavol + lweight
##
## Df Sum of Sq RSS AIC
## + svi 1 5.1814 47.785 -60.676
## + pgg45 1 1.9489 51.017 -54.327
## <none> 52.966 -52.690
## + lcp 1 0.8371 52.129 -52.236
## + gleason 1 0.7810 52.185 -52.131
## + lbph 1 0.6751 52.291 -51.935
## + age 1 0.4200 52.546 -51.463
##
## Step: AIC=-60.68
## lpsa ~ lcavol + lweight + svi
##
## Df Sum of Sq RSS AIC
## + lbph 1 1.30006 46.485 -61.352
## <none> 47.785 -60.676
## + pgg45 1 0.57347 47.211 -59.847
## + age 1 0.40251 47.382 -59.497
## + gleason 1 0.38901 47.396 -59.469
## + lcp 1 0.06412 47.721 -58.806
##
## Step: AIC=-61.35
## lpsa ~ lcavol + lweight + svi + lbph
##
## Df Sum of Sq RSS AIC
## + age 1 0.95924 45.526 -61.374
## <none> 46.485 -61.352
## + pgg45 1 0.35332 46.131 -60.092
## + gleason 1 0.21256 46.272 -59.796
## + lcp 1 0.10230 46.383 -59.565
##
## Step: AIC=-61.37
## lpsa ~ lcavol + lweight + svi + lbph + age
##
## Df Sum of Sq RSS AIC
## <none> 45.526 -61.374
## + pgg45 1 0.65896 44.867 -60.789
## + gleason 1 0.45601 45.070 -60.351
## + lcp 1 0.12927 45.396 -59.650
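For AIC, the lowest value indicates the best sub-model. Forward selection stops at lpsa ~ lcavol + lweight + svi + lbph + age (AIC = -61.374), because adding any of the remaining variables would raise the AIC.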
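The step() call above uses the default penalty k = 2, so it reports AIC only. As a rough sketch of the BIC side, not part of the original answer, both criteria can be recomputed in base R from the RSS values printed above, using the scoring that step() applies to lm fits, n * log(RSS / n) + k * edf. It assumes n = 97 observations in prostate; the vector names are illustrative.

```r
# step() scores an lm fit as n * log(RSS / n) + k * edf,
# with k = 2 for AIC and k = log(n) for BIC.
n   <- 97   # assumed number of rows in prostate
rss <- c("1" = 127.918, "+ lcavol" = 58.915, "+ lweight" = 52.966,
         "+ svi" = 47.785, "+ lbph" = 46.485, "+ age" = 45.526)
edf <- seq_along(rss)   # parameters including the intercept: 1, 2, ..., 6

aic <- n * log(rss / n) + 2 * edf
bic <- n * log(rss / n) + log(n) * edf

print(round(cbind(edf, aic, bic), 2))
print(names(which.min(aic)))   # step with the smallest AIC on this path
print(names(which.min(bic)))   # step with the smallest BIC on this path

# With the data loaded, the BIC search itself would be requested as:
# step(nothing, scope = list(lower = formula(nothing), upper = formula(fullmode)),
#      direction = "forward", k = log(nrow(prostate)))
```

On these printed values the AIC column reproduces the step() trace, while BIC is smallest at lpsa ~ lcavol + lweight + svi. Adding lbph lowers n * log(RSS / n) by only about 2.7, which is less than the log(97) ≈ 4.57 penalty, so a forward search under BIC would be expected to stop one step earlier than under AIC.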