Use the computer to simulate 100 data points from a normal distribution with mea

ID: 3065229 • Letter: U

Question

Use the computer to simulate 100 data points from a normal distribution with mean 0 and variance 1. Store the results in a column called Y. Repeat this process 10 more times, storing the results in X1, X2, ..., X10. Notice that Y should be totally unrelated to the explanatory variables. (a) Fit the regression of Y on all 10 explanatory variables. What is R²? (b) What model is suggested by forward selection? (c) Which model has the smallest Cp statistic? (d) Which model has the smallest BIC? (e) What danger (if any) is there in using a variable selection technique when the number of explanatory variables is a substantial proportion of the sample size?

Explanation / Answer

Ans: The answers below were obtained using R.

Following the problem statement, we generate the data set as follows:

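Y <- rnorm(100, 0, 1)               # response: 100 draws from N(0, 1)
X <- matrix(rnorm(1000), 100, 10)   # 10 predictors, generated independently of Y
X1 <- as.data.frame(X)              # predictors only; columns are named V1, ..., V10
data <- data.frame(X, Y)            # predictors and response together (columns X1, ..., X10, Y)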

(a) Fitting the regression model to the data

# X1 supplies V1, ..., V10; Y is not a column of X1, so it is taken from the workspace
model <- lm(Y ~ ., data = X1)
summary(model)

This gives the following summary:

Call:
lm(formula = Y ~ ., data = X1)

Residuals:
    Min      1Q  Median      3Q     Max
-3.1810 -0.7501  0.0168  0.7335  2.2578

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0009599  0.1160404  -0.008   0.9934
V1           0.0938381  0.1201722   0.781   0.4370
V2           0.0931644  0.1230150   0.757   0.4508
V3           0.0436138  0.1181784   0.369   0.7130
V4           0.0203657  0.1175300   0.173   0.8628
V5          -0.0606241  0.1202659  -0.504   0.6154
V6           0.1439136  0.1139495   1.263   0.2099
V7          -0.1617294  0.1138754  -1.420   0.1590
V8          -0.0749428  0.1205066  -0.622   0.5356
V9           0.0488038  0.1176421   0.415   0.6793
V10         -0.2142865  0.1275966  -1.679   0.0966 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.117 on 89 degrees of freedom
Multiple R-squared: 0.07597, Adjusted R-squared: -0.02785
F-statistic: 0.7318 on 10 and 89 DF, p-value: 0.6927

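R² = 0.07597. The adjusted R² is negative (-0.02785) and the overall F-test has p-value 0.6927, which is consistent with Y being unrelated to the explanatory variables.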
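As a rough check on the size of this R²: when Y is pure normal noise and there are k = 10 predictors with n = 100 observations, the expected R² is k/(n - 1) = 10/99, about 0.101, so a value near 0.08 is roughly what chance alone produces. A minimal simulation sketch of this follows; the seed and the number of replications (n_rep) are arbitrary choices for illustration, not part of the original answer.

```r
# Sketch: distribution of R^2 when Y is unrelated to 10 predictors.
# The seed and n_rep are arbitrary, illustrative choices.
set.seed(1)
n <- 100; k <- 10; n_rep <- 500

r2 <- replicate(n_rep, {
  # Y plus k independent noise predictors (columns X1, ..., X10)
  d <- data.frame(Y = rnorm(n), matrix(rnorm(n * k), n, k))
  summary(lm(Y ~ ., data = d))$r.squared
})

mean(r2)      # average R^2 over the simulated data sets
k / (n - 1)   # theoretical mean of R^2 when Y is unrelated to the predictors
```

The two printed numbers should be close, which shows that a nonzero R² by itself says little when many predictors are fitted.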

(b) Model selection

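library(leaps)  # provides regsubsets()
# nvmax defaults to 8, so only models with up to 8 predictors are reported
b <- regsubsets(Y ~ ., data = X1, method = "forward"); b
rs <- summary(b); rs
rs$which                          # variables included at each model size
rs$cp; rs$adjr2; rs$bic; rs$rsq   # Cp, adjusted R-squared, BIC, R-squared by model size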

The forward selection algorithm gives the following result:

Selection Algorithm: forward
         V1  V2  V3  V4  V5  V6  V7  V8  V9  V10
1  ( 1 ) " " " " " " " " " " " " " " " " " " "*"
2  ( 1 ) " " " " " " " " " " " " "*" " " " " "*"
3  ( 1 ) " " " " " " " " " " "*" "*" " " " " "*"
4  ( 1 ) "*" " " " " " " " " "*" "*" " " " " "*"
5  ( 1 ) "*" "*" " " " " " " "*" "*" " " " " "*"
6  ( 1 ) "*" "*" " " " " " " "*" "*" "*" " " "*"
7  ( 1 ) "*" "*" " " " " "*" "*" "*" "*" " " "*"
8  ( 1 ) "*" "*" " " " " "*" "*" "*" "*" "*" "*"

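Forward selection adds the variables in the order V10, V7, V6, V1, V2, V8, V5, V9. The last row, with variables [1,2,5,6,7,8,9,10], is simply the largest model reported, because regsubsets stops at 8 predictors by default (nvmax = 8); it is not by itself the suggested model. Which size to keep along this path depends on a criterion such as Cp or BIC.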
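The path in the table can be reproduced by hand: at each step, add the remaining variable that reduces the residual sum of squares the most. The following is a minimal base-R sketch of that idea, not the leaps implementation. It uses freshly simulated data with an arbitrary seed, so the order it prints will differ from the table above, and the variable names (X1, ..., X10, selected, candidates) are illustrative.

```r
# Sketch of forward selection by residual sum of squares (base R only).
# Simulated data and seed are illustrative, not the data from the answer above.
set.seed(1)
n <- 100; k <- 10
d <- data.frame(Y = rnorm(n), matrix(rnorm(n * k), n, k))  # columns Y, X1..X10

candidates <- setdiff(names(d), "Y")  # variables not yet in the model
selected <- character(0)              # variables in the model, in order of entry

while (length(candidates) > 0) {
  # RSS of the current model plus each remaining candidate in turn
  rss <- sapply(candidates, function(v) {
    f <- reformulate(c(selected, v), response = "Y")
    deviance(lm(f, data = d))
  })
  best <- names(which.min(rss))       # candidate giving the smallest RSS
  selected <- c(selected, best)
  candidates <- setdiff(candidates, best)
}

selected  # order in which the variables enter the model
```

Note that the loop always adds a variable at every step, even though none is related to Y; the procedure needs a separate stopping rule to decide where along the path to stop.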

(c) Smallest Cp

The smallest Cp is -1.5096560, attained by the model containing only the intercept and X10 (V10 in the output above).
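For reference, Mallows' Cp for a submodel with p coefficients (counting the intercept) is RSS_p / s² - n + 2p, where s² is the residual variance of the full model. A minimal sketch of computing it by hand is below. It assumes freshly simulated data with an arbitrary seed, so its values will not match -1.5096560, and the helper name mallows_cp is hypothetical.

```r
# Sketch: Mallows' Cp computed by hand from lm fits (base R only).
# Simulated data and seed are illustrative; mallows_cp is a hypothetical helper.
set.seed(1)
n <- 100; k <- 10
d <- data.frame(Y = rnorm(n), matrix(rnorm(n * k), n, k))  # columns Y, X1..X10

full <- lm(Y ~ ., data = d)
sigma2_full <- deviance(full) / df.residual(full)  # residual variance of the full model

mallows_cp <- function(fit) {
  p <- length(coef(fit))                # number of coefficients, including the intercept
  deviance(fit) / sigma2_full - n + 2 * p
}

mallows_cp(lm(Y ~ X10, data = d))  # Cp of a one-predictor model
mallows_cp(full)                   # for the full model Cp equals p, here 11
```

Because s² comes from the full model, small submodels of pure-noise data often have Cp below p, and negative values such as the one reported above are possible.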

(d) Smallest BIC

The smallest BIC is 7.295025, attained by the model containing only the intercept and X10 (V10 in the output above).
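BIC trades fit against size: for a linear model it is n log(RSS/n) plus log(n) times the number of estimated parameters, plus a constant that is the same for every model fitted to the same data. A minimal sketch using base R's BIC() on a few lm fits is below. It assumes freshly simulated data with an arbitrary seed, and BIC() may be on a different scale from the rs$bic values of leaps, so the numbers are not expected to match 7.295025.

```r
# Sketch: comparing models by BIC with base R (simulated, illustrative data).
set.seed(1)
n <- 100; k <- 10
d <- data.frame(Y = rnorm(n), matrix(rnorm(n * k), n, k))  # columns Y, X1..X10

fits <- list(
  null = lm(Y ~ 1,   data = d),  # intercept only
  one  = lm(Y ~ X10, data = d),  # intercept and one predictor
  full = lm(Y ~ .,   data = d)   # all 10 predictors
)

bic <- sapply(fits, BIC)
bic                          # smaller is better
bic["full"] - bic["null"]    # cost of adding 10 noise predictors
```

With n = 100, each extra parameter costs log(100), about 4.6, so ten noise predictors are penalised far more than the small drop in RSS they buy.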

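(e) In this problem the explanatory variables are generated independently of Y, so none of them has any real relationship with the response. The danger is that when the number of candidate variables is a substantial proportion of the sample size, a selection procedure searches over many models and tends to find variables that appear related to Y purely by chance. The selected model then overfits: its R², t-statistics and p-values overstate the evidence because they do not account for the search. Here, for example, V10 enters first in forward selection and has a p-value of 0.0966 in the full model, even though Y is unrelated to it.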
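The size of this effect can be illustrated by simulation: generate pure-noise data many times and record how often the best single predictor looks significant at the 5% level. This is a minimal sketch with an arbitrary seed and replication count (n_rep); if the 10 tests were independent, the rate would be 1 - 0.95^10, about 0.40.

```r
# Sketch: how often the "best" of 10 noise predictors looks significant.
# The seed and n_rep are arbitrary, illustrative choices.
set.seed(1)
n <- 100; k <- 10; n_rep <- 500

min_p <- replicate(n_rep, {
  d <- data.frame(Y = rnorm(n), matrix(rnorm(n * k), n, k))  # columns Y, X1..X10
  # p-value of the slope in each one-predictor regression
  p_vals <- sapply(paste0("X", 1:k), function(v) {
    fit <- lm(reformulate(v, response = "Y"), data = d)
    summary(fit)$coefficients[2, 4]
  })
  min(p_vals)  # p-value of the predictor a selection step would pick first
})

mean(min_p < 0.05)  # share of noise data sets whose best predictor has p < 0.05
1 - 0.95^k          # rate expected if the k tests were independent
```

A nominal 5% test applied after picking the best of 10 candidates therefore rejects far more often than 5% of the time.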
