Suppose the following data represents Income (Y) and Age (X) for some working ag
Question
Suppose the following data represents Income (Y) and Age (X) for some working age people:
Use all diagnostic methods we have discussed to find a satisfactory regression model.
a) Write down the equation of the fitted regression of Income (Y) on age (X).
b) Calculate (show work) and interpret the R^2 for this model.
c) What is the predicted income of someone who is 30 years old? What is a 99% interval for this point estimate? (Yes, the language is explicit to tell what is required.)
d) Is this model “good”? Justify.
e) Consider a regression model using LN[Income] (Y) and Age (X). Would such a model be “better”? Use multiple competing metrics to justify a conclusion.
f) Using the LN[] model, what is the predicted income of someone who is 30 years old?
Need help with c through f
Y's: 44332, 53842, 44332, 47781, 26986, 50340, 53136, 22931, 48581, 50965, 29863, 35459, 44904, 49322, 49673, 47781, 39979, 48581, 45444, 50012, 43081, 48189, 47355, 36794, 50657
X's: 137, 64, 37, 44, 23, 51, 61, 18, 46, 53, 24, 27, 38, 48, 49, 44, 31, 46, 39, 50, 35, 45, 43, 28, 52
Explanation / Answer
You want answers to parts c through f.
Part c depends on the fitted model from part a.
We can fit it using MINITAB.
The Minitab output is:
Regression Analysis: y versus x
The regression equation is
y = 17344 + 655 x
Predictor     Coef  SE Coef      T      P
Constant     17344     2121   8.18  0.000
x           655.11    49.48  13.24  0.000
S = 2825.87   R-Sq = 88.4%   R-Sq(adj) = 87.9%
Analysis of Variance
Source          DF          SS          MS       F      P
Regression       1  1399689186  1399689186  175.28  0.000
Residual Error  23   183666844     7985515
Total           24  1583356030
The regression equation is
y = 17344 + 655 x
c) What is the predicted income of someone who is 30 years old?
That is, we have to find y when x = 30 years.
y = 17344 + 655*30 = 36994
The predicted income of someone who is 30 years old is 36994.
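What is a 99% interval for this point estimate?
The calculation below gives the 99% confidence interval for the slope B. Note that this is an interval for the slope, not for the predicted income at x = 30. An interval for the prediction itself uses the same critical value tc with the standard error of the prediction, S*sqrt(1 + 1/n + (x0 - xbar)^2/Sxx), where x0 = 30, xbar is the mean of X, and Sxx is the sum of (x - xbar)^2.
99% confidence interval for B is,
b - E < B < b + E
where b is the estimated slope for x, b = 655
E is the margin of error.
E = tc*SEb
where tc is the critical value of the t-distribution.
We can find tc using EXCEL.
The syntax is,
=TINV(probability, deg_freedom)
where probability = 1 - c = 0.01
c is the confidence level = 99% = 0.99
deg_freedom = n - 2 = 25 - 2 = 23
tc = 2.807
SEb = 49.48
E = 2.807*49.48 = 138.907 (using the unrounded tc = 2.80734)
lower limit = b - E = 655 - 138.907 = 516.093
upper limit = b + E = 655 + 138.907 = 793.907
The 99% confidence interval for the slope is (516.093, 793.907).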
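The interval arithmetic can be checked with a short Python sketch. It uses only the numbers printed in the Minitab output and the critical value tc = 2.807 from above. The variable names are illustrative. Recovering Sxx and the mean of X from the reported standard errors is an assumption made here because the output does not print them, so the prediction interval is approximate (the inputs are rounded).

```python
import math

# Values copied from the Minitab output for y versus x
n = 25
b0, b1 = 17344.0, 655.11        # intercept and slope
se_b0, se_b1 = 2121.0, 49.48    # their standard errors
s = 2825.87                     # residual standard deviation S
tc = 2.807                      # two-sided 99% t critical value, 23 df

# 99% confidence interval for the slope: b1 +/- tc * SE(b1)
e_slope = tc * se_b1
slope_ci = (b1 - e_slope, b1 + e_slope)

# Assumption: recover Sxx and x-bar from the standard errors, using
# SE(b1) = S / sqrt(Sxx) and SE(b0) = S * sqrt(1/n + xbar^2 / Sxx).
sxx = (s / se_b1) ** 2
xbar = math.sqrt(sxx * ((se_b0 / s) ** 2 - 1 / n))

# Point prediction and 99% prediction interval at x0 = 30
x0 = 30
y_hat = b0 + b1 * x0
se_pred = s * math.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / sxx)
pred_interval = (y_hat - tc * se_pred, y_hat + tc * se_pred)

print("slope CI:", slope_ci)
print("recovered x-bar:", xbar, "Sxx:", sxx)
print("prediction at 30:", y_hat)
print("99% prediction interval:", pred_interval)
```

The slope limits differ slightly from the hand calculation because the sketch uses b1 = 655.11 and tc = 2.807, not the rounded slope 655 and the unrounded critical value.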
d) Is this model “good”? Justify.
In the ANOVA table the test statistic is F = 175.28 with P-value = 0.000.
Assume alpha = 5% = 0.05.
P-value < alpha, so the overall regression is statistically significant.
R-Sq = 88.4%, so age explains about 88.4% of the variation in income.
On these measures the model is good.
e) Consider a regression model using LN[Income] (Y) and Age (X). Would such a model be “better”?
In this part take LN of Y only (X stays as it is) and fit the regression of LN(Y) on X.
We can also do this using MINITAB.
steps :
ENTER all the data in MINITAB sheet --> STAT --> Regression --> Regression --> Response : LN(Y) --> Predictors : X --> Results : select second option --> ok --> ok
Regression Analysis: LN(Y) versus x
The regression equation is
LN(Y) = 9.99 + 0.0168 x
Predictor      Coef   SE Coef       T      P
Constant    9.98818   0.07177  139.17  0.000
x          0.016772  0.001674   10.02  0.000
S = 0.0956093   R-Sq = 81.4%   R-Sq(adj) = 80.5%
Analysis of Variance
Source          DF       SS       MS       F      P
Regression       1  0.91749  0.91749  100.37  0.000
Residual Error  23  0.21025  0.00914
Total           24  1.12774
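This model is also statistically significant: overall P-value = 0.000 (F = 100.37).
It is not clearly better than the first model, however: R-Sq falls from 88.4% to 81.4%, R-Sq(adj) from 87.9% to 80.5%, and F from 175.28 to 100.37. S cannot be compared directly because the two responses are on different scales. On these metrics the LN model does not improve on the linear model.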
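The R-Sq, R-Sq(adj) and F values for both models can be reproduced from the sums of squares in the two ANOVA tables, which also shows the work asked for in part b. The sketch below is illustrative, and the function name anova_metrics is hypothetical. Keep in mind that R-Sq values computed on Y and on LN(Y) describe variation on different scales, so the comparison is only a rough guide.

```python
def anova_metrics(ss_reg, ss_err, df_reg, df_err):
    """Return (R-Sq, adjusted R-Sq, F) from ANOVA sums of squares."""
    ss_tot = ss_reg + ss_err
    df_tot = df_reg + df_err
    r_sq = ss_reg / ss_tot                               # SSR / SST
    r_sq_adj = 1 - (ss_err / df_err) / (ss_tot / df_tot)
    f_stat = (ss_reg / df_reg) / (ss_err / df_err)       # MSR / MSE
    return r_sq, r_sq_adj, f_stat

# Sums of squares copied from the two Minitab ANOVA tables
linear = anova_metrics(1399689186, 183666844, 1, 23)
log_model = anova_metrics(0.91749, 0.21025, 1, 23)

for name, (r_sq, r_sq_adj, f_stat) in [("Y on X", linear), ("LN(Y) on X", log_model)]:
    print(f"{name}: R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}, F = {f_stat:.2f}")
```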
f) Using the LN[] model, what is the predicted income of someone who is 30 years old?
When x = 30 years we first find ln(y), then convert it back to income.
The regression equation is
LN(Y) = 9.99 + 0.0168 x
LN(Y) = 9.99 + 0.0168*30 = 10.494
Income = exp(10.494), which is approximately 36,098.
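Because the question asks for income and not LN(Income), the fitted value has to be exponentiated. A minimal Python check follows, using both the rounded equation and the unrounded coefficients from the Minitab table. The variable names are illustrative.

```python
import math

x0 = 30

# Rounded equation as printed: LN(Y) = 9.99 + 0.0168 x
ln_y_rounded = 9.99 + 0.0168 * x0
income_rounded = math.exp(ln_y_rounded)

# Unrounded coefficients from the Predictor table
ln_y_precise = 9.98818 + 0.016772 * x0
income_precise = math.exp(ln_y_precise)

print("LN(Y) at 30:", ln_y_rounded, "-> income:", income_rounded)
print("with unrounded coefficients:", ln_y_precise, "-> income:", income_precise)
```

The two results differ by roughly a hundred because exponentiating magnifies the rounding in the coefficients, so the unrounded coefficients are the safer choice.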