PLEASE use R and show in detail the codes. Problem # 2: Residential sales that o
ID: 3043427 • Letter: P
Question
PLEASE use R and show in detail the codes.
Problem # 2: Residential sales that occurred during the year 2005 were available from a city in the Midwest. Data on 50 arm's-length transactions include sales price (y, in thousands), finished square feet (x1, in thousands), number of bedrooms (x2), lot size (x3, in thousands), year built (x4, coded so that 2005 = 50), and distance from a popular highway (x5, in miles). The city tax assessor was interested in predicting sales price from the variable information given above. The data are not reproduced here; however, the S-Plus output is provided below.
Coefficients:
                 Value  Std. Error   t value  Pr(>|t|)
(Intercept)   314.1311    143.5370    2.1885    0.0340
x1              0.0291      0.0142    2.0496    0.0464
x2             16.7504      7.5896    2.2070    0.0326
x3              1.9283      3.1550    0.6112    0.5442
x4             -1.9159      2.4940   -0.7682    0.4465
x5             -5.1714      2.0729   -2.4948    0.0164
Residual standard error: 89.77 on 44 degrees of freedom
Multiple R-Squared: 0.7266
F-statistic: 23.390 on 5 and 44 degrees of freedom, the p-value is 0.0000
Correlation of Coefficients:
   (Intercept)       x1       x2       x3       x4
x1     -0.6466
x2     -0.5736   0.0299
x3     -0.4838   0.2660   0.3397
x4     -0.1646  -0.0746   0.1487   0.2737
x5     -0.1623  -0.0625   0.0732  -0.0571  -0.2684
(a) Determine the multiple regression equation for the residential sales data.
(b) Suppose you do not know the p-values, find the best predictor for Y.
(c) Interpret the coefficient of determination, R^2.
(d) Estimate the sale price of a house with 2,500 finished square feet, 4 bedrooms, and a 5,500 sq ft lot, built in 2000 and located 20 miles from the highway.
(e) Estimate the sale price of a house with 2,500 finished square feet, 4 bedrooms, and a 7,500 sq ft lot, built in 2000 and located 20 miles from the highway. Compare the results in (d) and (e).
(f) At the 5% significance level, does it appear that any of the predictor variables can be
removed from the full model as unnecessary?
(g) Obtain and interpret a 99% confidence interval for the coefficient β3.
(h) Test the hypothesis H0: β2 + β5 = 9 versus H1: β2 + β5 > 9. Use α = 0.02.
Explanation / Answer
Please note that this is an interpretation question: every part answered here can be read off the printed output, so no R code is strictly needed.
a) The regression equation is formed directly from the coefficient estimates (sales is in thousands, as y is defined):
sales = 314.1311 + 0.0291x1 + 16.7504x2 + 1.9283x3 - 1.9159x4 - 5.1714x5
c) The coefficient of determination is R^2 = 0.7266. This means that the model with x1 through x5 explains 72.66% of the variation in sales price; the remaining 27.34% is left unexplained.
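Although the interpretation itself needs no code, a short R sketch can confirm that the reported R-squared and F-statistic agree with each other. It assumes the standard identity F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) with n = 50 transactions and k = 5 predictors; the variable names are illustrative.

```r
# Values read from the S-Plus output
r2 <- 0.7266   # Multiple R-Squared
n  <- 50       # number of transactions
k  <- 5        # number of predictors

# Overall F-statistic implied by R-squared
f_stat <- (r2 / k) / ((1 - r2) / (n - k - 1))

# Upper-tail p-value on (k, n - k - 1) = (5, 44) degrees of freedom
p_val <- pf(f_stat, df1 = k, df2 = n - k - 1, lower.tail = FALSE)

print(round(f_stat, 3))   # compare with the reported 23.390
print(p_val)              # compare with the reported 0.0000
```

Any small gap from the printed F-statistic would be due to R-squared being rounded to four decimals in the output.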
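d)
Convert the inputs to the units of the model before substituting: x1 = 2.5 (2,500 sq ft, in thousands), x2 = 4, x3 = 5.5 (5,500 sq ft, in thousands), x4 = 45 (2005 is coded as 50, so 2000 is 45), x5 = 20.
sales = 314.1311 + 0.0291*2.5 + 16.7504*4 + 1.9283*5.5 - 1.9159*45 - 5.1714*20 = 202.17
The estimated sale price is about 202.17 thousand.
e)
Only the lot size changes, so x3 = 7.5 and the other values are as in (d):
sales = 314.1311 + 0.0291*2.5 + 16.7504*4 + 1.9283*7.5 - 1.9159*45 - 5.1714*20 = 206.02
The estimate in (e) is higher than in (d) by 2 * 1.9283 = 3.86 thousand: with the other variables held fixed, each additional thousand square feet of lot adds 1.9283 thousand to the predicted price.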
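The substitutions in (d) and (e) can be checked with a small R sketch. The helper `predict_price` is hypothetical (it is not part of the original output); it assumes x1 and x3 are entered in thousands of square feet, as the problem defines them, and that the year coding is linear, so x4 = year - 1955.

```r
# Coefficient estimates from the S-Plus output
b <- c("(Intercept)" = 314.1311, x1 = 0.0291, x2 = 16.7504,
       x3 = 1.9283, x4 = -1.9159, x5 = -5.1714)

# Hypothetical helper: takes raw inputs and converts them to model units
predict_price <- function(sqft, bedrooms, lot_sqft, year_built, miles) {
  x <- c(1,                   # intercept
         sqft / 1000,         # x1 in thousands of square feet
         bedrooms,            # x2
         lot_sqft / 1000,     # x3 in thousands of square feet
         year_built - 1955,   # x4, assuming 2005 = 50 means one unit per year
         miles)               # x5
  sum(b * x)
}

price_d <- predict_price(2500, 4, 5500, 2000, 20)
price_e <- predict_price(2500, 4, 7500, 2000, 20)

print(round(c(d = price_d, e = price_e, difference = price_e - price_d), 2))
```

The difference between the two predictions depends only on the x3 coefficient, since every other input is the same in both parts.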
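f)
Compare the p-value of each predictor in the output table with 0.05. Predictors with a p-value below 0.05 are significant; the others are candidates for removal.
x1 (p = 0.0464), x2 (p = 0.0326) and x5 (p = 0.0164) are significant. x3 (p = 0.5442) and x4 (p = 0.4465) are not significant, so each appears unnecessary given the other variables in the model.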
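As a hedged check of the decision in (f), the following R sketch rebuilds the t values and two-sided p-values from the printed estimates and standard errors, using the 44 residual degrees of freedom. It also prints the 5% critical value, so the same decision can be reached by comparing |t| with it when p-values are not available. The variable names are illustrative.

```r
# Estimates and standard errors from the S-Plus output
est <- c("(Intercept)" = 314.1311, x1 = 0.0291, x2 = 16.7504,
         x3 = 1.9283, x4 = -1.9159, x5 = -5.1714)
se  <- c(143.5370, 0.0142, 7.5896, 3.1550, 2.4940, 2.0729)
df  <- 44   # residual degrees of freedom

t_val  <- est / se                  # t statistic for H0: coefficient = 0
p_val  <- 2 * pt(-abs(t_val), df)   # two-sided p-value
t_crit <- qt(0.975, df)             # critical value for a 5% two-sided test

print(round(cbind(t = t_val, p = p_val), 4))
print(t_crit)

# Terms that are not significant at the 5% level
not_sig <- names(p_val)[p_val >= 0.05]
print(not_sig)
```

Small differences from the printed table are expected, because the estimates and standard errors are themselves rounded to four decimals.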
Please note that we can answer only 4 subparts of a question at a time, as per the answering guidelines.