Question
DATA SET : https://mega.nz/#!wNkg1abA!iEcJwN6Fno1c2Q4iSVUMbhf27R2AoIsnMWkdG27cvMw
12.127 (Refer to the CLOCKS.csv data set) Fit a first-order linear model that relates clock sale price (y) to the age of the clock, the number of bidders, and the interaction between age and number of bidders. Based on the results of the regression, conduct a residual analysis and comment on the residual assumptions (satisfied, violated, etc.):
a. Assumption of mean error 0?
b. Assumption of constant error variance?
c. Are there outliers present?
d. Assumption of normally distributed errors?
e. Is there any evidence of multicollinearity? (Look at the correlation matrix between the x
Explanation / Answer
We use R and Excel to solve this problem.
Here y is the clock sale price, and the independent variables (x's) are the age of the clock, the number of bidders, and the interaction between age and number of bidders.
We fit the linear model between y and x's and then calculate the residuals from the observed and fitted values of the dependent variable (y).
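Written out, the model and the residuals described above take roughly the following form. The labels x_1 (age) and x_2 (number of bidders) and the beta notation are assumptions for illustration; the original answer does not name them.

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \varepsilon_i, \\
\hat{y}_i &= \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \hat{\beta}_3 x_{1i} x_{2i}, \\
e_i &= y_i - \hat{y}_i .
\end{align*}
```

The residual assumptions in parts (a) to (d) are statements about the unobserved errors \varepsilon_i; the residuals e_i are what we can actually inspect.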
The R code is pasted below for reference, with comment lines explaining each step.
library(car)
# READING THE DATA
# Forward slashes are used because "\U" in an R string is an invalid escape sequence
data = read.csv("C:/Users/LAPTOP/Desktop/CLOCKS.csv")
names(data)
attach(data)
# FITTING THE LINEAR MODEL
model = lm(PRICE ~ AGE + NUMBIDS + AGE.BID, data)
summary(model)
# CALCULATING THE RESIDUALS
resid = PRICE - fitted.values(model)
# TEST FOR CHECKING MEAN ERROR = 0
t.test(resid,alternative = "two.sided",mu=0)
# TEST FOR CHECKING CONSTANT ERROR VARIANCE
ncvTest(model)
# BOXPLOT TO CHECK FOR OUTLIERS
boxplot(resid)
# TEST OF NORMALLY DISTRIBUTED ERRORS
shapiro.test(resid)
# TEST FOR MULTICOLLINEARITY
cor(cbind(AGE,NUMBIDS,AGE.BID))
Now, based on the R output, we answer the questions.
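(a) We performed a t test on the residuals to check the assumption of mean error = 0. Since the p-value is greater than alpha = 0.05, we fail to reject the null hypothesis and conclude that there is no significant evidence that the mean error differs from 0. Note that least-squares residuals from a model with an intercept always average to 0 by construction, so this test cannot fail; a plot of the residuals against the fitted values is a more informative check of this assumption.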
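As a small sketch of why the t test in (a) is uninformative, the code below fits the same model formula to synthetic stand-in data and looks at the residual mean. The data frame clocks, the simulated coefficients, and the object names fit, res, mean_res and p_val are all assumptions for illustration; they are not the CLOCKS values.

```r
# Synthetic stand-in data (NOT the real CLOCKS data); column names follow the code above
set.seed(1)
n <- 32
clocks <- data.frame(AGE = round(runif(n, 100, 200)),
                     NUMBIDS = sample(5:15, n, replace = TRUE))
clocks$AGE.BID <- clocks$AGE * clocks$NUMBIDS
clocks$PRICE <- 300 + 1 * clocks$AGE - 90 * clocks$NUMBIDS +
  1.3 * clocks$AGE.BID + rnorm(n, sd = 90)

fit <- lm(PRICE ~ AGE + NUMBIDS + AGE.BID, data = clocks)
res <- residuals(fit)

# With an intercept in the model, least squares forces the residuals to sum to zero,
# so the mean is 0 up to floating-point error whether or not the model is adequate
mean_res <- mean(res)
p_val <- t.test(res, mu = 0)$p.value
print(mean_res)
print(p_val)
```

A call such as plot(fitted(fit), res) followed by abline(h = 0) would show whether the residuals drift away from zero over part of the fitted range, which is the pattern a violated mean-zero assumption usually produces.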
(b) We performed the Breusch-Pagan test (ncvTest in the car package) to test for constant error variance. Since the p-value is greater than alpha = 0.05, we fail to reject the null hypothesis and conclude that there is no significant evidence against constant error variance.
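To make the test in (b) less of a black box, here is a base-R sketch of a Breusch-Pagan score statistic built from an auxiliary regression of the scaled squared residuals on the fitted values. We believe this matches what ncvTest computes by default, but treat that as an assumption and compare against ncvTest(model) on the real data. The synthetic data and the names clocks, fit, u, aux, chisq and p_value are hypothetical.

```r
# Synthetic stand-in data (NOT the real CLOCKS data)
set.seed(1)
n <- 32
clocks <- data.frame(AGE = round(runif(n, 100, 200)),
                     NUMBIDS = sample(5:15, n, replace = TRUE))
clocks$AGE.BID <- clocks$AGE * clocks$NUMBIDS
clocks$PRICE <- 300 + 1 * clocks$AGE - 90 * clocks$NUMBIDS +
  1.3 * clocks$AGE.BID + rnorm(n, sd = 90)

fit <- lm(PRICE ~ AGE + NUMBIDS + AGE.BID, data = clocks)

# Squared residuals scaled by the maximum-likelihood variance estimate sum(e^2)/n
e2 <- residuals(fit)^2
u <- e2 / mean(e2)

# Auxiliary regression: does the scaled squared residual change with the fitted value?
yhat <- fitted(fit)
aux <- lm(u ~ yhat)

# Score statistic = (regression sum of squares of the auxiliary fit) / 2, 1 df
reg_ss <- sum((fitted(aux) - mean(u))^2)
chisq <- reg_ss / 2
p_value <- pchisq(chisq, df = 1, lower.tail = FALSE)
print(c(chisq = chisq, p_value = p_value))
```

A large statistic (small p-value) would indicate that the residual spread changes with the fitted price, which is evidence against constant variance.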
(c) From the boxplot of the residuals, we observe that no outliers are present.
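A boxplot is a visual check only. A numerical complement, sketched below under stated assumptions, is to flag observations whose studentized residual exceeds 3 in absolute value. The cutoff of 3 is a common rule of thumb and not something the original answer specifies. The data are synthetic, and row 5 is deliberately given an outlying price so that the check has something to find.

```r
# Synthetic stand-in data (NOT the real CLOCKS data)
set.seed(2)
n <- 32
clocks <- data.frame(AGE = round(runif(n, 100, 200)),
                     NUMBIDS = sample(5:15, n, replace = TRUE))
clocks$AGE.BID <- clocks$AGE * clocks$NUMBIDS
clocks$PRICE <- 300 + 1 * clocks$AGE - 90 * clocks$NUMBIDS +
  1.3 * clocks$AGE.BID + rnorm(n, sd = 90)

# Planted outlier, for illustration only
clocks$PRICE[5] <- clocks$PRICE[5] + 1500

fit <- lm(PRICE ~ AGE + NUMBIDS + AGE.BID, data = clocks)

# Externally studentized residuals: each residual is scaled by an error estimate
# computed without that observation
rs <- rstudent(fit)

# Rule-of-thumb cutoff (an assumption): |studentized residual| > 3
flagged <- which(abs(rs) > 3)
print(flagged)
```

On the real data the same two lines, rstudent(model) and which(abs(...) > 3), could be run on the fitted model from the code above; an empty result would agree with the boxplot.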
(d) We performed a Shapiro-Wilk test of whether the errors come from a normal distribution. Since the p-value is greater than alpha = 0.05, we fail to reject the null hypothesis and conclude that there is no significant evidence that the errors are non-normal.
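The Shapiro-Wilk result in (d) is usually read alongside a normal Q-Q plot. The sketch below, on synthetic stand-in data with hypothetical names (clocks, fit, res, qq, ppcc, sw), computes the Q-Q coordinates without drawing them and summarizes how straight the Q-Q line is with a simple correlation.

```r
# Synthetic stand-in data (NOT the real CLOCKS data)
set.seed(1)
n <- 32
clocks <- data.frame(AGE = round(runif(n, 100, 200)),
                     NUMBIDS = sample(5:15, n, replace = TRUE))
clocks$AGE.BID <- clocks$AGE * clocks$NUMBIDS
clocks$PRICE <- 300 + 1 * clocks$AGE - 90 * clocks$NUMBIDS +
  1.3 * clocks$AGE.BID + rnorm(n, sd = 90)

fit <- lm(PRICE ~ AGE + NUMBIDS + AGE.BID, data = clocks)
res <- residuals(fit)

# Theoretical normal quantiles (x) against ordered residuals (y), without plotting
qq <- qqnorm(res, plot.it = FALSE)

# Correlation close to 1 means the Q-Q points lie close to a straight line
ppcc <- cor(qq$x, qq$y)

# Shapiro-Wilk test, as in the answer above
sw <- shapiro.test(res)
print(c(ppcc = ppcc, shapiro_p = sw$p.value))
```

Calling qqnorm(res) followed by qqline(res) would draw the plot itself; systematic curvature at the ends suggests heavy or light tails even when the test does not reject.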
(e) From the correlation matrix, we observe that all three x variables are correlated with one another in this data set. Hence, we conclude that multicollinearity is present in the data. Part of this is expected, because the interaction term is constructed from AGE and NUMBIDS.
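Pairwise correlations can be supplemented with variance inflation factors, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The sketch below computes them in base R on synthetic stand-in data and also shows that building the interaction from centered variables typically reduces the inflation. The helper vif_by_hand, the column AGE.BID.c, and the data are hypothetical and not part of the original answer.

```r
# Synthetic stand-in predictors (NOT the real CLOCKS data)
set.seed(1)
n <- 32
X <- data.frame(AGE = round(runif(n, 100, 200)),
                NUMBIDS = sample(5:15, n, replace = TRUE))
X$AGE.BID <- X$AGE * X$NUMBIDS

# Hypothetical helper: VIF of each column from its regression on the other columns
vif_by_hand <- function(df) {
  sapply(names(df), function(v) {
    others <- setdiff(names(df), v)
    r2 <- summary(lm(reformulate(others, response = v), data = df))$r.squared
    1 / (1 - r2)
  })
}

vif_raw <- vif_by_hand(X)

# Same predictors, but the interaction is the product of mean-centered variables
Xc <- data.frame(AGE = X$AGE,
                 NUMBIDS = X$NUMBIDS,
                 AGE.BID.c = (X$AGE - mean(X$AGE)) * (X$NUMBIDS - mean(X$NUMBIDS)))
vif_centered <- vif_by_hand(Xc)

print(round(vif_raw, 2))
print(round(vif_centered, 2))
```

Centering changes how the coefficients on AGE and NUMBIDS are interpreted but leaves the fitted values unchanged, so it is a way to tame the collinearity that the raw product term introduces rather than a different model.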