Question
It is common practice to use multiple regression models in the transportation planning process to predict the number of trips that will be made in any geographic sub area (zone) of a region. Trip frequency is generally expressed as a function of variables such as distance of a zone to the Central Business District (CBD) and number of autos owned within the zone. The following data come from the Penn-Jersey Transportation Study.

Zone   Total Person Trips (1000s)   Linear Distance to the CBD (Miles)   Number of Autos Owned (1000s)
1      592                          1.6                                  69
2      317                          2.1                                  41
3      135                          3.9                                  22
4      478                          3.0                                  60
5      262                          4.3                                  38
6      82                           6.1                                  15
7      392                          5.9                                  64
8      456                          6.0                                  67
9      88                           9.8                                  13
10     136                          6.1                                  22
11     39                           9.1                                  8

a) To carry out the forecasting procedure, a regression model is used to determine the relationship between the dependent variable total person trips and the independent variables distance to the CBD and number of autos owned. Derive the regression equation for the sample data using SAS, generating the parameter estimates, standard errors for the estimates of β2 and β3, and the R².
b) Test the hypothesis that there is no relationship between the dependent variable and the independent variables taken jointly (i.e., test the significance of the regression). Then test the hypotheses that distance and autos owned are individually unrelated to the dependent variable. Interpret your results. Use α = .01.
c) Predict the number of total person trips for a new zone that was 58.4 miles from the CBD and contained 500 owned autos. Are these predictions reliable? Why or why not?
Explanation / Answer
We can do this with the open-source statistical package R, since SAS is commercial software.
The complete R snippet is as follows:
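# read the data into an R data frame
# (forward slashes: backslashes in an R string start escape sequences such as \U)
data.df <- read.csv("C:/Users/586645/Downloads/Chegg/person.csv", header = TRUE)
str(data.df)
# fit the regression of person trips on distance and autos owned
fit <- lm(Person ~ Distance + Autos, data = data.df)
summary(fit)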
The output of summary(fit) is:
Call:
lm(formula = Person ~ Distance + Autos, data = data.df)

Residuals:
   Min     1Q Median     3Q    Max
-57.42 -16.42 -11.13  25.07  63.31

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   51.601     52.390   0.985    0.353
Distance     -10.122      5.976  -1.694    0.129
Autos          7.149      0.667  10.718 5.05e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 39.5 on 8 degrees of freedom
Multiple R-squared:  0.965,  Adjusted R-squared:  0.9563
F-statistic: 110.3 on 2 and 8 DF,  p-value: 1.499e-06
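The CSV file read above is not available to readers, so here is a minimal sketch that types the eleven zones from the question directly into a data frame and pulls out the quantities part (a) asks for. The column names follow the formula used in the answer; the data frame name `zones` and the helper variable names are assumptions made for this sketch only.

```r
# Data typed in from the table in the question (trips and autos in 1000s).
# The data frame name `zones` is an assumption made for this sketch.
zones <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)
)

fit <- lm(Person ~ Distance + Autos, data = zones)
fit_summary <- summary(fit)

# Coefficient table: estimates, standard errors, t values, p-values
coef_table <- coef(fit_summary)
print(round(coef_table, 4))

# Standard errors of the two slope estimates, and the R-squared
slope_se  <- coef_table[c("Distance", "Autos"), "Std. Error"]
r_squared <- fit_summary$r.squared
print(slope_se)
print(r_squared)
```

Working from `coef(summary(fit))` rather than the printed summary makes it easier to reuse the standard errors and R-squared in later steps.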
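For part (a), the regression equation is formed from the estimated coefficients, with Person and Autos both measured in 1000s:
Person = 51.601 - 10.122*Distance + 7.149*Autos
The standard errors of the Distance and Autos estimates are 5.976 and 0.667, and R-squared is 0.965.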
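For part (c), substitute Distance = 58.4 and Autos = 500 into the regression equation:
Person = 51.601 - 10.122*58.4 + 7.149*500
= 3035 approx. (thousands of trips)
Note that Autos is measured in 1000s in the data, so Autos = 500 stands for 500,000 autos. If the new zone literally has 500 autos, the input is Autos = 0.5 and the equation gives about -536, an impossible negative trip count. Under either reading the prediction is not reliable: 58.4 miles is far outside the observed distances (1.6 to 9.8 miles), and neither 500 nor 0.5 falls within the observed auto counts (8 to 69 thousand), so the model is being extrapolated well beyond the data it was fitted to.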
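The same substitution can be done with `predict()`, which also lets us check whether the new zone lies inside the range of the fitted data. This is a minimal sketch, not part of the original answer: the data are typed in from the question's table, the name `zones` is an assumption, Autos = 500 simply mirrors the substitution above, and the 99% prediction interval is an assumed choice to match the .01 level in the question.

```r
# Data typed in from the table in the question (trips and autos in 1000s).
zones <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)
)
fit <- lm(Person ~ Distance + Autos, data = zones)

# New zone from part (c); Autos = 500 mirrors the substitution in the answer
new_zone <- data.frame(Distance = 58.4, Autos = 500)

# Point prediction with a 99% prediction interval (level is an assumption)
pred <- predict(fit, newdata = new_zone, interval = "prediction", level = 0.99)
print(pred)

# Extrapolation check: is each input inside the observed range?
in_range <- c(
  Distance = new_zone$Distance >= min(zones$Distance) &&
             new_zone$Distance <= max(zones$Distance),
  Autos    = new_zone$Autos >= min(zones$Autos) &&
             new_zone$Autos <= max(zones$Autos)
)
print(in_range)
```

Both entries of `in_range` come out FALSE for this zone, which is the usual warning sign that a regression is being used outside the region where it was estimated.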
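For part (b), the overall F-statistic is 110.3 on 2 and 8 DF with a p-value of 1.499e-06, which is less than 0.01, so we reject the hypothesis that distance and autos owned are jointly unrelated to total person trips.
We now check the p-values of Distance and Autos to see whether the individual coefficients are significant.
The Distance p-value is 0.129, which is not less than 0.01, hence the coefficient of Distance is not statistically significant.
The Autos p-value is 5.05e-06, which is less than 0.01, hence the coefficient of Autos is statistically significant.
So, at the 0.01 level, Autos is a reliable predictor of total person trips, while Distance is not.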
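The significance checks can also be read off the fitted model programmatically instead of from the printed summary. The sketch below is one way to do that, under the assumption that the data are typed in from the question's table; the names `zones`, `alpha`, `f_p_value`, and `t_p_values` are illustrative only.

```r
# Data typed in from the table in the question (trips and autos in 1000s).
zones <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)
)
fit <- lm(Person ~ Distance + Autos, data = zones)
fit_summary <- summary(fit)

alpha <- 0.01  # significance level from the question

# Joint test: F statistic with its numerator and denominator degrees of freedom
f <- fit_summary$fstatistic
f_p_value <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)

# Individual tests: p-values of the t tests on each slope
t_p_values <- coef(fit_summary)[c("Distance", "Autos"), "Pr(>|t|)"]

print(f_p_value < alpha)   # is the regression as a whole significant?
print(t_p_values < alpha)  # is each slope significant on its own?
```

`summary.lm` stores the F statistic and its degrees of freedom but not the p-value, which is why the upper tail of the F distribution is computed with `pf()` here.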