Question

It is common practice to use multiple regression models in the transportation planning process to predict the number of trips that will be made in any geographic sub area (zone) of a region. Trip frequency is generally expressed as a function of variables such as distance of a zone to the Central Business District (CBD) and number of autos owned within the zone. The following data come from the Penn-Jersey Transportation Study.

Zone   Total Person Trips (1000s)   Linear Distance to the CBD (Miles)   Number of Autos Owned (1000s)
1      592                          1.6                                  69
2      317                          2.1                                  41
3      135                          3.9                                  22
4      478                          3.0                                  60
5      262                          4.3                                  38
6      82                           6.1                                  15
7      392                          5.9                                  64
8      456                          6.0                                  67
9      88                           9.8                                  13
10     136                          6.1                                  22
11     39                           9.1                                  8

a) To carry out the forecasting procedure, a regression model is used to determine the relationship between the dependent variable total person trips and the independent variables distance to the CBD and number of autos owned. Derive the regression equation for the sample data using SAS, generating the parameter estimates, standard errors for the estimates of β2 and β3, and the R².

b) Test the hypothesis that there is no relationship between the dependent variable and the independent variables taken jointly (i.e., test the significance of the regression). Then test the hypotheses that distance and autos owned are individually unrelated to the dependent variable. Interpret your results. Use α = .01.

c) Predict the number of total person trips for a new zone that was 58.4 miles from the CBD and contained 500 owned autos. Are these predictions reliable? Why or why not?

Explanation / Answer

We can do this using the open-source statistical package R, since SAS is commercial software.

The complete R snippet is as follows.

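# read the data into an R data frame
# (forward slashes are used because "\U" in a backslashed Windows path is an invalid escape in an R string)
data.df <- read.csv("C:/Users/586645/Downloads/Chegg/person.csv", header = TRUE)
str(data.df)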

### fit the regression

fit <- lm(Person ~ Distance + Autos, data=data.df)
summary(fit)
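
Because the CSV file itself is not shown, the same fit can be reproduced by typing the table from the question directly into a data frame. This is only a minimal sketch: the column names Person, Distance, and Autos are taken from the model formula above, and it is an assumption that person.csv holds exactly these columns and values.

```r
# Data from the question's table (Penn-Jersey Transportation Study).
# Assumption: person.csv contains these three columns under these names.
data.df <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),     # total person trips (1000s)
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),  # miles to the CBD
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)               # autos owned (1000s)
)

# Ordinary least squares fit of trips on distance and autos owned
fit <- lm(Person ~ Distance + Autos, data = data.df)
print(coef(fit))
```

Building the data inline removes the dependence on a machine-specific file path, so anyone can rerun the fit.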

The summary of the model is

summary(fit)

Call:
lm(formula = Person ~ Distance + Autos, data = data.df)

Residuals:
   Min     1Q Median     3Q    Max
-57.42 -16.42 -11.13  25.07  63.31

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   51.601     52.390   0.985    0.353
Distance     -10.122      5.976  -1.694    0.129
Autos          7.149      0.667  10.718 5.05e-06 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 39.5 on 8 degrees of freedom
Multiple R-squared: 0.965,   Adjusted R-squared: 0.9563
F-statistic: 110.3 on 2 and 8 DF, p-value: 1.499e-06
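
Part (a) asks specifically for the parameter estimates, their standard errors, and R². Rather than reading them off the printed summary, they can be pulled out of the summary object, roughly as sketched here. The data frame is retyped from the question's table, which assumes person.csv holds the same values.

```r
# Assumption: these values match the contents of person.csv.
data.df <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)
)
fit <- lm(Person ~ Distance + Autos, data = data.df)

s <- summary(fit)

# Coefficient table: keep only the estimates and their standard errors
est <- coef(s)[, c("Estimate", "Std. Error")]
print(round(est, 3))

# Coefficient of determination and its adjusted version
r2     <- s$r.squared
adj.r2 <- s$adj.r.squared
print(round(c(R2 = r2, Adjusted.R2 = adj.r2), 4))
```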

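The regression equation is formed using the coefficients (Person and Autos are in 1000s, Distance is in miles):

Person = 51.601 - 10.122*Distance + 7.149*Autos

For the new zone, put Distance = 58.4 and Autos = 500 in the regression equation (this takes the 500 in the table's units of 1000s):
Person = 51.601 - 10.122*58.4 + 7.149*500
= 3035 approx. (in 1000s of trips)

This prediction is not reliable: the sample distances run only from 1.6 to 9.8 miles and the auto counts only from 8 to 69 (1000s), so both new values lie far outside the data and the fitted equation is being extrapolated.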
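
The hand calculation can be cross-checked with predict(), and comparing the new zone's values with the observed ranges shows how far outside the sample they fall. This is a sketch under two assumptions: the data frame is retyped from the question's table rather than read from person.csv, and Autos = 500 is entered in the table's units of 1000s, as in the calculation above. The name new.zone is introduced here for illustration.

```r
# Assumption: these values match the contents of person.csv.
data.df <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)
)
fit <- lm(Person ~ Distance + Autos, data = data.df)

# New zone from part (c); Autos is assumed to be in 1000s like the table
new.zone <- data.frame(Distance = 58.4, Autos = 500)

# Point prediction from the fitted model (in 1000s of trips)
pred <- predict(fit, newdata = new.zone)
print(pred)

# 99% prediction interval for the same zone
print(predict(fit, newdata = new.zone, interval = "prediction", level = 0.99))

# Is each new value inside the range observed in the sample?
in.range <- c(
  Distance = new.zone$Distance >= min(data.df$Distance) &&
             new.zone$Distance <= max(data.df$Distance),
  Autos    = new.zone$Autos >= min(data.df$Autos) &&
             new.zone$Autos <= max(data.df$Autos)
)
print(in.range)
```

A FALSE entry in in.range flags a predictor value outside the sample, which is the usual warning sign that a prediction rests on extrapolation.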

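For the joint test, the F-statistic is 110.3 on 2 and 8 DF with p-value 1.499e-06, which is less than 0.01, so we reject the hypothesis that Distance and Autos are jointly unrelated to Person; the regression is significant.

We now check the p-values of Distance and Autos to see whether the individual coefficients are significant.

The Distance p-value is 0.129, which is not less than 0.01, hence the coefficient of Distance is not significant.

The Autos p-value is 5.05e-06, which is less than 0.01, hence the coefficient of Autos is statistically significant.

So, at the .01 level, Autos is a reliable predictor, while Distance is not.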
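
The comparisons against the .01 level can also be done in code rather than by eye. The sketch below recomputes the overall F-test p-value from the stored F-statistic and compares each p-value with alpha. It again assumes the data frame retyped from the question's table matches person.csv; alpha, p.joint, and p.coef are names chosen here for illustration.

```r
# Assumption: these values match the contents of person.csv.
data.df <- data.frame(
  Person   = c(592, 317, 135, 478, 262, 82, 392, 456, 88, 136, 39),
  Distance = c(1.6, 2.1, 3.9, 3.0, 4.3, 6.1, 5.9, 6.0, 9.8, 6.1, 9.1),
  Autos    = c(69, 41, 22, 60, 38, 15, 64, 67, 13, 22, 8)
)
fit <- lm(Person ~ Distance + Autos, data = data.df)

alpha <- 0.01
s <- summary(fit)

# Joint test: s$fstatistic holds the F value and its two degrees of freedom
f <- s$fstatistic
p.joint <- pf(f[["value"]], f[["numdf"]], f[["dendf"]], lower.tail = FALSE)
cat("Regression significant at alpha:", p.joint < alpha, "\n")

# Individual t-tests: p-values for the two slope coefficients
p.coef <- coef(s)[c("Distance", "Autos"), "Pr(>|t|)"]
print(p.coef < alpha)
```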
