Question
Data related to 2015 MLB team performance statistics has been collected. It is expected that fans will attend more games for those teams who tend to win more. Thus, the owners want to know what variables ultimately contribute to wins based on their team’s performance as they feel it will boost revenues and warrant higher compensation for their players.
Using multiple regression, correlation analysis, and residual analysis, examine each of the performance statistics and develop the best model to predict the number of wins with a 99% confidence level. (DO NOT include E.R.A in this model) Do you think this information appears to be helpful in predicting the number of wins? Summarize your results and methodology, and explain your perspective of the model you developed.
Team                    Wins  E.R.A.  Runs Scored  Saves  Hits Allowed  Walks Allowed  Errors
Toronto Blue Jays         93    3.80          891     34          1353            397      88
New York Yankees          87    4.03          764     48          1416            474      93
Baltimore Orioles         81    4.05          713     43          1406            483      77
Tampa Bay Rays            80    3.74          644     60          1314            477      95
Boston Red Sox            78    4.31          748     40          1486            478      97
Kansas City Royals        95    3.73          724     56          1372            489      88
Minnesota Twins           83    4.07          696     45          1506            413      86
Cleveland Indians         81    3.67          669     38          1274            425      79
Chicago White Sox         76    3.98          622     37          1443            474     101
Detroit Tigers            74    4.64          689     35          1491            489      86
Texas Rangers             88    4.24          751     45          1459            508     119
Houston Astros            86    3.57          729     39          1308            423      85
Los Angeles Angels        85    3.94          661     46          1355            466      93
Seattle Mariners          76    4.16          656     45          1430            491      94
Oakland A's               68    4.14          694     28          1402            474     126
New York Mets             90    3.43          683     50          1341            383      88
Washington Nationals      83    3.62          703     41          1366            364      90
Miami Marlins             71    4.02          613     35          1374            508      77
Atlanta Braves            67    4.41          573     44          1462            550      90
Philadelphia Phillies     63    4.69          626     35          1592            488     117
St. Louis Cardinals      100    2.94          647     62          1359            477      96
Pittsburgh Pirates        98    3.21          697     54          1392            453     122
Chicago Cubs              97    3.36          689     48          1276            407     111
Milwaukee Brewers         68    4.28          655     40          1432            517     116
Cincinnati Reds           64    4.33          640     35          1436            544      90
Los Angeles Dodgers       92    3.44          667     47          1317            395      75
San Francisco Giants      84    3.72          696     41          1344            431      78
Arizona Diamondbacks      79    4.04          720     44          1450            500      86
San Diego Padres          74    4.09          650     41          1371            516      92
Colorado Rockies          68    5.04          737     36          1579            579      95
Explanation / Answer
Regression Analysis
Solution:
We need a regression model that predicts the number of wins. The multiple regression output for the given data is shown below. Note that this fit includes E.R.A. as a predictor, even though the question asks for it to be excluded.
Regression Statistics
Multiple R           0.9526
R Square             0.9074
Adjusted R Square    0.8833
Standard Error       3.5711
Observations         30

ANOVA
              df         SS        MS        F  Significance F
Regression     6  2875.6541  479.2757  37.5822          0.0000
Residual      23   293.3126   12.7527
Total         29  3168.9667

               Coefficients  Standard Error   t Stat  P-value  Lower 99%  Upper 99%
Intercept           55.5223         17.3473   3.2006   0.0040     6.8226   104.2220
E.R.A.             -13.6275          4.0025  -3.4047   0.0024   -24.8639    -2.3911
Runs Scored          0.0736          0.0122   6.0253   0.0000     0.0393     0.1080
Saves                0.4852          0.1262   3.8449   0.0008     0.1309     0.8395
Hits Allowed         0.0118          0.0161   0.7360   0.4692    -0.0333     0.0570
Walks Allowed       -0.0170          0.0213  -0.7968   0.4337    -0.0768     0.0428
Errors              -0.0094          0.0517  -0.1810   0.8579    -0.1545     0.1358
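Since the question asks for E.R.A. to be left out, it may help to refit the model both ways and compare. The following is a minimal sketch, assuming NumPy is available; the array `DATA`, the helper `fit_ols`, and the variable names are illustrative and not part of the original answer. The rows are copied from the data table in the question, in the same team order.

```python
import numpy as np

# Columns: wins, E.R.A., runs scored, saves, hits allowed, walks allowed, errors.
DATA = np.array([
    [93, 3.80, 891, 34, 1353, 397, 88], [87, 4.03, 764, 48, 1416, 474, 93],
    [81, 4.05, 713, 43, 1406, 483, 77], [80, 3.74, 644, 60, 1314, 477, 95],
    [78, 4.31, 748, 40, 1486, 478, 97], [95, 3.73, 724, 56, 1372, 489, 88],
    [83, 4.07, 696, 45, 1506, 413, 86], [81, 3.67, 669, 38, 1274, 425, 79],
    [76, 3.98, 622, 37, 1443, 474, 101], [74, 4.64, 689, 35, 1491, 489, 86],
    [88, 4.24, 751, 45, 1459, 508, 119], [86, 3.57, 729, 39, 1308, 423, 85],
    [85, 3.94, 661, 46, 1355, 466, 93], [76, 4.16, 656, 45, 1430, 491, 94],
    [68, 4.14, 694, 28, 1402, 474, 126], [90, 3.43, 683, 50, 1341, 383, 88],
    [83, 3.62, 703, 41, 1366, 364, 90], [71, 4.02, 613, 35, 1374, 508, 77],
    [67, 4.41, 573, 44, 1462, 550, 90], [63, 4.69, 626, 35, 1592, 488, 117],
    [100, 2.94, 647, 62, 1359, 477, 96], [98, 3.21, 697, 54, 1392, 453, 122],
    [97, 3.36, 689, 48, 1276, 407, 111], [68, 4.28, 655, 40, 1432, 517, 116],
    [64, 4.33, 640, 35, 1436, 544, 90], [92, 3.44, 667, 47, 1317, 395, 75],
    [84, 3.72, 696, 41, 1344, 431, 78], [79, 4.04, 720, 44, 1450, 500, 86],
    [74, 4.09, 650, 41, 1371, 516, 92], [68, 5.04, 737, 36, 1579, 579, 95],
], dtype=float)


def fit_ols(y, X):
    """Least-squares fit with an intercept; returns (coefficients, R^2, adjusted R^2)."""
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sse = float(resid @ resid)                    # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2


wins = DATA[:, 0]
full = fit_ols(wins, DATA[:, 1:])      # all six predictors, as in the output above
reduced = fit_ols(wins, DATA[:, 2:])   # E.R.A. dropped, as the question requests

for label, (beta, r2, adj_r2) in [("with E.R.A.", full), ("without E.R.A.", reduced)]:
    print(f"{label}: R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
    print("  coefficients (intercept first):", np.round(beta, 4))
```

Dropping a predictor can never raise R Square, so the adjusted R Square and the individual p-values of the reduced fit are the more useful quantities to compare.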
For the overall F test, Significance F is 0.0000 (to four decimal places), which is below the significance level α = 0.01. We therefore reject the null hypothesis that all slope coefficients are zero and conclude that the regression model as a whole is statistically significant.
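Overall significance does not mean every predictor matters, so the same α = 0.01 rule can be applied to each p-value in the coefficient table. The sketch below is illustrative only: the list `COEFS` is copied from the regression output, and `T_CRIT` is the standard table value for a two-sided 99% interval with 23 degrees of freedom, which is an assumption added here to re-derive the interval bounds.

```python
# (name, coefficient, standard error, p-value) copied from the regression output.
COEFS = [
    ("Intercept", 55.5223, 17.3473, 0.0040),
    ("E.R.A.", -13.6275, 4.0025, 0.0024),
    ("Runs Scored", 0.0736, 0.0122, 0.0000),
    ("Saves", 0.4852, 0.1262, 0.0008),
    ("Hits Allowed", 0.0118, 0.0161, 0.4692),
    ("Walks Allowed", -0.0170, 0.0213, 0.4337),
    ("Errors", -0.0094, 0.0517, 0.8579),
]
ALPHA = 0.01
T_CRIT = 2.8073  # Student's t, 23 df, two-sided 99% (table value, assumed)

significant = []
intervals = {}
for name, coef, se, p_value in COEFS:
    lower, upper = coef - T_CRIT * se, coef + T_CRIT * se
    intervals[name] = (lower, upper)
    if p_value < ALPHA:
        significant.append(name)
    print(f"{name:<14} p = {p_value:.4f}  99% CI = ({lower:9.4f}, {upper:9.4f})")

print("Significant at alpha = 0.01:", significant)
```

A term is significant at the 1% level exactly when its 99% interval excludes zero. By that rule Hits Allowed, Walks Allowed and Errors are candidates for removal from the model.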
The multiple correlation coefficient R is 0.9526, which indicates a strong positive association between the dependent variable and the set of independent variables. The coefficient of determination R Square is 0.9074, which means about 90.74% of the variation in the number of wins is explained by E.R.A., Runs Scored, Saves, Hits Allowed, Walks Allowed and Errors.
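The summary statistics quoted above all follow from the sums of squares in the ANOVA table. As a quick arithmetic check, with variable names chosen here for illustration:

```python
import math

n, k = 30, 6                      # observations, predictors
ss_reg, ss_res, ss_total = 2875.6541, 293.3126, 3168.9667

ms_reg = ss_reg / k               # mean square for regression
ms_res = ss_res / (n - k - 1)     # mean square error (23 residual df)
f_stat = ms_reg / ms_res          # overall F statistic

r2 = ss_reg / ss_total            # R Square
multiple_r = math.sqrt(r2)        # Multiple R
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
std_error = math.sqrt(ms_res)     # standard error of the regression

print(f"F = {f_stat:.4f}, R^2 = {r2:.4f}, R = {multiple_r:.4f}")
print(f"adjusted R^2 = {adj_r2:.4f}, standard error = {std_error:.4f}")
```

The adjusted R Square penalises the number of predictors, which is why it (0.8833) is lower than R Square (0.9074).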
The fitted regression equation for predicting the number of wins is:
Number of wins = 55.5223 – 13.6275*E.R.A. + 0.0736*Runs Scored + 0.4852*Saves + 0.0118*Hits Allowed – 0.0170*Walks Allowed – 0.0094*Errors
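To see how the equation is used, the sketch below plugs one team's statistics into it. The function `predict_wins` and the dictionary `COEF` are hypothetical names. The coefficients are the four-decimal values from the regression output, so predictions can differ slightly from ones based on the unrounded fit.

```python
# Coefficients rounded to four decimals, as reported in the regression output.
COEF = {
    "intercept": 55.5223,
    "era": -13.6275,
    "runs": 0.0736,
    "saves": 0.4852,
    "hits": 0.0118,
    "walks": -0.0170,
    "errors": -0.0094,
}


def predict_wins(era, runs, saves, hits, walks, errors):
    """Predicted wins from the fitted equation (six-predictor model)."""
    return (COEF["intercept"] + COEF["era"] * era + COEF["runs"] * runs
            + COEF["saves"] * saves + COEF["hits"] * hits
            + COEF["walks"] * walks + COEF["errors"] * errors)


# Toronto Blue Jays row from the data table: 93 actual wins.
toronto = predict_wins(era=3.80, runs=891, saves=34, hits=1353, walks=397, errors=88)
print(f"predicted = {toronto:.2f}, residual = {93 - toronto:.2f}")
```

For Toronto the prediction is about 94.2 wins against 93 actual wins, so the model slightly over-predicts this team.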
The residuals of the fitted model (actual wins minus predicted wins), listed against each team's E.R.A. in the same team order as the data table, are given below:
E.R.A.    Residual
3.80      -1.316083459
4.03      -1.013601613
4.05      -0.437282353
3.74      -7.672907083
4.31      -1.861351227
3.73       2.69123492
4.07      -0.171881572
3.67       0.648172809
3.98       2.855736783
4.64       5.431768151
4.24       5.572629086
3.57      -0.999196568
3.94       4.902770273
4.16      -0.700106616
4.14      -3.179940287
3.43      -1.898740808
3.62      -4.015367916
4.02       1.204070906
4.41       0.890197333
4.69      -1.170103628
2.94      -0.289896042
3.21       1.0339156
3.36       6.06845392
4.28      -3.941351659
4.33      -3.56128697
3.44       3.237963179
3.72       0.148831741
4.04      -3.72256571
4.09      -0.166670363
5.04       1.432589173
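A basic residual analysis can be run directly on the listed values. The sketch below checks that the residuals sum to zero, recovers the residual sum of squares and standard error from the ANOVA table, and flags teams whose residual exceeds two standard errors. The two-standard-error cutoff is a common rule of thumb assumed here, not something stated in the original answer, and the scaling ignores leverage, so the values are only approximate standardized residuals.

```python
import math

# Residuals in the same team order as the data table (Toronto first, Colorado last).
RESIDUALS = [
    -1.316083459, -1.013601613, -0.437282353, -7.672907083, -1.861351227,
    2.69123492, -0.171881572, 0.648172809, 2.855736783, 5.431768151,
    5.572629086, -0.999196568, 4.902770273, -0.700106616, -3.179940287,
    -1.898740808, -4.015367916, 1.204070906, 0.890197333, -1.170103628,
    -0.289896042, 1.0339156, 6.06845392, -3.941351659, -3.56128697,
    3.237963179, 0.148831741, -3.72256571, -0.166670363, 1.432589173,
]
n, k = 30, 6

total = sum(RESIDUALS)                       # should be ~0 for a model with intercept
sse = sum(r * r for r in RESIDUALS)          # residual sum of squares
std_error = math.sqrt(sse / (n - k - 1))     # standard error of the regression

# Residuals scaled by the standard error; |z| > 2 marks a possible outlier.
scaled = [r / std_error for r in RESIDUALS]
flagged = [i for i, z in enumerate(scaled) if abs(z) > 2]

print(f"sum = {total:.6f}, SSE = {sse:.4f}, standard error = {std_error:.4f}")
print("rows with |scaled residual| > 2:", flagged)
```

Only the fourth row (Tampa Bay Rays, residual -7.67) is flagged, meaning the model over-predicts that team's wins by more than two standard errors.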