An Institute of Sport Study has collected data on the characteristics of athlete
Question
An Institute of Sport Study has collected data on the characteristics of athletes in different sports.
Four sport disciplines were selected for this study: Basketball, Curling, Swimming and Track and Field.
Data on 30 athletes was collected and the following variables were considered:
Ht - Height in cm
Wt - Weight in kg
RCC - Red cell count
WCC - White cell count
Hc - The ratio of the volume of red blood cells to the total volume of blood
Hg - The amount of hemoglobin in whole blood expressed in grams per deciliter (g/dl)
Ferr - Plasma ferritin concentration
Sport Discipline - Basketball, Curling, Swimming and Track and Field
Gender - Female, Male
BFP - the body fat percentage = total mass of fat divided by the total body mass*100
As a research assistant for this Institute, you were asked to build a model to predict the body fat percentage BFP of the athletes based on the data provided. As a first step, you need to prepare the data for analysis.
a) Derive the multiple regression model to predict the body fat percentage (BFP) using the backward elimination method. Use 5% level of significance. Interpret the coefficients
b) Which predictor is the most significant to predict BFP?
c) How much of the variation in BFP can be explained by this regression model?
d) Perform a residual analysis
e) Redo questions a) to c) using the Best Subset Regression. Which model is better?
Ht     Wt      RCC   WCC   Hc    Hg    Ferr  Sport    Gender  BFP
195.9  78.9    3.96  7.5   37.5  12.3  60    BSK      female  19.75
189.7  74.4    4.41  8.3   38.2  12.7  68    BSK      female  21.3
177.9  80.5    4.26  6.2   41    13.9  48    Curling  female  17.71
179.6  70.5    4.36  5.8   40.3  13.3  29    Curling  female  19.83
176.8  59.9    4.56  13.3  42.2  13.6  20    BSK      female  11.29
172.6  63      4.15  6     38    12.7  59    BSK      female  25.26
176    66.3    4.16  7.6   37.5  12.3  22    BSK      female  19.39
170    64.8    4.36  5.5   41.4  13.8  82    Swim     female  14.52
170    59      4.07  5.9   39.5  13.3  25    Swim     female  11.47
180.5  72.1    4.17  4.9   38.9  12.9  86    Swim     female  17.71
171.1  78.9    4.81  6.8   42.7  15.3  50    Track    female  20.1
172.7  83.9    4.51  9     39.7  14.3  36    Track    female  24.88
171.6  74.4    5.33  9.3   47    15    62    Track    female  19.51
172.7  67      5.13  7.1   46.8  15.9  34    Swim     male    8.47
176.5  74.4    4.83  7.6   45.2  15.2  97    Swim     male    7.68
183    79.3    5.09  4.7   46.6  15.9  55    Swim     male    6.16
190.7  85.7    4.87  8.2   43.8  15    130   Curling  male    9
181.8  85.4    5.04  7.1   44    14.8  64    Curling  male    12.61
200.4  92.2    5.24  7.2   46.6  15.9  58    BSK      male    7.35
195.3  78.9    4.54  5.9   44.4  15.6  97    BSK      male    7.16
194.1  90.3    5.13  5.8   46.1  15.9  110   BSK      male    8.77
187.9  87      5     6.7   45.3  15.7  72    BSK      male    9.56
185.1  102.7   5.09  8.9   46.3  15.4  44    Track    male    13.97
185.5  94.25   5.11  9.6   48.2  16.7  103   Track    male    11.66
190.5  79.6    5.16  12.9  47.6  15.6  156   Swim     male    9.89
191    85.3    5.29  12.7  48    16.2  124   Swim     male    13.06
179.9  76.3    4.86  8.9   46.9  15.8  65    Swim     male    10.25
183.9  93.2    4.9   7.6   45.6  16    90    Swim     male    11.79
183.5  80      5.66  8.3   50.2  17.7  38    Track    male    10.05
183.1  73.8    5.03  6.4   42.7  14.3  122   Track    male    8.51
Explanation / Answer
This is a multiple regression problem.
We can fit the model in XLSTAT.
Steps:
Enter the data into an Excel sheet --> XLSTAT --> Modeling data --> Linear regression --> Y / Quantitative variable: select BFP (the response) --> X / Quantitative variables: select the numerical predictors (Ht, Wt, RCC, WCC, Hc, Hg, Ferr); Gender and Sport must first be coded as numbers, or entered as qualitative variables if the dialog offers that option --> Variable labels --> Options --> Model selection: Backward, with 0.05 as the significance threshold --> OK
a) Derive the multiple regression model to predict the body fat percentage (BFP) using the backward elimination method. Use 5% level of significance. Interpret the coefficients
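Backward elimination retains Wt and Gender, giving BFP = 15.887 + 0.199*Wt - 11.413*Gender.
Holding Gender fixed, each additional kilogram of weight (Wt) raises the predicted BFP by 0.199 percentage points.
Holding Wt fixed, a one-unit increase in the Gender code lowers the predicted BFP by 11.413 percentage points. The equation matches the data only when Gender is coded 1 = female and 2 = male, so a male athlete is predicted to have 11.413 points less body fat than a female athlete of the same weight.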
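As a cross-check outside XLSTAT, the two-predictor model can be refitted by ordinary least squares. The sketch below is not the XLSTAT procedure and does not perform the backward elimination itself; it only fits BFP on Wt and Gender using the Wt and BFP columns of the data table. The coding Gender = 1 for female and 2 for male is an assumption inferred from the data, and the helper names solve and fit_ols are hypothetical.

```python
# Wt and BFP copied from the data table, in the original row order
# (13 female athletes followed by 17 male athletes).
wt = [78.9, 74.4, 80.5, 70.5, 59.9, 63, 66.3, 64.8, 59, 72.1, 78.9, 83.9, 74.4,
      67, 74.4, 79.3, 85.7, 85.4, 92.2, 78.9, 90.3, 87, 102.7, 94.25, 79.6,
      85.3, 76.3, 93.2, 80, 73.8]
bfp = [19.75, 21.3, 17.71, 19.83, 11.29, 25.26, 19.39, 14.52, 11.47, 17.71,
       20.1, 24.88, 19.51, 8.47, 7.68, 6.16, 9, 12.61, 7.35, 7.16, 8.77, 9.56,
       13.97, 11.66, 9.89, 13.06, 10.25, 11.79, 10.05, 8.51]
gender = [1] * 13 + [2] * 17  # assumed coding: 1 = female, 2 = male


def solve(a, b):
    """Solve a small system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(columns, y):
    """Return [intercept, b1, b2, ...] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


b0, b_wt, b_gender = fit_ols([wt, gender], bfp)
print(round(b0, 3), round(b_wt, 3), round(b_gender, 3))
```

Under the assumed coding, the printed coefficients should agree to three decimals with the intercept, Wt slope and Gender coefficient reported in this answer. With a 0/1 coding the Wt slope and Gender coefficient would be unchanged and only the intercept would shift.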
b) Which predictor is the most significant to predict BFP?
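Wt and Gender are both significant predictors because the p-value for each is less than 0.05; the more significant of the two is the one with the smaller p-value in the coefficient table.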
c) How much of the variation in BFP can be explained by this regression model?
R-squared = 0.755 = 75.5%
R-squared is the proportion of the variation in y that is explained by the predictors in the model, so about 75.5% of the variation in BFP is explained by Wt and Gender.
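The 75.5% figure can be recomputed from its definition, R-squared = 1 - SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squares of BFP around its mean. The sketch below uses the fitted equation reported in this answer together with the Wt and BFP columns of the data table. The Gender coding (1 = female, 2 = male) is an assumption inferred from the data, and predict_bfp is a hypothetical helper name.

```python
# Wt and BFP copied from the data table, in the original row order
# (13 female athletes followed by 17 male athletes).
wt = [78.9, 74.4, 80.5, 70.5, 59.9, 63, 66.3, 64.8, 59, 72.1, 78.9, 83.9, 74.4,
      67, 74.4, 79.3, 85.7, 85.4, 92.2, 78.9, 90.3, 87, 102.7, 94.25, 79.6,
      85.3, 76.3, 93.2, 80, 73.8]
bfp = [19.75, 21.3, 17.71, 19.83, 11.29, 25.26, 19.39, 14.52, 11.47, 17.71,
       20.1, 24.88, 19.51, 8.47, 7.68, 6.16, 9, 12.61, 7.35, 7.16, 8.77, 9.56,
       13.97, 11.66, 9.89, 13.06, 10.25, 11.79, 10.05, 8.51]
gender = [1] * 13 + [2] * 17  # assumed coding: 1 = female, 2 = male


def predict_bfp(weight_kg, gender_code):
    """Predicted BFP from the equation reported in this answer."""
    return 15.8869934332609 + 0.199172587833335 * weight_kg - 11.41263342452 * gender_code


predicted = [predict_bfp(w, g) for w, g in zip(wt, gender)]
mean_bfp = sum(bfp) / len(bfp)

sse = sum((y - p) ** 2 for y, p in zip(bfp, predicted))  # unexplained variation
sst = sum((y - mean_bfp) ** 2 for y in bfp)              # total variation
r2 = 1 - sse / sst

print(f"R-squared = {r2:.3f}")
```

The residuals y - p computed here are also the starting point for the residual analysis asked for in part d), for example by plotting them against the predicted values.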
e) Redo questions a) to c) using the Best Subset Regression. Which model is better?
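The steps are the same as in a), except that the best model is chosen as the model selection method.
R-squared = 0.755
Both models are significant because the F statistic is significant under both procedures.
BFP = 15.8869934332609 + 0.199172587833335*Wt - 11.41263342452*Gender
This equation has the same coefficients and the same R-squared as the backward elimination model in a), so neither procedure gives a better model here.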