Question
For each of parts a) through d), indicate whether we would generally expect the performance of a flexible statistical learning method to be better or worse than an inflexible method. Justify your answer.
a) The sample size n is extremely large, and the number of predictors p is small.
b) The number of predictors p is extremely large, and the number of observations n is small.
c) The relationship between the predictors and response is highly non-linear.
d) The variance of the error terms, i.e. Var(ε), is extremely high.
Explanation / Answer
a) Better.
If the sample size is large and the number of predictors is small (i.e. n >> p, where n is the number of observations and p is the number of predictors), then we have sufficient information about each predictor. A flexible method can therefore fit the data closely without overfitting, and with the larger sample size it will generally perform better than an inflexible approach.
b) Worse.
If the number of predictors is large and the number of observations is small (i.e. p >> n), we do not have sufficient information about the effect and variation of each predictor. To estimate those effects reliably we would need more observations than predictors. In this setting a flexible method will overfit the small number of observations, so an inflexible method is generally expected to perform better.
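The overfitting in the p >> n case can be illustrated with a small simulation. This is only a sketch under stated assumptions: the sizes (n = 20, p = 100), the data-generating model in which only the first predictor matters, and the names `simulate` and `mse` are all made up for illustration. The "flexible" fit is least squares on all p predictors, and the "inflexible" fit uses a single predictor.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, p = 20, 5000, 100  # p >> n (illustrative sizes)

def simulate(n):
    """Draw n observations; by assumption only the first predictor affects y."""
    X = rng.normal(size=(n, p))
    y = 2.0 * X[:, 0] + rng.normal(size=n)
    return X, y

def mse(X, y, beta):
    """Mean squared prediction error of the linear model with coefficients beta."""
    return float(np.mean((y - X @ beta) ** 2))

X_train, y_train = simulate(n_train)
X_test, y_test = simulate(n_test)

# Flexible: least squares on all p predictors. With p > n, lstsq returns the
# minimum-norm solution, which reproduces the training responses (almost) exactly.
beta_flex, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Inflexible: least squares through the origin on the first predictor only.
beta_simple = np.zeros(p)
x0 = X_train[:, 0]
beta_simple[0] = (x0 @ y_train) / (x0 @ x0)

flex_train, flex_test = mse(X_train, y_train, beta_flex), mse(X_test, y_test, beta_flex)
simple_train, simple_test = mse(X_train, y_train, beta_simple), mse(X_test, y_test, beta_simple)

print(f"flexible:   train MSE = {flex_train:.3g}, test MSE = {flex_test:.3g}")
print(f"inflexible: train MSE = {simple_train:.3g}, test MSE = {simple_test:.3g}")
```

The flexible fit drives the training error to essentially zero, yet its test error should come out well above that of the one-predictor fit. That gap between training and test error is what overfitting looks like in practice.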
c) Better.
If the relationship between the predictors and response is highly non-linear, a flexible method, having more degrees of freedom, will fit the data better than an inflexible method.
d) Worse.
The flexible method will fit the noise in the error terms, which increases its variance. So the fitted model will generalize poorly.
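The effect of a large error variance can be checked by repeating the same fit at a low and a high noise level. The following is a rough Monte Carlo sketch, not a definitive experiment: the truth f(x) = sin(2x), the fixed design of 30 points, the degree-5 polynomial as the flexible method, and the two noise levels are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(-1.0, 1.0, 30)      # fixed design with n = 30 points
grid = np.linspace(-1.0, 1.0, 201)  # points at which fitted curves are compared to f

def f(x):
    """Hypothetical, mildly non-linear truth."""
    return np.sin(2.0 * x)

def avg_error(degree, sigma, reps=200):
    """Average squared distance between the fitted polynomial and f over many noisy samples."""
    errs = []
    for _ in range(reps):
        y = f(x) + rng.normal(scale=sigma, size=x.size)
        coefs = np.polyfit(x, y, degree)
        errs.append(np.mean((np.polyval(coefs, grid) - f(grid)) ** 2))
    return float(np.mean(errs))

low_linear, low_flex = avg_error(1, 0.1), avg_error(5, 0.1)
high_linear, high_flex = avg_error(1, 2.0), avg_error(5, 2.0)

print(f"sd(eps) = 0.1: linear = {low_linear:.4f}, degree-5 = {low_flex:.4f}")
print(f"sd(eps) = 2.0: linear = {high_linear:.4f}, degree-5 = {high_flex:.4f}")
```

With little noise the flexible fit should win, because the line's bias dominates. With heavy noise the ordering should flip: the extra coefficients mostly chase noise, and the added variance outweighs the reduction in bias.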