Question
The topic for discussion in the Forum of Unit 6 was mathematical models and how well they should fit reality. In this Forum we would like to return to the same topic, but consider it specifically in the context of statistical models.
Some statisticians prefer complex models, models that try to fit the data as closely as possible. Others prefer simple models. They claim that although simpler models are further from the data, they are easier to interpret and thus provide more insight.
What do you think? Which type of model is best to use?
When formulating your answer to this question, you may think of a situation that involves inference that you carry out and need to present to other people. Would the consumers of your analysis benefit more from your having used a complex model or from your having used a simpler model? What would be the best way to report your findings and explain them to the consumers?
Explanation / Answer
I think the preference for complex or simple models depends on the purpose of the study. For example, if we are analyzing data for policy makers who need simple answers, such as whether the death rate in North Carolina has increased over time or whether more gun laws can reduce the number of homicides, we should aim for simple models that answer these questions directly.
On the other hand, if we are trying to understand the mechanism behind a random variable and build a predictive model, I think a more complex model should be preferred, since in nature nothing is static and processes do not work linearly. For example, if we want to classify customers as good or bad borrowers based on their account and credit details, many features should be included in the model, and a complex classification method such as a random forest can then be used to get the best classifier, which in the long run gives us a risk assessment of the customers.
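To make the prediction argument concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the process y = sin(x) plus noise, the sample sizes, and the use of k-nearest-neighbour averaging as a simple stand-in for a flexible method (it is not a random forest). It compares a straight-line fit with the flexible fit on held-out data.

```python
import math
import random

def make_data(n, rng):
    """Simulate an assumed nonlinear process: y = sin(x) + Gaussian noise."""
    xs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    ys = [math.sin(x) + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Ordinary least squares for the simple model y = a + b*x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def knn_predict(x0, xs, ys, k=5):
    """Flexible model: average the responses of the k training points nearest to x0."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x0))[:k]
    return sum(y for _, y in nearest) / k

def mse(preds, ys):
    """Mean squared prediction error."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)

rng = random.Random(0)
x_train, y_train = make_data(200, rng)
x_test, y_test = make_data(200, rng)  # held-out data, not used for fitting

a, b = fit_line(x_train, y_train)
linear_mse = mse([a + b * x for x in x_test], y_test)
knn_mse = mse([knn_predict(x, x_train, y_train) for x in x_test], y_test)

print(f"linear test MSE: {linear_mse:.3f}")
print(f"k-NN   test MSE: {knn_mse:.3f}")
```

On a process this nonlinear the flexible model should predict better, but the straight line still gives a one-number summary (the slope) that is easy to report, which is the interpretability side of the argument.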
In practice there should be a balance between simple and complex models; the preference for the simplest model that still fits the data adequately is called parsimony. This is why people in statistics now use methods such as the LASSO: you can keep adding candidate features to the model, but a penalty on the total absolute size of the coefficients shrinks some of them exactly to zero, so the number of features that effectively remain in the model stays limited.
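The penalty idea can be sketched in a few lines. The following is an illustrative coordinate-descent LASSO, not a production implementation; the simulated data, the function names, and the penalty values are all assumptions. It minimizes (1/(2n)) * sum of squared residuals + lam * sum of |beta_j|, with no intercept and features that are roughly standardized.

```python
import random

def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Cycle through the coefficients, updating one at a time with the others held fixed."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean square of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta

# Assumed toy data: six candidate features, only the first two affect y.
rng = random.Random(1)
n, p = 100, 6
X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]

for lam in (0.0, 0.3, 10.0):
    beta = lasso_coordinate_descent(X, y, lam)
    kept = sum(1 for coef in beta if coef != 0.0)
    print(f"lam={lam}: kept {kept} of {p} features ->", [round(coef, 2) for coef in beta])
```

With lam = 0 this reduces to ordinary least squares and every feature keeps a nonzero coefficient; as lam grows, more coefficients are set exactly to zero. In practice lam is usually chosen by cross-validation rather than fixed by hand.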
Hope this helps.