In dealing with large data sets, addressing missing values is an important step.

ID: 3352686 • Letter: I

Question

In dealing with large data sets, addressing missing values is an important step. However, some data sets contain variables with a large number of missing values; that is, the variable is missing in many rows of the data set. In such cases, dropping the variable with missing values leads to a significant loss of data. Imputing the missing values may also be unreliable, because the imputations would be based on a small number of observed records. What alternatives can you suggest when modeling from such data?

Explanation / Answer

General steps for analysis with missing data:

1) Identify the patterns of, and reasons for, missing values, and recode them correctly.

2) Understand the distribution of the missing data.

3) Decide on the best method of analysis.

Understanding the data (common reasons why values are missing):

• Attrition due to social or natural processes.

• Skip patterns in a survey.

• Intentional missingness as part of the data collection process.

• Random data collection issues.

• Respondent refusal / non-response.

Missing data mechanism, or the probability distribution of missingness:

Consider the probability of missingness:

Are certain groups more likely to have missing values?

Are certain responses more likely to be missing?
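As a rough illustration of the two questions above, the following sketch computes the share of missing values per variable and per group. The records, the column names (group, age, income) and the helper functions are hypothetical and use only the Python standard library; None stands in for a missing value.

```python
from collections import defaultdict

# Hypothetical toy records; None marks a missing value.
records = [
    {"group": "A", "age": 34, "income": 52000},
    {"group": "A", "age": 41, "income": None},
    {"group": "B", "age": None, "income": None},
    {"group": "B", "age": 29, "income": None},
    {"group": "B", "age": 55, "income": 61000},
    {"group": "A", "age": 47, "income": 58000},
]


def missing_rate(rows, column):
    """Fraction of rows in which `column` is missing (None)."""
    return sum(r[column] is None for r in rows) / len(rows)


def missing_rate_by_group(rows, column, group_key):
    """Missing rate of `column` within each level of `group_key`."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_key]].append(r)
    return {g: missing_rate(members, column) for g, members in groups.items()}


# How much is missing in each variable?
overall = {col: missing_rate(records, col) for col in ("age", "income")}

# Is one group more likely than another to have missing income?
by_group = missing_rate_by_group(records, "income", "group")

print(overall)
print(by_group)
```

In this toy data, income is missing in half of the rows, and group B is missing it twice as often as group A, which would suggest that missingness is related to group membership.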

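Types of missing data mechanism:

1) Missing completely at random (MCAR):

The probability of a missing value does not depend on any values, observed or unobserved.

2) Missing at random (MAR):

The probability of a missing value depends only on observed values of other variables, not on the value that is missing.

3) Missing not at random (MNAR):

The probability of a missing value depends on the unobserved value of the variable that is missing.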
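The difference between the three mechanisms can be seen in a small simulation. This is only a sketch under assumed settings: the variables x and y, the sample size, and the missingness probabilities (0.5, 0.8, 0.2) are made up for illustration. x is always observed, y may be missing, and the mean of the observed y values is compared with the mean of the full data.

```python
import random
import statistics

random.seed(0)
n = 20000

# x is always observed; y is correlated with x and may go missing.
x = [random.gauss(0, 1) for _ in range(n)]
y = [xi + random.gauss(0, 1) for xi in x]


def observed_mean(values, is_missing):
    """Mean of the values that are not flagged as missing."""
    return statistics.fmean([v for v, m in zip(values, is_missing) if not m])


def stratified_mean(values, strata, is_missing):
    """Observed mean within each stratum, weighted by full stratum size."""
    total = 0.0
    for level in set(strata):
        idx = [i for i, s in enumerate(strata) if s == level]
        obs = [values[i] for i in idx if not is_missing[i]]
        total += len(idx) / len(values) * statistics.fmean(obs)
    return total


# MCAR: every y has the same 50% chance of being missing.
mcar = [random.random() < 0.5 for _ in range(n)]
# MAR: the chance that y is missing depends only on the observed x.
mar = [random.random() < (0.8 if xi > 0 else 0.2) for xi in x]
# MNAR: the chance that y is missing depends on y itself.
mnar = [random.random() < (0.8 if yi > 0 else 0.2) for yi in y]

full_mean = statistics.fmean(y)
means = {
    "MCAR": observed_mean(y, mcar),
    "MAR": observed_mean(y, mar),
    "MNAR": observed_mean(y, mnar),
}

# Under MAR, conditioning on the observed x (here: its sign) removes the bias.
strata = [xi > 0 for xi in x]
mar_adjusted = stratified_mean(y, strata, mar)
mnar_adjusted = stratified_mean(y, strata, mnar)

print(full_mean, means, mar_adjusted, mnar_adjusted)
```

Under MCAR the observed mean stays close to the full-data mean. Under MAR the unadjusted observed mean is biased, but adjusting for x recovers it. Under MNAR the same adjustment does not remove the bias, because the missingness depends on values that are never seen.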

Exploring the missing data mechanism:

The mechanism cannot be determined with certainty from the observed data alone.

Methods that model the missingness explicitly include:

Selection models.

Pattern mixture models.
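One simple way to use the pattern mixture idea is a sensitivity analysis: assume that the missing cases differ from the observed cases by a shift delta, and see how the overall estimate moves as delta changes. The sketch below is a minimal illustration under that assumption; the function name, the data, and the delta values are hypothetical, and delta itself cannot be estimated from the observed data.

```python
def pattern_mixture_mean(observed, n_missing, delta):
    """Overall mean when the missing pattern is assumed to have
    mean = (observed mean + delta).

    delta is an untestable assumption chosen by the analyst;
    delta = 0 corresponds to missing cases looking like observed cases.
    """
    n_obs = len(observed)
    mean_obs = sum(observed) / n_obs
    mean_mis = mean_obs + delta
    # Mix the two patterns in proportion to their sizes.
    return (n_obs * mean_obs + n_missing * mean_mis) / (n_obs + n_missing)


# Hypothetical data: 4 observed values and 4 missing values.
observed = [10, 12, 14, 16]
n_missing = 4

# Vary delta to see how sensitive the estimate is to the assumption.
results = {d: pattern_mixture_mean(observed, n_missing, d) for d in (-2, 0, 2)}
print(results)
```

If the conclusions of the analysis hold across a plausible range of delta values, they are less likely to be an artifact of the missing data mechanism.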

Dealing with missing data:

Use what you know about the reasons for missingness and the missing data mechanism.

Decide on the analysis strategy that yields the least biased estimates:

Deletion methods.

Single imputation methods.

Model-based methods.
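The three families above can be compared on a toy example. This is a sketch with made-up data, not a recommendation: x is fully observed, y is missing for the larger values of x, and the model-based estimate shown is a simple regression of y on x fitted to the complete cases (one of several possible model-based approaches).

```python
import statistics

# Hypothetical toy data: x is fully observed, y is missing (None) for large x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, None, None, None]

x_obs = [xi for xi, yi in zip(x, y) if yi is not None]
y_obs = [yi for yi in y if yi is not None]

# 1) Deletion: use only the rows where y is observed.
deletion_mean = statistics.fmean(y_obs)

# 2) Single imputation: replace each missing y with the observed mean.
y_mean_imputed = [deletion_mean if yi is None else yi for yi in y]
mean_imputed_mean = statistics.fmean(y_mean_imputed)
# Mean imputation leaves the mean unchanged but shrinks the variance.
var_observed = statistics.variance(y_obs)
var_mean_imputed = statistics.variance(y_mean_imputed)

# 3) Model-based: fit y = intercept + slope * x on the complete cases,
#    then predict y where it is missing.
x_bar = statistics.fmean(x_obs)
slope = sum((xi - x_bar) * (yi - deletion_mean) for xi, yi in zip(x_obs, y_obs)) / sum(
    (xi - x_bar) ** 2 for xi in x_obs
)
intercept = deletion_mean - slope * x_bar
y_model = [intercept + slope * xi if yi is None else yi for xi, yi in zip(x, y)]
model_mean = statistics.fmean(y_model)

print(deletion_mean, mean_imputed_mean, model_mean)
print(var_observed, var_mean_imputed)
```

Here deletion and mean imputation both give a mean of 4.0, while the regression-based estimate is 7.0, because it uses the observed x values of the incomplete rows. Mean imputation also reduces the sample variance from 4.0 to 1.6, which understates the uncertainty in the data.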
