Multiple imputation is a widely recommended means of addressing the problem of missing data in psychological research. An often-neglected requirement of this approach is that the imputation model used to generate the imputed values must be at least as general as the analysis model. For multilevel designs in which lower level units (e.g., students) are nested within higher level units (e.g., classrooms), this means that the multilevel structure must be taken into account in the imputation model. In the present article, we compare different strategies for multiply imputing incomplete multilevel data using mathematical derivations and computer simulations. We show that ignoring the multilevel structure in the imputation may lead to substantial negative bias in estimates of intraclass correlations as well as biased estimates of regression coefficients in multilevel models. We also demonstrate that an ad hoc strategy that includes dummy indicators in the imputation model to represent the multilevel structure may be problematic under certain conditions (e.g., small groups, low intraclass correlations). Imputation based on a multivariate linear mixed effects model was the only strategy to produce valid inferences under most of the conditions investigated in the simulation study. Data from an educational psychology research project are also used to illustrate the impact of the various multiple imputation strategies.
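The negative bias in intraclass correlations described above can be illustrated with a small simulation. What follows is a minimal sketch under stated assumptions, not the simulation design used in the article: the number of groups, group size, true ICC, missingness rate, and seed are all hypothetical, missingness is completely at random, and the "single-level" imputation is a simplified stochastic draw from the pooled observed distribution (it skips the parameter-draw step of a full multiple imputation procedure). The helper name `anova_icc` is likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(2017)  # arbitrary seed, for reproducibility only


def anova_icc(y):
    """One-way ANOVA estimate of the ICC for a balanced (groups x members) array."""
    k, n = y.shape
    group_means = y.mean(axis=1)
    msb = n * np.sum((group_means - y.mean()) ** 2) / (k - 1)     # between-group mean square
    msw = np.sum((y - group_means[:, None]) ** 2) / (k * (n - 1))  # within-group mean square
    return (msb - msw) / (msb + (n - 1) * msw)


# Hypothetical design: 200 classrooms of 20 students, true ICC = 0.3.
n_groups, group_size, true_icc = 200, 20, 0.3
u = rng.normal(0.0, np.sqrt(true_icc), size=(n_groups, 1))               # classroom effects
e = rng.normal(0.0, np.sqrt(1 - true_icc), size=(n_groups, group_size))  # student residuals
y_full = u + e
icc_full = anova_icc(y_full)

# Delete 50% of the values completely at random.
missing = rng.random(y_full.shape) < 0.5
observed = y_full[~missing]

# Single-level imputation: draw from the pooled observed distribution,
# ignoring which classroom each missing value belongs to.
iccs = []
for _ in range(5):  # five imputed data sets
    y_imp = y_full.copy()
    y_imp[missing] = rng.normal(observed.mean(), observed.std(ddof=1),
                                size=missing.sum())
    iccs.append(anova_icc(y_imp))
icc_imputed = float(np.mean(iccs))

print(f"ICC, complete data:          {icc_full:.3f}")
print(f"ICC, single-level imputation: {icc_imputed:.3f}")
```

Because the imputed values carry no shared classroom component, the ICC estimated from the completed data should fall well below the full-data estimate. An imputation model that includes random classroom effects is designed to avoid this attenuation.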

Original language | English |
---|---|
Journal | Psychological Methods |
Volume | 22 |
Issue | 1 |
Pages (from-to) | 141-165 |
Number of pages | 25 |
ISSN | 1082-989X |
State | Published - 2017 |

**Methodological research and development**: missing data, multiple imputation, multilevel modeling, multilevel data, intraclass correlation

ID: 622857