Standard

Causal inference with multilevel data: A comparison of different propensity score weighting approaches. / Fuentes, Alvaro; Lüdtke, Oliver; Robitzsch, Alexander.

In: Multivariate Behavioral Research, 15.06.2021.

Research output: Contribution to journal › Journal article › Research › peer-review

Author

Fuentes, Alvaro; Lüdtke, Oliver; Robitzsch, Alexander. / Causal inference with multilevel data: A comparison of different propensity score weighting approaches. In: Multivariate Behavioral Research. 2021.

BibTeX

@article{53426581069c45e4bb5af9b592547d25,
title = "Causal inference with multilevel data: A comparison of different propensity score weighting approaches",
abstract = "Propensity score methods are a widely recommended approach to adjust for confounding and to recover treatment effects with non-experimental, single-level data. This article reviews propensity score weighting estimators for multilevel data in which individuals (level 1) are nested in clusters (level 2) and nonrandomly assigned to either a treatment or control condition at level 1. We address the choice of a weighting strategy (inverse probability weights, trimming, overlap weights, calibration weights) and discuss key issues related to the specification of the propensity score model (fixed-effects model, multilevel random-effects model) in the context of multilevel data. In three simulation studies, we show that estimates based on calibration weights, which prioritize balancing the sample distribution of level-1 and (unmeasured) level-2 covariates, should be preferred under many scenarios (i.e., treatment effect heterogeneity, presence of strong level-2 confounding) and can accommodate covariate-by-cluster interactions. However, when level-1 covariate effects vary strongly across clusters (i.e., under random slopes), and this variation is present in both the treatment and outcome data-generating mechanisms, large cluster sizes are needed to obtain accurate estimates of the treatment effect. We also discuss the implementation of survey weights and present a real-data example that illustrates the different methods.",
keywords = "Methodological research and method development, Causal inference, propensity scores, multilevel data, weighting, calibration weights",
author = "Alvaro Fuentes and Oliver L{\"u}dtke and Alexander Robitzsch",
year = "2021",
month = jun,
day = "15",
doi = "10.1080/00273171.2021.1925521",
language = "English",
journal = "Multivariate Behavioral Research",
issn = "0027-3171",
publisher = "Taylor and Francis Ltd.",
}

RIS

TY - JOUR

T1 - Causal inference with multilevel data

T2 - A comparison of different propensity score weighting approaches

AU - Fuentes, Alvaro

AU - Lüdtke, Oliver

AU - Robitzsch, Alexander

PY - 2021/6/15

Y1 - 2021/6/15

AB - Propensity score methods are a widely recommended approach to adjust for confounding and to recover treatment effects with non-experimental, single-level data. This article reviews propensity score weighting estimators for multilevel data in which individuals (level 1) are nested in clusters (level 2) and nonrandomly assigned to either a treatment or control condition at level 1. We address the choice of a weighting strategy (inverse probability weights, trimming, overlap weights, calibration weights) and discuss key issues related to the specification of the propensity score model (fixed-effects model, multilevel random-effects model) in the context of multilevel data. In three simulation studies, we show that estimates based on calibration weights, which prioritize balancing the sample distribution of level-1 and (unmeasured) level-2 covariates, should be preferred under many scenarios (i.e., treatment effect heterogeneity, presence of strong level-2 confounding) and can accommodate covariate-by-cluster interactions. However, when level-1 covariate effects vary strongly across clusters (i.e., under random slopes), and this variation is present in both the treatment and outcome data-generating mechanisms, large cluster sizes are needed to obtain accurate estimates of the treatment effect. We also discuss the implementation of survey weights and present a real-data example that illustrates the different methods.

KW - Methodological research and method development

KW - Causal inference

KW - propensity scores

KW - multilevel data

KW - weighting

KW - calibration weights

U2 - 10.1080/00273171.2021.1925521

DO - 10.1080/00273171.2021.1925521

M3 - Journal article

JO - Multivariate Behavioral Research

JF - Multivariate Behavioral Research

SN - 0027-3171

ER -

ID: 1616226