One of the primary goals of international large-scale assessments (ILSAs) in education is the comparison of country means in student achievement. The present article introduces a framework for discussing differential item functioning (DIF) for country comparisons in ILSAs. Three different linking methods are compared: concurrent calibration based on full invariance, concurrent calibration based on partial invariance using the MD or RMSD statistics, and separate calibration with subsequent nonrobust and robust linking approaches. Furthermore, we analytically derive the bias in country means under the different linking methods in the presence of DIF. In a simulation study, we show that the partial invariance and robust linking approaches provide less biased country mean estimates than the full invariance approach when some items are biased. Some guidelines are derived for the selection of cutoff values for the MD and RMSD statistics in the partial invariance approach.
Journal: Psychological Test and Assessment Modeling
Pages (from - to): 233-279
Publication status: Published - 06.2020
ID: 1400685