Research question

This paper examines the overarching question of to what extent different analytic choices influence inferences about country-specific cross-sectional and trend estimates in international large-scale assessments. As a case study, we use data from the PISA mathematics assessments of the four rounds from 2003 to 2012.


In particular, four key methodological factors are considered as analytic choices in the rescaling and analysis of the data: (1) the selection of country sub-samples for item calibration, varied at three levels; (2) the item sample, referring to two sets of mathematics items used within PISA; (3) the estimation method used for item calibration, either marginal maximum likelihood estimation as implemented in the R package TAM or a pairwise row-averaging approach as implemented in the R package pairwise; and (4) the type of linking method, either concurrent calibration or separate calibration with successive chain linking.
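Crossing these four factors yields 3 × 2 × 2 × 2 = 24 rescaling specifications. A minimal sketch of this design grid, with illustrative level labels that are assumptions rather than the paper's own terminology:

```python
from itertools import product

# Hypothetical labels for the factor levels described above; only the number
# of levels per factor (3, 2, 2, 2) is taken from the study design itself.
calibration_samples = ["sample_A", "sample_B", "sample_C"]  # country sub-samples, 3 levels
item_samples = ["item_set_1", "item_set_2"]                 # two PISA mathematics item sets
estimators = ["MML_TAM", "pairwise"]                        # MML (TAM) vs. pairwise row averaging
linking_methods = ["concurrent", "separate_chain"]          # concurrent vs. chain linking

# Full factorial crossing of the analytic choices.
configurations = list(product(calibration_samples, item_samples,
                              estimators, linking_methods))
print(len(configurations))  # 24 rescaling specifications in total
```

Each tuple in `configurations` identifies one complete analysis pipeline whose country-level estimates can then be compared across specifications.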


The analytic decisions made for scaling did affect the PISA outcomes. The choice of calibration sample, estimation method, and linking method tended to show only small effects on the country-specific cross-sectional and trend estimates. However, the selection of link items had a decisive influence on country rankings and on developmental trends between and within countries.
Original language: English
Article number: 10
Journal: Large-scale Assessments in Education
Number of pages: 29
Publication status: Published - 27.08.2022

    Research areas

  • Methodological research and method development - Large-scale assessment, PISA, Mathematics item sampling, Trend estimate, Estimation method, Linking, Scaling, Cross-sectional estimate
