Educational large-scale assessments often employ a variety of response formats. An advantage of multiple-choice (MC) items over constructed-response (CR) items is that they can be scored with objective scoring rules. In state-wide educational assessments, responses are often scored by teachers on site rather than centrally by trained raters. The present study investigates format-specific differential validity using a mathematics competency assessment in the domain of numbers and operations. The analysis is based on a subsample of the “Lernstand 5” pilot study 2016 (n = 1205 fourth-graders). Format-specific scales were created from MC items and short-answer (CR) items, respectively, with the same number of items per scale and comparable reliabilities. Both format-specific scales showed good criterion validity with the mathematics grade (rMC = .57; rCR = .60). The validity of the short-answer scale did not differ significantly from that of the MC scale (Δr = 0.03, p = .15). These results support the conclusion that teacher-scored assessments using either response format can yield valid measurement in educational large-scale assessments.
Translated title of the contribution: Psychometric properties of multiple-choice and constructed response formats in proficiency tests
Original language: German
Journal: Psychologie in Erziehung und Unterricht
Issue number: 4
Pages (from-to): 260-272
Number of pages: 13
Publication status: Published - 10.2019

Research areas: response format, multiple-choice items, mathematics proficiency, validity, elementary school
