Seeking the real item difficulty: bias-corrected item difficulty and some consequences in Rasch and IRT modeling
Author: Metsämuuronen Jari
Publisher: Behaviormetric Society of Japan
Publication year: 2022
Journal: Behaviormetrika
ISSN: 1349-6964
DOI: https://doi.org/10.1007/s41237-022-00169-9
URL: https://link.springer.com/article/10.1007/s41237-022-00169-9
Research portal record: https://research.utu.fi/converis/portal/detail/Publication/176017020
When the response pattern in a test item deviates from the deterministic pattern, the percentage of correct answers (p) is shown to be a biased estimator of the latent item difficulty (π). This is particularly true for items of medium difficulty. Four elements of impurity in p are formalized for the binary setting, and four new estimators of π are proposed and studied. Algebraic arguments and a simulation suggest that, except in the case of deterministic item discrimination, the real item difficulty is almost always more extreme than p indicates. This tendency of p to be biased toward a medium level of item difficulty has a strict consequence for item response theory (IRT) and Rasch modeling: because the classical estimator of item difficulty, p, is a biased estimator of the latent difficulty level, the item parameters A and B and the person parameter θ within IRT modeling are, consequently, biased estimators of item discrimination, item difficulty, and the ability levels of the test takers.
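The direction of the bias described in the abstract can be illustrated with a small numerical sketch. This is not the article's formalization, and it does not implement any of its four proposed estimators. It is a toy model under stated assumptions: abilities follow a standard normal distribution, the "deterministic" difficulty is taken to be the share of test takers whose ability exceeds a threshold `b`, and non-deterministic responses follow a logistic curve with discrimination `a`. The function names and the parameters `a` and `b` are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def logistic(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def deterministic_difficulty(b):
    """Share of correct answers if everyone with ability above b succeeds
    and everyone below fails (assumed stand-in for the latent difficulty)."""
    return 1.0 - normal_cdf(b)


def expected_p(a, b, lo=-8.0, hi=8.0, steps=4000):
    """Expected proportion correct when responses follow a logistic curve
    with discrimination a and threshold b, averaged over standard normal
    abilities with a midpoint-rule integration."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        theta = lo + (i + 0.5) * h
        total += logistic(a * (theta - b)) * normal_pdf(theta) * h
    return total


if __name__ == "__main__":
    # Compare the observed proportion with the assumed deterministic value
    # for several thresholds and discrimination levels.
    for b in (-1.5, -0.5, 0.0, 0.5, 1.5):
        pi_det = deterministic_difficulty(b)
        for a in (0.5, 1.0, 2.0, 50.0):
            p = expected_p(a, b)
            print(f"b={b:+.1f} a={a:5.1f} pi={pi_det:.4f} p={p:.4f}")
```

In this toy setup the expected proportion correct lies between the deterministic value and 0.5 whenever discrimination is finite, and it approaches the deterministic value as `a` grows large, which mirrors the abstract's exception for deterministic item discrimination.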