TY - JOUR
T1 - Item Response Models for Multiple Attempts With Incomplete Data
AU - Bergner, Yoav
AU - Choi, Ikkyu
AU - Castellano, Katherine E.
N1 - Publisher Copyright:
© 2019 by the National Council on Measurement in Education
PY - 2019/6/1
Y1 - 2019/6/1
N2 - Allowing multiple chances to answer constructed-response questions is a prevalent feature of computer-based homework and exams. We consider the use of item response theory to estimate item characteristics and student ability when multiple attempts are allowed but no explicit penalty is deducted for extra tries. This is common practice in online formative assessments, where the number of attempts is often unlimited. In these environments, some students may not always answer-until-correct but may instead terminate the response process after one or more incorrect tries. We contrast graded and sequential item response models, both unidimensional models that do not explicitly account for factors other than ability. These approaches differ not only in their log-odds assumptions but also, importantly, in how they handle incomplete data. We explore the consequences of model misspecification through a simulation study and with four online homework data sets. Our results suggest that model selection is insensitive for complete data but quite sensitive to whether missing responses are regarded as informative (of inability) or not (e.g., missing at random). Under realistic conditions, a sequential model with parametric degrees of freedom similar to those of a graded model can account for more response patterns and outperforms the latter in terms of model fit.
UR - http://www.scopus.com/inward/record.url?scp=85066803646&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066803646&partnerID=8YFLogxK
U2 - 10.1111/jedm.12214
DO - 10.1111/jedm.12214
M3 - Article
AN - SCOPUS:85066803646
SN - 0022-0655
VL - 56
SP - 415
EP - 436
JO - Journal of Educational Measurement
JF - Journal of Educational Measurement
IS - 2
ER -