What Makes Reading Comprehension Questions Difficult?

Saku Sugawara, Nikita Nangia, Alex Warstadt, Samuel R. Bowman

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution


    For a natural language understanding benchmark to be useful in research, it has to consist of examples that are diverse and difficult enough to discriminate among current and near-future state-of-the-art systems. However, we do not yet know how best to select text sources to collect a variety of challenging examples. In this study, we crowdsource multiple-choice reading comprehension questions for passages taken from seven qualitatively distinct sources, analyzing what attributes of passages contribute to the difficulty and question types of the collected examples. To our surprise, we find that passage source, length, and readability measures do not significantly affect question difficulty. Through our manual annotation of seven reasoning types, we observe several trends between passage sources and reasoning types, e.g., logical reasoning is more often required in questions written for technical passages. These results suggest that when creating a new benchmark dataset, selecting a diverse set of passages can help ensure a diverse range of question types, but that passage difficulty need not be a priority.

    Original language: English (US)
    Title of host publication: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
    Editors: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
    Publisher: Association for Computational Linguistics (ACL)
    Number of pages: 21
    ISBN (Electronic): 9781955917216
    State: Published - 2022
    Event: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
    Duration: May 22 2022 to May 27 2022

    Publication series

    Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
    ISSN (Print): 0736-587X


    Conference: 60th Annual Meeting of the Association for Computational Linguistics, ACL 2022

    ASJC Scopus subject areas

    • Computer Science Applications
    • Linguistics and Language
    • Language and Linguistics

