The exploratory labeling assistant: Mixed-initiative label curation with large document collections

Cristian Felix, Aritra Dasgupta, Enrico Bertini

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

    Abstract

    In this paper, we define the concept of exploratory labeling: the use of computational and interactive methods to help analysts categorize groups of documents into a set of unknown and evolving labels. While many computational methods exist to analyze data and build models once the data is organized around a set of predefined categories or labels, few methods address the problem of reliably discovering and curating such labels in the first place. As a first step toward bridging this gap, we propose an interactive visual data analysis method that integrates human-driven label ideation, specification, and refinement with machine-driven recommendations. The proposed method enables the user to progressively discover and ideate labels in an exploratory fashion and to specify rules that can be used to automatically match sets of documents to labels. To support the ideation, specification, and evaluation of labels, we use unsupervised machine learning methods that provide suggestions and data summaries. We evaluate our method by applying it to a real-world labeling problem and through controlled user studies, in order to identify and reflect on patterns of interaction emerging from exploratory labeling activities.

    Original language: English (US)
    Title of host publication: UIST 2018 - Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology
    Publisher: Association for Computing Machinery, Inc
    Pages: 153-164
    Number of pages: 12
    ISBN (Electronic): 9781450359481
    State: Published - Oct 11 2018
    Event: 31st Annual ACM Symposium on User Interface Software and Technology, UIST 2018 - Berlin, Germany
    Duration: Oct 14 2018 to Oct 17 2018

    Keywords

    • Document labeling
    • Exploratory labeling
    • Text analysis
    • Visualization

    ASJC Scopus subject areas

    • Human-Computer Interaction
    • Computer Graphics and Computer-Aided Design
    • Software
