Comparing character-level neural language models using a lexical decision task

Gaël Le Godais, Tal Linzen, Emmanuel Dupoux

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    What is the information captured by neural network models of language? We address this question in the case of character-level recurrent neural language models. These models do not have explicit word representations; do they acquire implicit ones? We assess the lexical capacity of a network using the lexical decision task common in psycholinguistics: the system is required to decide whether or not a string of characters forms a word. We explore how accuracy on this task is affected by the architecture of the network, focusing on cell type (LSTM vs. SRN), depth and width. We also compare these architectural properties to a simple count of the parameters of the network. The overall number of parameters in the network turns out to be the most important predictor of accuracy; in particular, there is little evidence that deeper networks are beneficial for this task.
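    The probe described in the abstract scores a character string with the language model and decides whether it is a word. As a hedged illustration only: the paper uses character-level RNNs (LSTM/SRN) as the scorer, but the sketch below substitutes a smoothed character-bigram model so it stays self-contained; the function names (`train_bigram_lm`, `log_prob`, `lexical_decision`) and the thresholding scheme are assumptions for illustration, not the authors' implementation.

    ```python
    import math
    from collections import defaultdict

    def train_bigram_lm(words):
        """Count character bigrams (with boundary markers) over a small lexicon."""
        counts = defaultdict(lambda: defaultdict(int))
        for w in words:
            padded = "^" + w + "$"
            for a, b in zip(padded, padded[1:]):
                counts[a][b] += 1
        return counts

    def log_prob(counts, string, alpha=1.0):
        """Length-normalized, add-alpha smoothed log-probability of a string.

        Stands in for the per-character probabilities a char-level RNN would give."""
        padded = "^" + string + "$"
        vocab = set(counts) | {c for ctx in counts.values() for c in ctx}
        total = 0.0
        for a, b in zip(padded, padded[1:]):
            ctx = counts[a]
            total += math.log((ctx[b] + alpha) /
                              (sum(ctx.values()) + alpha * len(vocab)))
        return total / len(padded)

    def lexical_decision(counts, string, threshold):
        """Accept the string as a word if its normalized score clears a threshold."""
        return log_prob(counts, string) >= threshold
    ```

    A usage sketch: after training on a word list, a real word such as "cat" should score above an unpronounceable string such as "xqz", so a threshold between the two score distributions implements the word/nonword decision.
    
    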

    Original language: English (US)
    Title of host publication: Short Papers
    Publisher: Association for Computational Linguistics (ACL)
    Pages: 125-130
    Number of pages: 6
    ISBN (Electronic): 9781510838604
    DOIs
    State: Published - 2017
    Event: 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Valencia, Spain
    Duration: Apr 3, 2017 - Apr 7, 2017

    Publication series

    Name: 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017 - Proceedings of Conference
    Volume: 2

    Other

    Other: 15th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2017
    Country: Spain
    City: Valencia
    Period: 4/3/17 - 4/7/17

    ASJC Scopus subject areas

    • Linguistics and Language
    • Language and Linguistics

