Structural sentence similarity estimation for short texts

Weicheng Ma, Torsten Suel

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Sentence similarity is the basis of most text-related tasks. In this paper, we define a new sentence similarity estimation task aimed specifically at short, informal, social-network-style sentences. The new type of sentence similarity, which we call Structural Similarity, discards syntactic and grammatical features such as dependency paths and Part-of-Speech (POS) tags, which are not sufficiently representative on short sentences. Structural Similarity also does not consider the actual meanings of the sentences; instead, it emphasizes similarities in sentence structure in order to discover purpose- or emotion-level similarities. The idea is based on the observation that people tend to use sentences with similar structures to express similar feelings. Besides the definition, we present a new feature set and a mechanism for calculating the scores and, to disambiguate word senses, we propose a variant of the Word2Vec model to represent words. We demonstrate the correctness and advantages of our sentence similarity measure through experiments.

    Original language: English (US)
    Title of host publication: Proceedings of the 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016
    Editors: Zdravko Markov, Ingrid Russell
    Publisher: AAAI Press
    Pages: 232-237
    Number of pages: 6
    ISBN (Electronic): 9781577357568
    State: Published - 2016
    Event: 29th International Florida Artificial Intelligence Research Society Conference, FLAIRS 2016 - Key Largo, United States
    Duration: May 16 2016 to May 18 2016


    ASJC Scopus subject areas

    • Software
    • Artificial Intelligence
    • Computer Networks and Communications
