Gathering and Generating Paraphrases from Twitter with Application to Normalization

Wei Xu, Alan Ritter, Ralph Grishman

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present a new and unique paraphrase resource, which contains meaning-preserving transformations between informal user-generated text. Sentential paraphrases are extracted from a comparable corpus of temporally and topically related messages on Twitter, which often express semantically identical information through distinct surface forms. We demonstrate the utility of this new resource on the task of paraphrasing and normalizing noisy text, showing improvement over several state-of-the-art paraphrase and normalization systems.

Original language: English (US)
Title of host publication: 6th Workshop on Building and Using Comparable Corpora, BUCC 2013 at the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 - Proceedings
Editors: Serge Sharoff, Pierre Zweigenbaum, Reinhard Rapp
Publisher: Association for Computational Linguistics (ACL)
Pages: 121-128
Number of pages: 8
ISBN (Electronic): 9781937284602
State: Published - 2013
Event: 6th Workshop on Building and Using Comparable Corpora, BUCC 2013 at the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013 - Sofia, Bulgaria
Duration: Aug 8 2013 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics
ISSN (Print): 0736-587X

Conference

Conference: 6th Workshop on Building and Using Comparable Corpora, BUCC 2013 at the 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013
Country/Territory: Bulgaria
City: Sofia
Period: 8/8/13 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
