SEER: Auto-generating information extraction rules from user-specified examples

Maeda F. Hanafi, Azza Abouzied, Laura Chiticariu, Yunyao Li

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Time-consuming and complicated best describe the current state of the Information Extraction (IE) field. Machine learning approaches to IE require large collections of labeled datasets that are difficult to create and use obscure mathematical models, occasionally returning unwanted results that are unexplainable. Rule-based approaches, while resulting in easy-to-understand IE rules, are still time-consuming and labor-intensive. SEER combines the best of these two approaches: a learning model for IE rules based on a small number of user-specified examples. In this paper, we explain the design behind SEER and present a user study comparing our system against a commercially available tool in which users create IE rules manually. Our results show that SEER helps users complete text extraction tasks more quickly, as well as more accurately.

Original language: English (US)
Title of host publication: CHI 2017 - Proceedings of the 2017 ACM SIGCHI Conference on Human Factors in Computing Systems
Subtitle of host publication: Explore, Innovate, Inspire
Publisher: Association for Computing Machinery
Pages: 6672-6682
Number of pages: 11
ISBN (Electronic): 9781450346559
State: Published - May 2 2017
Event: 2017 ACM SIGCHI Conference on Human Factors in Computing Systems, CHI 2017 - Denver, United States
Duration: May 6 2017 to May 11 2017

Publication series

Name: Conference on Human Factors in Computing Systems - Proceedings
Volume: 2017-May

Other

Other: 2017 ACM SIGCHI Conference on Human Factors in Computing Systems, CHI 2017
Country/Territory: United States
City: Denver
Period: 5/6/17 to 5/11/17

Keywords

  • Data extraction
  • Example-driven learning

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Graphics and Computer-Aided Design
