ParaGraph: Mapping Wikidata Tail Entities to Wikipedia Paragraphs

Natalia Ostapuk, Djellel Difallah, Philippe Cudre-Mauroux

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Bridging unstructured data with knowledge bases is an essential task in many problems related to natural language understanding. Traditionally, this task is considered in one direction only: linking entity mentions in a text to their counterparts in a knowledge base (also known as entity linking). In this paper, we propose to tackle this problem from a different angle: linking entities from a knowledge base to paragraphs describing those entities. We argue that such a new perspective can be beneficial to several applications, including information retrieval, knowledge base population, and joint entity and word embedding. We present a transformer-based model, ParaGraph, which, given a Wikidata entity as input, retrieves its corresponding Wikipedia section. To perform this task, ParaGraph first generates an entity summary and compares it to sections to select an initial set of candidates. The candidates are then ranked using additional information from the entity's textual description and contextual information. Our experimental results show that ParaGraph achieves 87% Hits@10 when ranking Wikipedia sections given a Wikidata entity as input. These results show that ParaGraph can reduce the information gap between Wikipedia-based entities and tail entities, and they demonstrate the effectiveness of our proposed approach to linking knowledge graph entities to their text counterparts.
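For readers unfamiliar with the Hits@10 figure quoted in the abstract, the following is a minimal sketch of how a Hits@k score is commonly computed: the fraction of query entities whose correct section appears among the top k ranked candidates. It is not the authors' evaluation code. The function name `hits_at_k`, the entity IDs, and the section IDs are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def hits_at_k(rankings: Dict[str, List[str]], gold: Dict[str, str], k: int = 10) -> float:
    """Return the fraction of entities whose gold section is in the top-k ranking.

    rankings maps an entity ID to its candidate sections, best first.
    gold maps an entity ID to the single correct section ID.
    """
    if not rankings:
        return 0.0
    hits = sum(
        1
        for entity, ranked in rankings.items()
        if gold.get(entity) in ranked[:k]
    )
    return hits / len(rankings)


# Toy data (hypothetical IDs, not taken from the paper).
rankings = {
    "Q1": ["s3", "s1", "s2"],
    "Q2": ["s5", "s4"],
    "Q3": ["s9", "s8", "s7"],
}
gold = {"Q1": "s1", "Q2": "s4", "Q3": "s6"}

score_at_1 = hits_at_k(rankings, gold, k=1)
score_at_2 = hits_at_k(rankings, gold, k=2)
```

In this toy data no entity has its gold section ranked first, while two of the three entities have it within the top two, so the score rises as k grows. A reported 87% Hits@10 would mean that, for 87% of the evaluated entities, the correct section was among the ten highest-ranked candidates.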

Original language: English (US)
Title of host publication: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022
Editors: Shusaku Tsumoto, Yukio Ohsawa, Lei Chen, Dirk Van den Poel, Xiaohua Hu, Yoichi Motomura, Takuya Takagi, Lingfei Wu, Ying Xie, Akihiro Abe, Vijay Raghavan
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6008-6017
Number of pages: 10
ISBN (Electronic): 9781665480451
State: Published - 2022
Event: 2022 IEEE International Conference on Big Data, Big Data 2022 - Osaka, Japan
Duration: Dec 17 2022 → Dec 20 2022

Publication series

Name: Proceedings - 2022 IEEE International Conference on Big Data, Big Data 2022

Conference

Conference: 2022 IEEE International Conference on Big Data, Big Data 2022
Country/Territory: Japan
City: Osaka
Period: 12/17/22 → 12/20/22

Keywords

  • Entity Linking
  • Knowledge Graphs
  • Linked Data

ASJC Scopus subject areas

  • Modeling and Simulation
  • Computer Networks and Communications
  • Information Systems
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Control and Optimization
