Contrastive Learning to Improve Retrieval for Real-world Fact Checking

Aniruddh Sriram, Fangyuan Xu, Eunsol Choi, Greg Durrett

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent work on fact-checking addresses a realistic setting where models incorporate evidence retrieved from the web to decide the veracity of claims. A bottleneck in this pipeline is retrieving relevant evidence: traditional methods may surface documents directly related to a claim, but fact-checking complex claims requires more inferences. For instance, a document about how a vaccine was developed is relevant to addressing claims about what it might contain, even if it does not address them directly. We present Contrastive Fact-Checking Reranker (CFR), an improved retriever for this setting. By leveraging the AVeriTeC dataset, which annotates subquestions for claims with human-written answers from evidence documents, we fine-tune Contriever with a contrastive objective based on multiple training signals, including distillation from GPT-4, evaluating subquestion answers, and gold labels in the dataset. We evaluate our model on both retrieval and end-to-end veracity judgments about claims. On the AVeriTeC dataset, we find a 6% improvement in veracity classification accuracy. We also show that our gains transfer to FEVER, ClaimDecomp, HotpotQA, and a synthetic dataset requiring retrievers to make inferences.
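
The abstract mentions fine-tuning with a contrastive objective but does not spell it out. As a rough illustration only, the sketch below computes a generic InfoNCE-style loss for one query, one positive document, and several negatives. It is not confirmed to be the exact objective used for CFR: the function name `info_nce_loss`, the `temperature` value, and the toy two-dimensional vectors (standing in for retriever embeddings) are all assumptions made for this example.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce_loss(query, positive, negatives, temperature=0.05):
    """Generic InfoNCE-style loss for a single query (illustrative only).

    Returns -log of the softmax probability assigned to the positive
    document among the positive and all negatives.
    """
    # Similarity of the query to the positive (index 0) and each negative.
    scores = [dot(query, positive)] + [dot(query, n) for n in negatives]
    logits = [s / temperature for s in scores]

    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))

    return log_denominator - logits[0]


# Toy embeddings (hypothetical): the query is aligned with doc_a.
query = [1.0, 0.0]
doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]
doc_c = [-1.0, 0.0]

loss_good = info_nce_loss(query, doc_a, [doc_b, doc_c])  # positive matches query
loss_bad = info_nce_loss(query, doc_b, [doc_a, doc_c])   # positive does not match
```

In this toy setup the loss is small when the labeled positive is the document closest to the query and large otherwise, which is the signal that pushes a retriever to rank relevant evidence above irrelevant documents.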

Original language: English (US)
Title of host publication: FEVER 2024 - 7th Fact Extraction and VERification Workshop, Proceedings of the Workshop
Editors: Michael Schlichtkrull, Yulong Chen, Chenxi Whitehouse, Zhenyun Deng, Mubashara Akhtar, Rami Aly, Zhijiang Guo, Christos Christodoulopoulos, Oana Cocarascu, Arpit Mittal, James Thorne, Andreas Vlachos
Publisher: Association for Computational Linguistics (ACL)
Pages: 264-279
Number of pages: 16
ISBN (Electronic): 9798891761728
State: Published - 2024
Event: 7th Fact Extraction and VERification Workshop, FEVER 2024 - Miami, United States
Duration: Nov 15 2024 → …

Publication series

Name: FEVER 2024 - 7th Fact Extraction and VERification Workshop, Proceedings of the Workshop

Conference

Conference: 7th Fact Extraction and VERification Workshop, FEVER 2024
Country/Territory: United States
City: Miami
Period: 11/15/24 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Artificial Intelligence
  • Software
  • Linguistics and Language
