Generalization Measures for Zero-Shot Cross-Lingual Transfer

Saksham Bassi, Duygu Ataman, Kyunghyun Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Building robust and reliable machine learning systems requires models with the capacity to generalize their knowledge to interpret unseen inputs with different characteristics. Traditional language model evaluation tasks lack informative metrics about model generalization, and a model's applicability in new settings is often measured using task- and language-specific downstream performance, which is unavailable for many languages and tasks. To address this gap, we explore a set of efficient and reliable measures that could provide more information about the generalization capability of language models, particularly in cross-lingual zero-shot settings. Our central hypothesis is that the sharpness of a model’s loss landscape, i.e., the representation of loss values over its weight space, can indicate its generalization potential, with a flatter landscape suggesting better generalization. We propose a novel and stable algorithm to reliably compute the sharpness of a model optimum, and demonstrate its correlation with successful cross-lingual transfer.
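
To make the notion of sharpness concrete, here is a minimal sketch of a generic perturbation-based proxy: the average increase in loss when the weights are moved a fixed distance in random directions. This is not the algorithm proposed in the paper, which the abstract does not specify. The function name `sharpness_proxy`, its parameters, and the two toy quadratic losses are assumptions made purely for illustration.

```python
import math
import random


def sharpness_proxy(loss_fn, weights, radius=0.1, n_samples=200, seed=0):
    """Mean loss increase over random perturbations of norm `radius`.

    Illustrative proxy only (an assumption, not the paper's method).
    """
    rng = random.Random(seed)
    base = loss_fn(weights)
    total = 0.0
    for _ in range(n_samples):
        # Draw a random direction and rescale it onto the sphere of the given radius.
        direction = [rng.gauss(0.0, 1.0) for _ in weights]
        norm = math.sqrt(sum(d * d for d in direction))
        perturbed = [w + radius * d / norm for w, d in zip(weights, direction)]
        total += loss_fn(perturbed) - base
    return total / n_samples


def sharp_loss(w):
    # Toy loss with high curvature around its optimum at the origin.
    return sum(50.0 * x * x for x in w)


def flat_loss(w):
    # Toy loss with low curvature around the same optimum.
    return sum(0.5 * x * x for x in w)


optimum = [0.0, 0.0, 0.0]
sharp_value = sharpness_proxy(sharp_loss, optimum)  # about 0.5
flat_value = sharpness_proxy(flat_loss, optimum)    # about 0.005
```

Both toy losses share the same optimum, but the proxy is about a hundred times larger for the high-curvature one. Under the hypothesis stated in the abstract, the flatter optimum would be the one expected to generalize better.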

Original language: English (US)
Title of host publication: MRL 2024 - 4th Workshop on Multilingual Representation Learning, Proceedings of the Workshop
Editors: Jonne Saleva, Abraham Owodunni
Publisher: Association for Computational Linguistics (ACL)
Pages: 298-309
Number of pages: 12
ISBN (Electronic): 9798891761841
State: Published - 2024
Event: 4th Workshop on Multilingual Representation Learning, MRL 2024 - Miami, United States
Duration: Nov 16 2024 → …

Publication series

Name: MRL 2024 - 4th Workshop on Multilingual Representation Learning, Proceedings of the Workshop

Conference

Conference: 4th Workshop on Multilingual Representation Learning, MRL 2024
Country/Territory: United States
City: Miami
Period: 11/16/24 → …

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
