Adversarial multitask learning for joint multi-feature and multi-dialect morphological modeling

Nasser Zalmout, Nizar Habash

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Morphological tagging is challenging for morphologically rich languages due to the large target space and the need for more training data to minimize model sparsity. Dialectal variants of morphologically rich languages suffer more, as they tend to be noisier and have fewer resources. In this paper we explore the use of multitask learning and adversarial training to address morphological richness and dialectal variation in the context of full morphological tagging. We use multitask learning for joint morphological modeling of the features within two dialects, and as a knowledge-transfer scheme for cross-dialectal modeling. We use adversarial training to learn dialect-invariant features that can help the knowledge-transfer scheme from the high-resource to the low-resource variant. We work with two dialectal variants as a case study: Modern Standard Arabic (high-resource “dialect”) and Egyptian Arabic (low-resource dialect). Our models achieve state-of-the-art results for both. Furthermore, adversarial training provides more significant improvements, particularly when using smaller training datasets.

Original language: English (US)
Title of host publication: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 1775-1786
Number of pages: 12
ISBN (Electronic): 9781950737482
State: Published - 2019
Event: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019 - Florence, Italy
Duration: Jul 28 2019 to Aug 2 2019

Publication series

Name: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference

Conference

Conference: 57th Annual Meeting of the Association for Computational Linguistics, ACL 2019
Country/Territory: Italy
City: Florence
Period: 7/28/19 to 8/2/19

ASJC Scopus subject areas

  • Language and Linguistics
  • General Computer Science
  • Linguistics and Language
