Efficient gradient computation for structured output learning with rational and tropical losses

Corinna Cortes, Vitaly Kuznetsov, Mehryar Mohri, Dmitry Storcheus, Scott Yang

Research output: Contribution to journal › Conference article › peer-review

Abstract

Many structured prediction problems admit a natural loss function for evaluation such as the edit-distance or n-gram loss. However, existing learning algorithms are typically designed to optimize alternative objectives such as the cross-entropy. This is because a naïve implementation of the natural loss functions often results in intractable gradient computations. In this paper, we design efficient gradient computation algorithms for two broad families of structured prediction loss functions: rational and tropical losses. These families include as special cases the n-gram loss, the edit-distance loss, and many other loss functions commonly used in natural language processing and computational biology tasks that are based on sequence similarity measures. Our algorithms make use of weighted automata and graph operations over appropriate semirings to design efficient solutions. They facilitate efficient gradient computation and hence enable one to train learning models such as neural networks with complex structured losses.
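The tropical-semiring view mentioned in the abstract can be made concrete with a small example: the edit-distance between two sequences is a shortest-distance computation in the tropical (min, +) semiring, where semiring "addition" is min and semiring "multiplication" is +. The sketch below is a minimal illustration in plain Python of that correspondence for unit-cost edits; it is not the authors' implementation, which operates on weighted automata and transducers rather than this explicit dynamic program.

```python
# A minimal sketch of the tropical (min, +) semiring view of edit distance.
# Illustrative only; the paper's algorithms compute gradients via operations
# on weighted automata over this semiring, not via this explicit DP.

import math

# Tropical semiring: "addition" is min, "multiplication" is +.
def t_plus(a, b):   # semiring addition (min)
    return min(a, b)

def t_times(a, b):  # semiring multiplication (+)
    return a + b

T_ZERO = math.inf   # identity for t_plus (min)
T_ONE = 0.0         # identity for t_times (+)

def edit_distance(x, y):
    """Unit-cost edit distance as a tropical shortest-distance computation."""
    m, n = len(x), len(y)
    # d[i][j] = tropical shortest distance from state (0, 0) to state (i, j)
    d = [[T_ZERO] * (n + 1) for _ in range(m + 1)]
    d[0][0] = T_ONE
    for i in range(m + 1):
        for j in range(n + 1):
            if i > 0:                        # deletion, cost 1
                d[i][j] = t_plus(d[i][j], t_times(d[i - 1][j], 1))
            if j > 0:                        # insertion, cost 1
                d[i][j] = t_plus(d[i][j], t_times(d[i][j - 1], 1))
            if i > 0 and j > 0:              # substitution (cost 0 on match)
                sub = 0 if x[i - 1] == y[j - 1] else 1
                d[i][j] = t_plus(d[i][j], t_times(d[i - 1][j - 1], sub))
    return d[m][n]

print(edit_distance("kitten", "sitting"))  # 3
```

Swapping the semiring operations (e.g., to the probability semiring's (+, x)) recovers the rational-loss setting; this interchangeability is what lets a single shortest-distance scheme cover both loss families discussed in the paper.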

Original language: English (US)
Pages (from-to): 6810-6821
Number of pages: 12
Journal: Advances in Neural Information Processing Systems
Volume: 2018-December
State: Published - 2018
Event: 32nd Conference on Neural Information Processing Systems, NeurIPS 2018 - Montreal, Canada
Duration: Dec 2, 2018 - Dec 8, 2018

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing
