Recurrent orthogonal networks and long-memory tasks

Mikael Henaff, Arthur Szlam, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long-term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
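To make the orthogonality idea from the abstract concrete, below is a minimal sketch (not the authors' code; the helper names orthogonal_init and rnn_step, and all parameter values, are illustrative assumptions) of a vanilla RNN whose transition matrix is initialized to a random orthogonal matrix via QR decomposition. Because an orthogonal W preserves the norm of the hidden state, the linear part of the recurrence neither attenuates nor amplifies stored information, which is the intuition behind the unitary/orthogonal initializations and constraints the abstract refers to.

import numpy as np

def orthogonal_init(n, rng=None):
    # Draw a random n x n orthogonal matrix via QR decomposition
    # of a Gaussian matrix; the sign fix below makes the result
    # uniformly (Haar) distributed over orthogonal matrices.
    rng = np.random.default_rng() if rng is None else rng
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    q *= np.sign(np.diag(r))
    return q

# Hypothetical sizes, chosen only for illustration.
n_hidden, n_input = 128, 32
rng = np.random.default_rng(0)
W = orthogonal_init(n_hidden, rng)                    # recurrent (transition) matrix
U = 0.01 * rng.standard_normal((n_hidden, n_input))   # input-to-hidden matrix
b = np.zeros(n_hidden)

def rnn_step(h, x):
    # One vanilla RNN step: h_t = tanh(W h_{t-1} + U x_t + b).
    # With orthogonal W, ||W h|| == ||h||, so the linear part of
    # the recurrence carries information forward without decay.
    return np.tanh(W @ h + U @ x + b)

# Sanity check: the linear recurrence preserves the hidden-state norm.
h = rng.standard_normal(n_hidden)
assert np.isclose(np.linalg.norm(W @ h), np.linalg.norm(h))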

Original language: English (US)
Title of host publication: 33rd International Conference on Machine Learning, ICML 2016
Editors: Kilian Q. Weinberger, Maria Florina Balcan
Publisher: International Machine Learning Society (IMLS)
Pages: 2978-2986
Number of pages: 9
ISBN (Electronic): 9781510829008
State: Published - 2016
Event: 33rd International Conference on Machine Learning, ICML 2016 - New York City, United States
Duration: Jun 19, 2016 - Jun 24, 2016

Publication series

Name: 33rd International Conference on Machine Learning, ICML 2016
Volume: 5

Other

Other: 33rd International Conference on Machine Learning, ICML 2016
Country/Territory: United States
City: New York City
Period: 6/19/16 - 6/24/16

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Computer Networks and Communications
