The reversible residual network: Backpropagation without storing activations

Aidan N. Gomez, Mengye Ren, Raquel Urtasun, Roger B. Grosse

Research output: Contribution to journalConference articlepeer-review

Abstract

Deep residual networks (ResNets) have significantly pushed forward the state-of-the-art on image classification, increasing in performance as networks grow both deeper and wider. However, memory consumption becomes a bottleneck, as one needs to store the activations in order to calculate gradients using backpropagation. We present the Reversible Residual Network (RevNet), a variant of ResNets where each layer's activations can be reconstructed exactly from the next layer's. Therefore, the activations for most layers need not be stored in memory during backpropagation. We demonstrate the effectiveness of RevNets on CIFAR-10, CIFAR-100, and ImageNet, establishing nearly identical classification accuracy to equally-sized ResNets, even though the activation storage requirements are independent of depth.

Original languageEnglish (US)
Pages (from-to)2215-2225
Number of pages11
JournalAdvances in Neural Information Processing Systems
Volume2017-December
StatePublished - 2017
Event31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: Dec 4 2017Dec 9 2017

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint

Dive into the research topics of 'The reversible residual network: Backpropagation without storing activations'. Together they form a unique fingerprint.

Cite this