Deeply-supervised nets

Chen-Yu Lee, Saining Xie, Patrick W. Gallagher, Zhengyou Zhang, Zhuowen Tu

Research output: Contribution to journal › Conference article › peer-review

Abstract

We propose deeply-supervised nets (DSN), a method that simultaneously minimizes classification error and improves the directness and transparency of the hidden layer learning process. We focus our attention on three aspects of traditional convolutional-neural-network-type (CNN-type) architectures: (1) transparency in the effect intermediate layers have on overall classification; (2) discriminativeness and robustness of learned features, especially in early layers; (3) training effectiveness in the face of "vanishing" gradients. To combat these issues, we introduce "companion" objective functions at each hidden layer, in addition to the overall objective function at the output layer (an integrated strategy distinct from layer-wise pre-training). We also analyze our algorithm using techniques extended from stochastic gradient methods. The advantages provided by our method are evident in our experimental results, showing state-of-the-art performance on MNIST, CIFAR-10, CIFAR-100, and SVHN.
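
A minimal sketch of the companion-objective idea, written in PyTorch, is given below. The two-block architecture, the fixed companion weight alpha, and the use of cross-entropy losses are illustrative assumptions chosen for readability; the paper formulates its objectives with SVM-style hinge losses and per-layer balance weights, so this is a sketch of the idea, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DeeplySupervisedCNN(nn.Module):
    # Toy two-block CNN with a companion classifier head on each hidden block.
    # Layer sizes are illustrative, not taken from the paper.
    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.block2 = nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2))
        self.head1 = nn.Linear(32 * 14 * 14, num_classes)  # companion head on hidden layer 1
        self.head2 = nn.Linear(64 * 7 * 7, num_classes)    # companion head on hidden layer 2
        self.out = nn.Linear(64 * 7 * 7, num_classes)      # final output classifier

    def forward(self, x):
        h1 = self.block1(x)
        h2 = self.block2(h1)
        companions = (self.head1(h1.flatten(1)), self.head2(h2.flatten(1)))
        return self.out(h2.flatten(1)), companions

def dsn_loss(logits, companions, target, alpha=0.3):
    # Overall objective at the output layer plus weighted companion objectives;
    # alpha is a hypothetical fixed weight standing in for the paper's balance terms.
    loss = F.cross_entropy(logits, target)
    for c in companions:
        loss = loss + alpha * F.cross_entropy(c, target)
    return loss

# Usage on a dummy MNIST-shaped batch: gradients from every companion head
# flow directly into the early layers, which is the supervision DSN adds.
model = DeeplySupervisedCNN()
x, t = torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,))
logits, companions = model(x)
dsn_loss(logits, companions, t).backward()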

Original language: English (US)
Pages (from-to): 562-570
Number of pages: 9
Journal: Journal of Machine Learning Research
Volume: 38
State: Published - 2015
Event: 18th International Conference on Artificial Intelligence and Statistics, AISTATS 2015 - San Diego, United States
Duration: May 9, 2015 to May 12, 2015

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence
  • Control and Systems Engineering
  • Statistics and Probability
