Using curvature information to improve back-propagation

Research output: Contribution to journal › Conference article › peer-review

Abstract

Among all the supervised learning algorithms, back-propagation (BP) is probably the most widely used. Classical non-linear programming methods generally use an estimate of the Hessian matrix (matrix of second derivatives) to compute the weight modification at each iteration. They are derived from the well-known Newton-Raphson algorithm. We propose a very rough approximation to the Newton method which uses only the diagonal terms of the Hessian matrix. These terms give information about the curvature of the error surface in directions parallel to the weight space axes. This information can be used to scale the learning rates for each weight independently. We show that it is possible to approximate the diagonal terms of the Hessian matrix using a back-propagation procedure very similar to the one used for the first derivatives.
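The sketch below is a minimal NumPy illustration of the idea described in the abstract, not the paper's original code: backpropagate approximate diagonal second derivatives alongside the ordinary gradients, then divide each weight's learning rate by its curvature estimate. The two-layer network, the squared-error loss, the tanh activations, and the damping constant `mu` are illustrative assumptions; the approximation drops cross terms and the activation's second-derivative contribution so the backpropagated curvatures stay non-negative.

```python
import numpy as np

def tanh_prime(a):
    return 1.0 - np.tanh(a) ** 2

def forward(x, W1, W2):
    a1 = W1 @ x            # pre-activations, hidden layer
    h = np.tanh(a1)        # hidden activations
    a2 = W2 @ h            # pre-activations, output layer
    y = np.tanh(a2)        # network output
    return a1, h, a2, y

def backprop_with_diag_hessian(x, t, W1, W2):
    """Return gradients and approximate diagonal Hessian terms (a sketch)."""
    a1, h, a2, y = forward(x, W1, W2)
    e = y - t                                   # error for E = 0.5 * ||y - t||^2

    # --- first derivatives (standard back-propagation) ---
    delta2 = e * tanh_prime(a2)                 # dE/da2
    gW2 = np.outer(delta2, h)
    delta1 = (W2.T @ delta2) * tanh_prime(a1)   # dE/da1
    gW1 = np.outer(delta1, x)

    # --- approximate diagonal second derivatives ---
    # At the output: d2E/da2^2 ~ f'(a2)^2   (the f'' term is dropped).
    s2 = tanh_prime(a2) ** 2
    hW2 = np.outer(s2, h ** 2)                  # d2E/dW2_ij^2 ~ s2_i * h_j^2
    # Backpropagate curvature, ignoring off-diagonal cross terms:
    # d2E/dh_j^2 ~ sum_i W2_ij^2 * d2E/da2_i^2
    s_h = (W2 ** 2).T @ s2
    s1 = s_h * tanh_prime(a1) ** 2              # d2E/da1^2
    hW1 = np.outer(s1, x ** 2)                  # d2E/dW1_ij^2 ~ s1_i * x_j^2

    return gW1, gW2, hW1, hW2

# Per-weight learning rates scaled by curvature: step = eps / (|d2E/dw^2| + mu),
# where mu keeps the step bounded in nearly flat directions (an assumed value).
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(4, 3))
W2 = rng.normal(scale=0.5, size=(2, 4))
x, t = rng.normal(size=3), np.array([0.5, -0.5])
eps, mu = 0.1, 0.01
gW1, gW2, hW1, hW2 = backprop_with_diag_hessian(x, t, W1, W2)
W1 -= eps / (np.abs(hW1) + mu) * gW1
W2 -= eps / (np.abs(hW2) + mu) * gW2
```

Note that the curvature backward pass reuses the same weight matrices and layer structure as the ordinary gradient pass, which is the point made in the abstract: the diagonal Hessian terms come almost for free from a second, backprop-like sweep.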

Original language: English (US)
Pages (from-to): 168
Number of pages: 1
Journal: Neural Networks
Volume: 1
Issue number: 1 SUPPL
DOIs
State: Published - 1988
Event: International Neural Network Society 1988 First Annual Meeting - Boston, MA, USA
Duration: Sep 6 1988 – Sep 10 1988

ASJC Scopus subject areas

  • Cognitive Neuroscience
  • Artificial Intelligence

