ALWANN: Automatic layer-wise approximation of deep neural network accelerators without retraining

Vojtech Mrazek, Zdenek Vasicek, Lukas Sekanina, Muhammad Abdullah Hanif, Muhammad Shafique

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

The state-of-the-art approaches employ approximate computing to reduce the energy consumption of DNN hardware. Approximate DNNs then require extensive retraining afterwards to recover from the accuracy loss caused by the use of approximate operations. However, retraining of complex DNNs does not scale well. In this paper, we demonstrate that efficient approximations can be introduced into the computational path of DNN accelerators while retraining can completely be avoided. ALWANN provides highly optimized implementations of DNNs for custom low-power accelerators in which the number of computing units is lower than the number of DNN layers. First, a fully trained DNN (e.g., in TensorFlow) is converted to operate with 8-bit weights and 8-bit multipliers in convolutional layers. A suitable approximate multiplier is then selected for each computing element from a library of approximate multipliers in such a way that (i) one approximate multiplier serves several layers, and (ii) the overall classification error and energy consumption are minimized. The optimizations including the multiplier selection problem are solved by means of a multiobjective optimization NSGA-II algorithm. In order to completely avoid the computationally expensive retraining of DNN, which is usually employed to improve the classification accuracy, we propose a simple weight updating scheme that compensates the inaccuracy introduced by employing approximate multipliers. The proposed approach is evaluated for two architectures of DNN accelerators with approximate multipliers from the open-source 'EvoApprox' library, while executing three versions of ResNet on CIFAR-10. We report that the proposed approach saves 30% of energy needed for multiplication in convolutional layers of ResNet-50 while the accuracy is degraded by only 0.6% (0.9% for the ResNet-14). The proposed technique and approximate layers are available as an open-source extension of TensorFlow at https://github.com/ehw-fit/tf-approximate.

Original languageEnglish (US)
Title of host publication2019 IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2019 - Digest of Technical Papers
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781728123509
DOIs
StatePublished - Nov 2019
Event38th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2019 - Westin Westminster, United States
Duration: Nov 4 2019Nov 7 2019

Publication series

NameIEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
Volume2019-November
ISSN (Print)1092-3152

Conference

Conference38th IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2019
CountryUnited States
CityWestin Westminster
Period11/4/1911/7/19

Keywords

  • Approximate computing
  • CIFAR-10
  • Computational path
  • Deep neural networks
  • ResNet

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Graphics and Computer-Aided Design

Fingerprint Dive into the research topics of 'ALWANN: Automatic layer-wise approximation of deep neural network accelerators without retraining'. Together they form a unique fingerprint.

Cite this