APNAS: Accuracy-and-performance-aware neural architecture search for neural hardware accelerators

Paniti Achararit, Muhammad Abdullah Hanif, Rachmad Vidya Wicaksana Putra, Muhammad Shafique, Yuko Hara-Azumi

Research output: Contribution to journal › Article › peer-review


Designing resource-efficient deep neural networks (DNNs) is a challenging task due to the enormous diversity of applications as well as their time-consuming design, training, optimization, and evaluation cycles, especially for resource-constrained embedded systems. To address these challenges, we propose a novel DNN design framework called accuracy-and-performance-aware neural architecture search (APNAS), which can generate DNNs efficiently, as it does not require hardware devices or simulators while searching for optimized DNN model configurations that offer both high inference accuracy and high execution performance. In addition, to accelerate the process of DNN generation, APNAS is built on a weight-sharing and reinforcement-learning-based exploration methodology, with a recurrent neural network controller at its core to generate sample DNN configurations. The reward in reinforcement learning is formulated as a configurable function that considers each sample DNN's accuracy and the cycle count required to run it on a target hardware architecture. To further expedite the DNN generation process, we devise analytical models for cycle count estimation instead of running millions of DNN configurations on real hardware. We demonstrate that these analytical models are highly accurate, providing cycle count estimates identical to those of a cycle-accurate hardware simulator. Experiments involving quantitatively varying hardware constraints demonstrate that APNAS requires only 0.55 graphics processing unit (GPU) days on a single Nvidia GTX 1080Ti GPU to generate DNNs that offer, on average, 53% fewer cycles with negligible accuracy degradation (3% on average) for image classification compared to state-of-the-art techniques.
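The abstract describes a reward that is a configurable function of a sampled DNN's accuracy and its estimated cycle count on the target accelerator. As an illustration only, here is a minimal sketch of one plausible such reward; the multiplicative form, the normalization against a baseline cycle count, and the trade-off weight `alpha` are assumptions for this sketch, not details taken from the paper.

```python
def accuracy_cycle_reward(accuracy, cycles, baseline_cycles, alpha=0.5):
    """One plausible accuracy/cycle-count reward for an RL-based NAS controller.

    accuracy        -- top-1 accuracy of the sampled DNN, in [0, 1]
    cycles          -- analytically estimated cycle count on the accelerator
    baseline_cycles -- cycle count of a reference DNN, used for normalization
    alpha           -- assumed trade-off weight; higher values favor speed

    The reward rises with accuracy and with speedup over the baseline, so the
    controller is steered toward configurations that are both accurate and fast.
    """
    speedup = baseline_cycles / cycles       # > 1 when faster than the baseline
    return accuracy * (speedup ** alpha)     # weighted multiplicative blend
```

With `alpha = 0`, the reward reduces to pure accuracy; raising `alpha` penalizes slow configurations more heavily, matching the "configurable" trade-off the abstract describes.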

Original language: English (US)
Pages (from-to): 165319-165334
Number of pages: 16
Journal: IEEE Access
State: Published - 2020


Keywords

  • Accelerator
  • Accuracy
  • CNN
  • Convolutional neural network
  • DNN
  • Deep learning
  • Deep neural networks
  • Efficiency
  • Embedded systems
  • Machine learning
  • Neural architecture search
  • Neural processing arrays
  • Performance

ASJC Scopus subject areas

  • General Computer Science
  • General Materials Science
  • General Engineering

