Deep neural networks (DNNs) are now used in most application domains that involve data processing, predictive analysis and knowledge inference. Alongside the need for high-performance, efficient DNN accelerators, there is a pressing need to improve manufacturing yield in order to reduce the per-unit cost of these accelerators. To this end, we present 'SalvageDNN', a methodology that enables reliable execution of DNNs on hardware accelerators with permanent faults (typically caused by imperfect manufacturing processes). It employs a fault-aware mapping of the different parts of a given DNN onto the faulty hardware accelerator, leveraging the saliency of the DNN parameters and the fault map of the underlying processing hardware. We also present novel modifications to a systolic array design that further improve accelerator yield while ensuring reliable DNN execution using 'SalvageDNN', with negligible overheads in area, power/energy and performance. This article is part of the theme issue 'Harmonizing energy-autonomous computing and intelligence'.
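The fault-aware mapping idea in the abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' algorithm. It assumes that each logical weight column (for example, one output channel) occupies one physical column of the array, that a weight placed on a faulty processing element is effectively lost, and that weight magnitude serves as a stand-in for saliency. The names `fault_aware_mapping` and `lost_saliency` are hypothetical.

```python
from typing import List, Set, Tuple

Fault = Tuple[int, int]  # (row, physical column) of a faulty processing element


def lost_saliency(weights: List[List[float]], fault_map: Set[Fault],
                  mapping: List[int]) -> float:
    """Total |weight| that lands on faulty PEs; mapping[p] is the logical
    column placed on physical column p. Magnitude as saliency is an assumption."""
    return sum(abs(weights[r][mapping[p]]) for (r, p) in fault_map)


def fault_aware_mapping(weights: List[List[float]],
                        fault_map: Set[Fault]) -> List[int]:
    """Greedy sketch: give the most damaged physical columns the logical
    columns that lose the least saliency on the faulty rows."""
    n_rows, n_cols = len(weights), len(weights[0])
    # Faulty rows of each physical column, read from the fault map.
    faulty_rows = [[r for r in range(n_rows) if (r, p) in fault_map]
                   for p in range(n_cols)]
    # Visit physical columns from most to fewest faults.
    order = sorted(range(n_cols), key=lambda p: len(faulty_rows[p]), reverse=True)
    unassigned = set(range(n_cols))
    mapping = [-1] * n_cols
    for p in order:
        # Pick the remaining logical column with the smallest loss here.
        best = min(sorted(unassigned),
                   key=lambda c: sum(abs(weights[r][c]) for r in faulty_rows[p]))
        mapping[p] = best
        unassigned.remove(best)
    return mapping


# Toy example: a 2x2 array with one faulty PE at row 0, physical column 0.
weights = [[0.9, 0.1],
           [0.2, 0.8]]
fault_map = {(0, 0)}
mapping = fault_aware_mapping(weights, fault_map)
print(mapping, lost_saliency(weights, fault_map, mapping))
```

In this toy case the greedy pass places logical column 1 on the damaged physical column, so the magnitude lost to the fault drops from 0.9 under the identity mapping to 0.1. A greedy assignment is not guaranteed to be optimal in general; it is used here only to keep the sketch short.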
|Original language|English (US)|
|Journal|Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences|
|State|Published - Feb 7 2020|
- Neural networks
- Reliable computing
ASJC Scopus subject areas
- Physics and Astronomy (all)