TY - GEN
T1 - Minerva: Enabling low-power, highly-accurate deep neural network accelerators
T2 - 43rd International Symposium on Computer Architecture, ISCA 2016
AU - Reagen, Brandon
AU - Whatmough, Paul
AU - Adolf, Robert
AU - Rama, Saketh
AU - Lee, Hyunkwang
AU - Lee, Sae Kyu
AU - Hernández-Lobato, José Miguel
AU - Wei, Gu-Yeon
AU - Brooks, David
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/8/24
Y1 - 2016/8/24
N2 - The continued success of Deep Neural Networks (DNNs) in classification tasks has sparked a trend of accelerating their execution with specialized hardware. While published designs easily give an order of magnitude improvement over general-purpose hardware, few look beyond an initial implementation. This paper presents Minerva, a highly automated co-design approach across the algorithm, architecture, and circuit levels to optimize DNN hardware accelerators. Compared to an established fixed-point accelerator baseline, we show that fine-grained, heterogeneous data type optimization reduces power by 1.5×; aggressive, in-line predication and pruning of small activity values further reduces power by 2.0×; and active hardware fault detection coupled with domain-aware error mitigation eliminates an additional 2.7× through lowering SRAM voltages. Across five datasets, these optimizations provide a collective average of 8.1× power reduction over an accelerator baseline without compromising DNN model accuracy. Minerva enables highly accurate, ultra-low power DNN accelerators (in the range of tens of milliwatts), making it feasible to deploy DNNs in power-constrained IoT and mobile devices.
AB - The continued success of Deep Neural Networks (DNNs) in classification tasks has sparked a trend of accelerating their execution with specialized hardware. While published designs easily give an order of magnitude improvement over general-purpose hardware, few look beyond an initial implementation. This paper presents Minerva, a highly automated co-design approach across the algorithm, architecture, and circuit levels to optimize DNN hardware accelerators. Compared to an established fixed-point accelerator baseline, we show that fine-grained, heterogeneous data type optimization reduces power by 1.5×; aggressive, in-line predication and pruning of small activity values further reduces power by 2.0×; and active hardware fault detection coupled with domain-aware error mitigation eliminates an additional 2.7× through lowering SRAM voltages. Across five datasets, these optimizations provide a collective average of 8.1× power reduction over an accelerator baseline without compromising DNN model accuracy. Minerva enables highly accurate, ultra-low power DNN accelerators (in the range of tens of milliwatts), making it feasible to deploy DNNs in power-constrained IoT and mobile devices.
UR - http://www.scopus.com/inward/record.url?scp=84988349874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84988349874&partnerID=8YFLogxK
U2 - 10.1109/ISCA.2016.32
DO - 10.1109/ISCA.2016.32
M3 - Conference contribution
AN - SCOPUS:84988349874
T3 - Proceedings - 2016 43rd International Symposium on Computer Architecture, ISCA 2016
SP - 267
EP - 278
BT - Proceedings - 2016 43rd International Symposium on Computer Architecture, ISCA 2016
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 June 2016 through 22 June 2016
ER -