TY - GEN
T1 - SoftSNN: Low-Cost Fault Tolerance for Spiking Neural Network Accelerators under Soft Errors
T2 - 59th ACM/IEEE Design Automation Conference, DAC 2022
AU - Putra, Rachmad Vidya Wicaksana
AU - Hanif, Muhammad Abdullah
AU - Shafique, Muhammad
N1 - Funding Information:
This work was partly supported by Intel Corporation through Gift funding for the project “Cost-Effective Dependability for Deep Neural Networks and Spiking Neural Networks”, and by Indonesia Endowment Fund for Education (LPDP). This work was also jointly supported by the NYUAD Center for Interacting Urban Networks (CITIES), funded by Tamkeen under the NYUAD Research Institute Award CG001 and Center for CyberSecurity (CCS), funded by Tamkeen under the NYUAD Research Institute Award G1104.
Publisher Copyright:
© 2022 ACM.
PY - 2022/7/10
Y1 - 2022/7/10
AB - Specialized hardware accelerators have been designed and employed to maximize the performance efficiency of Spiking Neural Networks (SNNs). However, such accelerators are vulnerable to transient faults (i.e., soft errors), which occur due to high-energy particle strikes and manifest as bit flips at the hardware layer. These errors can change the weight values and neuron operations in the compute engine of SNN accelerators, thereby leading to incorrect outputs and accuracy degradation. Yet the impact of soft errors on the compute engine, and the corresponding mitigation techniques, have not been thoroughly studied for SNNs. A potential solution is redundant execution (re-execution), which ensures correct outputs but incurs large latency and energy overheads. To address this, we propose SoftSNN, a novel methodology to mitigate soft errors in the weight registers (synapses) and neurons of SNN accelerators without re-execution, thereby maintaining accuracy with low latency and energy overheads. Our SoftSNN methodology comprises the following key steps: (1) analyzing SNN characteristics under soft errors to identify faulty weights and neuron operations, which are required for recognizing faulty SNN behavior; (2) a Bound-and-Protect technique that leverages this analysis to improve SNN fault tolerance by bounding the weight values and protecting the neurons from faulty operations; and (3) devising lightweight hardware enhancements for the neural hardware accelerator to efficiently support the proposed technique. The experimental results show that, for a 900-neuron network even under a high fault rate, our SoftSNN keeps the accuracy degradation below 3% while reducing latency and energy by up to 3x and 2.3x, respectively, compared to the re-execution technique.
UR - http://www.scopus.com/inward/record.url?scp=85137420641&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85137420641&partnerID=8YFLogxK
U2 - 10.1145/3489517.3530657
DO - 10.1145/3489517.3530657
M3 - Conference contribution
AN - SCOPUS:85137420641
T3 - Proceedings - Design Automation Conference
SP - 151
EP - 156
BT - Proceedings of the 59th ACM/IEEE Design Automation Conference, DAC 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 10 July 2022 through 14 July 2022
ER -