Analyzing and mitigating the impact of permanent faults on a systolic array based neural network accelerator

Jeff Zhang, Tianyu Gu, Kanad Basu, Siddharth Garg

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Due to their growing popularity and computational cost, deep neural networks (DNNs) are being targeted for hardware acceleration. A popular architecture for DNN acceleration, adopted by the Google Tensor Processing Unit (TPU), utilizes a systolic array based matrix multiplication unit at its core. This paper deals with the design of fault-tolerant, systolic array based DNN accelerators for high defect rate technologies. To this end, we empirically show that the classification accuracy of a baseline TPU drops significantly even at extremely low fault rates (as low as 0.006%). We then propose two novel strategies, fault-aware pruning (FAP) and fault-aware pruning+retraining (FAP+T), that enable the TPU to operate at fault rates of up to 50%, with negligible drop in classification accuracy (as low as 0.1%) and no run-time performance overhead. The FAP+T does introduce a one-time retraining penalty per TPU chip before it is deployed, but we propose optimizations that reduce this one-time penalty to under 12 minutes. The penalty is then amortized over the entire lifetime of the TPU's operation.
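The abstract names fault-aware pruning but does not spell out its mechanics. As a minimal sketch of one plausible reading, the snippet below zeroes every weight that would be held by a faulty multiply-accumulate (MAC) unit, so a defective unit never contributes to a result. The weight-to-MAC mapping (`r % n`, `c % n`), the function names, and the random fault model are all assumptions made for illustration and are not taken from the paper.

```python
import random


def make_fault_map(n, fault_rate, seed=0):
    """Return an n x n grid of booleans; True marks a faulty MAC.

    Hypothetical fault model: each MAC fails independently with
    probability fault_rate.
    """
    rng = random.Random(seed)
    return [[rng.random() < fault_rate for _ in range(n)] for _ in range(n)]


def fault_aware_prune(weights, fault_map):
    """Zero every weight that would land on a faulty MAC.

    Assumed mapping (not from the abstract): weights[r][c] is held by
    MAC (r % n, c % n) of an n x n array, so a larger weight matrix is
    tiled over the array.
    """
    n = len(fault_map)
    return [
        [0.0 if fault_map[r % n][c % n] else w for c, w in enumerate(row)]
        for r, row in enumerate(weights)
    ]


def pruned_fraction(weights, pruned):
    """Fraction of originally nonzero weights that pruning set to zero."""
    total = sum(1 for row in weights for w in row if w != 0.0)
    removed = sum(
        1
        for row_w, row_p in zip(weights, pruned)
        for w, p in zip(row_w, row_p)
        if w != 0.0 and p == 0.0
    )
    return removed / total if total else 0.0


# A 2 x 2 array whose MAC at row 0, column 1 is faulty.
fault_map = [[False, True], [False, False]]
weights = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
]
pruned = fault_aware_prune(weights, fault_map)
```

Under the same assumptions, a retraining variant would keep the pruned positions fixed at zero while updating the remaining weights, which is consistent with the one-time, per-chip retraining cost the abstract mentions.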

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE 36th VLSI Test Symposium, VTS 2018
Publisher: IEEE Computer Society
Number of pages: 6
ISBN (Electronic): 9781538637746
State: Published - May 29 2018
Event: 36th IEEE VLSI Test Symposium, VTS 2018 - San Francisco, United States
Duration: Apr 22 2018 to Apr 25 2018

Publication series

Name: Proceedings of the IEEE VLSI Test Symposium


Other: 36th IEEE VLSI Test Symposium, VTS 2018
Country/Territory: United States
City: San Francisco

ASJC Scopus subject areas

  • Computer Science Applications
  • Electrical and Electronic Engineering

