TY - GEN
T1 - Exponent monitoring for low-cost concurrent error detection in FPU control logic
AU - Maniatakos, Michail
AU - Makris, Yiorgos
AU - Kudva, Prabhakar
AU - Fleischer, Bruce
PY - 2011
Y1 - 2011
N2 - We present a non-intrusive concurrent error detection (CED) method for protecting the control logic of a contemporary floating-point unit (FPU). The proposed method is based on the observation that control logic errors lead to extensive datapath corruption and affect, with high probability, the exponent part of the IEEE 754 floating-point representation. Thus, exponent monitoring can be utilized to detect errors in the control logic of the FPU. Predicting the exponent involves relatively simple operations; therefore, our method incurs significantly lower overhead than the classical approach of duplicating the control logic of the FPU. Indeed, experimental results on the OpenSPARC T1 processor show that, compared to control logic duplication, which incurs an area overhead of 17.9% of the FPU size, our method incurs an area overhead of only 5.8% yet still achieves detection of over 95% of transient errors in the FPU control logic. Moreover, the proposed method offers the ancillary benefit of also detecting 98.1% of datapath errors that affect the exponent, which cannot be detected via duplication of control logic. Finally, when combined with a classical residue code-based method for the fraction, our method leads to a complete CED solution for the entire FPU, which provides a coverage of 94.4% of all errors at an area cost of 16.32% of the FPU size.
AB - We present a non-intrusive concurrent error detection (CED) method for protecting the control logic of a contemporary floating-point unit (FPU). The proposed method is based on the observation that control logic errors lead to extensive datapath corruption and affect, with high probability, the exponent part of the IEEE 754 floating-point representation. Thus, exponent monitoring can be utilized to detect errors in the control logic of the FPU. Predicting the exponent involves relatively simple operations; therefore, our method incurs significantly lower overhead than the classical approach of duplicating the control logic of the FPU. Indeed, experimental results on the OpenSPARC T1 processor show that, compared to control logic duplication, which incurs an area overhead of 17.9% of the FPU size, our method incurs an area overhead of only 5.8% yet still achieves detection of over 95% of transient errors in the FPU control logic. Moreover, the proposed method offers the ancillary benefit of also detecting 98.1% of datapath errors that affect the exponent, which cannot be detected via duplication of control logic. Finally, when combined with a classical residue code-based method for the fraction, our method leads to a complete CED solution for the entire FPU, which provides a coverage of 94.4% of all errors at an area cost of 16.32% of the FPU size.
UR - http://www.scopus.com/inward/record.url?scp=79959652695&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79959652695&partnerID=8YFLogxK
U2 - 10.1109/VTS.2011.5783727
DO - 10.1109/VTS.2011.5783727
M3 - Conference contribution
AN - SCOPUS:79959652695
SN - 9781612846552
T3 - Proceedings of the IEEE VLSI Test Symposium
SP - 235
EP - 240
BT - Proceedings - 2011 29th IEEE VLSI Test Symposium, VTS 2011
T2 - 2011 29th IEEE VLSI Test Symposium, VTS 2011
Y2 - 1 May 2011 through 5 May 2011
ER -