TY - GEN
T1 - Optimal self-recovering microarchitecture synthesis
AU - Karri, Ramesh
AU - Orailoglu, Alex
PY - 1993
Y1 - 1993
N2 - In this paper, we propose a novel ILP model for the scheduling problem in self-recovering microarchitecture synthesis. A self-recovering microarchitecture, on detecting a (transient) fault, rolls back to a previous known correct state (the checkpoint) and retries the computation. The maximum distance between adjacent checkpoints (the retry period) is determined by the transient fault rate as well as the average lifetime of a transient fault. At a checkpoint, the results of intermediate computations are compared (using voters) and, if correct, saved in registers. Consequently, associated with each checkpoint, there is a time overhead due to comparison and an area overhead due to the fault-tolerant nature of the voters. First, we formulate time-constrained scheduling as minimizing either the number of voters or the overall hardware, subject to constraints on the number of clock cycles, the retry period, and the number of checkpoints. Second, we develop a model for resource-constrained scheduling in which both the overall system performance and the recovery time overhead are optimized subject to hardware constraints.
AB - In this paper, we propose a novel ILP model for the scheduling problem in self-recovering microarchitecture synthesis. A self-recovering microarchitecture, on detecting a (transient) fault, rolls back to a previous known correct state (the checkpoint) and retries the computation. The maximum distance between adjacent checkpoints (the retry period) is determined by the transient fault rate as well as the average lifetime of a transient fault. At a checkpoint, the results of intermediate computations are compared (using voters) and, if correct, saved in registers. Consequently, associated with each checkpoint, there is a time overhead due to comparison and an area overhead due to the fault-tolerant nature of the voters. First, we formulate time-constrained scheduling as minimizing either the number of voters or the overall hardware, subject to constraints on the number of clock cycles, the retry period, and the number of checkpoints. Second, we develop a model for resource-constrained scheduling in which both the overall system performance and the recovery time overhead are optimized subject to hardware constraints.
UR - http://www.scopus.com/inward/record.url?scp=0027846231&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0027846231&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:0027846231
SN - 0818636823
T3 - Digest of Papers - International Symposium on Fault-Tolerant Computing
SP - 512
EP - 521
BT - Digest of Papers - International Symposium on Fault-Tolerant Computing
PB - IEEE
T2 - Proceedings of the 23rd International Symposium on Fault-Tolerant Computing
Y2 - 22 June 1993 through 24 June 1993
ER -