High-performance chip multiprocessors contain numerous parallel-processing cores where a fabric devised as a network-on-chip (NoC) efficiently handles their escalating intertile communication demands. Unfortunately, prolonged operational stresses cause accelerated physically induced wearout leading to permanent metal wire faults in links. Where only a subset of wires may malfunction, enduring healthy wires are leveraged to sustain connectivity when a partially faulty link recovery mechanism is utilized, where its data recovery latency overhead is proportional to the number of consecutive faulty wires. With NoC link failure models being ultimately important, albeit being absent from existing literature, the construction of a mathematical model towards the understanding of the distribution of wire faults in parallel on-chip links is very critical. This paper steps in such a direction, where the objective is to find the probability of having a "fault segment" consisting of a certain number of consecutive "faulty" wires in a parallel NoC link. First, it is shown how the given problem can be reduced to an equivalent combinatorial problem through partitions and necklaces. Then the proposed algorithm counts certain classes of necklaces by making a separation between periodic and aperiodic cases. Finally, the resulting analytical model is tested successfully against a far more costly brute-force algorithm.
ASJC Scopus subject areas