Hermes: Architecting a top-performing fault-tolerant routing algorithm for Networks-on-Chips

Costas Iordanou, Vassos Soteriou, Konstantinos Aisopos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Networks-on-Chips (NoCs) are experiencing escalating susceptibility to wear-out and reduced reliability, with the risk of becoming the key point of failure in an entire multicore chip. Aiming towards seamless NoC operation in the presence of faulty communication links, in this paper we propose Hermes, a highly robust, distributed and lightweight fault-tolerant routing algorithm whose performance degrades gracefully with increasing faulty link counts. Hermes is a deadlock-free hybrid routing algorithm that uses load-balancing routing on fault-free paths to sustain high performance, while providing pre-reconfigured escape path selection in the vicinity of faults. Additionally, Hermes identifies non-communicating network partitions in scenarios where faulty links are topologically densely distributed. An extensive experimental evaluation, which includes traffic benchmarks gathered from full-system chip multi-processor simulations, shows that Hermes improves network throughput by up to 3× when compared against prior art.

Original language: English (US)
Title of host publication: 2014 32nd IEEE International Conference on Computer Design, ICCD 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 424-431
Number of pages: 8
ISBN (Electronic): 9781479964925
State: Published - Dec 3 2014
Event: 32nd IEEE International Conference on Computer Design, ICCD 2014 - Seoul, Korea, Republic of
Duration: Oct 19 2014 → Oct 22 2014

Publication series

Name: 2014 32nd IEEE International Conference on Computer Design, ICCD 2014

Other

Other: 32nd IEEE International Conference on Computer Design, ICCD 2014
Country/Territory: Korea, Republic of
City: Seoul
Period: 10/19/14 → 10/22/14

Keywords

  • Network-on-chip
  • chip multi-processor
  • fault-tolerance
  • reliability
  • routing algorithm

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Science Applications
