DsReliM: Power-constrained reliability management in Dark-Silicon many-core chips under process variations

Mohammad Salehi, Muhammad Shafique, Florian Kriebel, Semeen Rehman, Mohammad Khavari Tavana, Alireza Ejlali, Jorg Henkel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Due to tight power envelopes, it is envisaged that in future technology nodes not all cores of a many-core chip can be powered on simultaneously (at full performance level). The power-gated cores are referred to as Dark Silicon. At the same time, growing reliability issues due to process variations and soft errors challenge the cost-effective deployment of future technology nodes. This paper presents a reliability management system for Dark Silicon chips (dsReliM) that optimizes the reliability of on-chip systems while jointly accounting for soft errors, process variations, and the thermal design power (TDP) constraint. To perform this TDP-constrained reliability optimization, dsReliM leverages multiple reliable application versions, which can execute on different cores that exhibit frequency variations and support different voltage-frequency levels, thus enabling distinct power, reliability, and performance tradeoffs at run time. Experiments show that dsReliM provides up to 20% reliability improvement under different TDP constraints compared to a state-of-the-art technique. Compared to an ideal-case optimal solution, dsReliM deviates by up to 2.5% in reliability efficiency but speeds up the reliability management decision time by a factor of up to 3100.
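The kind of TDP-constrained selection the abstract describes can be illustrated with a small sketch. This is not the dsReliM algorithm, which the abstract does not specify. It is a generic greedy heuristic under stated assumptions: each application has a few candidate configurations (a hypothetical combination of application version, core, and voltage-frequency level), each with an assumed power draw and an assumed probability of correct execution, and the goal is to maximize the product of reliabilities while the total power stays within the TDP. The names `Option` and `select_options` and all numbers are made up for illustration.

```python
from dataclasses import dataclass
import math


@dataclass(frozen=True)
class Option:
    """One hypothetical configuration: version / core / V-f level."""
    label: str
    power_w: float       # assumed power draw in watts
    reliability: float   # assumed probability of correct execution, in (0, 1]


def select_options(tasks, tdp_w):
    """Greedy sketch: pick one Option per task with total power <= tdp_w.

    Starts from the lowest-power option of every task, then repeatedly
    applies the upgrade with the best log-reliability gain per extra watt
    that still fits in the budget. Not guaranteed to be optimal.
    """
    choice = {name: min(opts, key=lambda o: o.power_w)
              for name, opts in tasks.items()}
    used = sum(o.power_w for o in choice.values())
    if used > tdp_w:
        raise ValueError("even the lowest-power configuration exceeds the TDP")

    while True:
        best = None  # (score, task name, option)
        for name, opts in tasks.items():
            cur = choice[name]
            for o in opts:
                extra = o.power_w - cur.power_w
                gain = math.log(o.reliability) - math.log(cur.reliability)
                if gain <= 0 or used + extra > tdp_w:
                    continue  # no improvement, or it would break the budget
                score = gain / extra if extra > 0 else math.inf
                if best is None or score > best[0]:
                    best = (score, name, o)
        if best is None:
            return choice, used
        _, name, o = best
        used += o.power_w - choice[name].power_w
        choice[name] = o


# Illustrative data only; the values are invented for this sketch.
example_tasks = {
    "A": [Option("A-v1@low", 1.0, 0.90), Option("A-v2@high", 3.0, 0.99)],
    "B": [Option("B-v1@low", 1.0, 0.95), Option("B-v2@high", 2.0, 0.96)],
}
example_choice, example_power = select_options(example_tasks, tdp_w=4.0)
```

With a 4 W budget, the sketch upgrades application A (the larger reliability gain per watt) and leaves B at its low-power option, because upgrading both would exceed the budget. An exhaustive search over all combinations would give the optimal answer at a much higher decision cost, which is the tradeoff the abstract's comparison against an ideal-case optimal solution refers to.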

Original language: English (US)
Title of host publication: 2015 International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 75-82
Number of pages: 8
ISBN (Electronic): 9781467383219
State: Published - Nov 17 2015
Event: International Conference on Hardware/Software Codesign and System Synthesis, CODES+ISSS 2015 - Amsterdam, Netherlands
Duration: Oct 4 2015 to Oct 9 2015

Keywords

  • constrained-optimization
  • Dark silicon
  • many-core
  • power-efficiency
  • process variation
  • reliability
  • soft errors

ASJC Scopus subject areas

  • Hardware and Architecture
  • Software
  • Electrical and Electronic Engineering
