Studying Aging and Soft Error Mitigation Jointly under Constrained Scenarios in Multi-Cores

Florian Kriebel, Semeen Rehman, Muhammad Shafique

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Soft errors and aging are two substantial reliability problems in today's computing systems. Soft errors are transient in nature, yet they can lead to wrong application outputs and even application failures. Aging, in contrast, has permanent effects and results in slower performance and timing errors in processors. In this paper, we discuss different approaches to mitigating these reliability threats in multi-core systems under performance- and power-constrained scenarios. We describe research challenges and corresponding solutions that jointly target both reliability issues while also considering additional problems such as process variation and dark silicon. These solutions are implemented at different system layers and rely on analyzing, and adapting to, application properties. Moreover, they leverage the opportunities offered by multi-core systems, for instance, to efficiently enable redundant multithreading.

Original language: English (US)
Title of host publication: 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design, IOLTS 2019
Editors: Dimitris Gizopoulos, Dan Alexandrescu, Panagiota Papavramidou, Michail Maniatakos
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 139-142
Number of pages: 4
ISBN (Electronic): 9781728124902
State: Published - Jul 2019
Event: 25th IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2019 - Rhodes, Greece
Duration: Jul 1 2019 to Jul 3 2019

Publication series

Name: 2019 IEEE 25th International Symposium on On-Line Testing and Robust System Design, IOLTS 2019

Conference

Conference: 25th IEEE International Symposium on On-Line Testing and Robust System Design, IOLTS 2019
Country/Territory: Greece
City: Rhodes
Period: 7/1/19 to 7/3/19

Keywords

  • aging
  • dark silicon
  • fault-tolerance
  • heterogeneous
  • mapping
  • modeling
  • multi-core systems
  • optimization
  • performance
  • process variation
  • reliability
  • scheduling
  • soft errors
  • task management

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Safety, Risk, Reliability and Quality
