TY - JOUR
T1 - Combinatorial auctions for temperature-constrained resource management in manycores
AU - Khdr, Heba
AU - Shafique, Muhammad
AU - Pagani, Santiago
AU - Herkersdorf, Andreas
AU - Henkel, Jorg
N1 - Funding Information:
This work was supported by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Projektnummer 146371743 – TRR 89 “Invasive Computing”.
Publisher Copyright:
© 1990-2012 IEEE.
Copyright:
Copyright 2020 Elsevier B.V., All rights reserved.
PY - 2020/7/1
Y1 - 2020/7/1
AB - Although manycore processors have plenty of cores, not all of them can run simultaneously at full speed, and some may even need to be power-gated to keep the chip within safe temperature limits. Hence, a resource management technique that allocates cores to applications with the aim of maximizing system performance cannot achieve its goal without taking into account the on-chip temperature and its impact on the availability of the chip's resources. However, accounting for a temperature constraint further increases the complexity of resource management, especially in manycores, and implementing it in a centralized scheme might therefore lead to a computation bottleneck and a single point of failure. To avoid such scenarios, it is necessary to distribute the computation required by the resource management technique throughout the chip. In this article, we propose a distributed resource management technique that considers temperature as an essential factor in allocating cores to applications and in determining the power states of these cores and their voltage/frequency levels, while taking into account the performance models of the applications in order to maximize the overall system performance under a temperature constraint. Our proposed technique employs, for the first time, combinatorial auctions within an agent system to achieve the targeted goal in a distributed manner. The experimental evaluations show that our proposed technique achieves significant performance improvements, 41% on average, compared to several distributed resource management techniques.
KW - DVFS
KW - Performance optimization
KW - Runtime resource management
KW - System-level optimization
KW - Temperature-aware design
UR - http://www.scopus.com/inward/record.url?scp=85081173406&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081173406&partnerID=8YFLogxK
U2 - 10.1109/TPDS.2020.2965523
DO - 10.1109/TPDS.2020.2965523
M3 - Article
AN - SCOPUS:85081173406
SN - 1045-9219
VL - 31
SP - 1605
EP - 1620
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
IS - 7
M1 - 8955960
ER -