TY - GEN
T1 - Cherry-picking
T2 - 16th Design, Automation and Test in Europe Conference and Exhibition, DATE 2013
AU - Raghunathan, Bharathwaj
AU - Turakhia, Yatish
AU - Garg, Siddharth
AU - Marculescu, Diana
PY - 2013
Y1 - 2013
AB - It is projected that increasing on-chip integration with technology scaling will lead to the so-called dark silicon era, in which more transistors are available on a chip than can be simultaneously powered on. It is conventionally assumed that the dark silicon will be provisioned with heterogeneous resources, for example dedicated hardware accelerators. In this paper we challenge the conventional assumption and build a case for homogeneous dark silicon CMPs that exploit the inherent variations in process parameters that exist in scaled technologies to offer increased performance. Since process variations result in core-to-core variations in power and frequency, the idea is to cherry-pick the best subset of cores for an application so as to maximize performance within the power budget. To this end, we propose a polynomial-time algorithm for optimal core selection, thread mapping and frequency assignment for a large class of multi-threaded applications. Our experimental results based on the Sniper multi-core simulator show performance improvements of up to 22% and 30% for homogeneous CMPs with 33% and 50% dark silicon, respectively.
UR - http://www.scopus.com/inward/record.url?scp=84885641092&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84885641092&partnerID=8YFLogxK
U2 - 10.7873/date.2013.023
DO - 10.7873/date.2013.023
M3 - Conference contribution
AN - SCOPUS:84885641092
SN - 9783981537000
T3 - Proceedings - Design, Automation and Test in Europe, DATE
SP - 39
EP - 44
BT - Proceedings - Design, Automation and Test in Europe, DATE 2013
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 18 March 2013 through 22 March 2013
ER -