TY - GEN
T1 - Global management of cache hierarchies
AU - Zahran, Mohamed
AU - McKee, Sally A.
N1 - Copyright:
Copyright 2010 Elsevier B.V., All rights reserved.
PY - 2010
Y1 - 2010
N2 - Cache memories currently treat all blocks as if they were equally important. This assumption of equally important blocks is not always valid. For instance, not all blocks deserve to be in L1 cache. We therefore propose globalized block placement. We present a global placement algorithm for managing blocks in a cache hierarchy by deciding where in the hierarchy an incoming block should be placed. Our technique makes decisions by adapting to the access patterns of different blocks. The contributions of this paper are fourfold. First, we motivate our solution by demonstrating the importance of a globalized placement scheme. Second, we present a method to categorize cache block behavior into one of four categories. Third, we present one potential design exploiting this categorization. Finally, we demonstrate the performance of our design. The proposed scheme enhances overall system performance (IPC) by an average of 12% over a traditional LRU scheme while reducing traffic between the L1 cache and L2 cache by an average of 20%, using the SPEC CPU benchmark suite. All of this is achieved with a table as small as 3 KB.
KW - cache memory
KW - memory hierarchy
UR - http://www.scopus.com/inward/record.url?scp=77954468410&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954468410&partnerID=8YFLogxK
U2 - 10.1145/1787275.1787315
DO - 10.1145/1787275.1787315
M3 - Conference contribution
AN - SCOPUS:77954468410
SN - 9781450300445
T3 - CF 2010 - Proceedings of the 2010 Computing Frontiers Conference
SP - 131
EP - 139
BT - CF 2010 - Proceedings of the 2010 Computing Frontiers Conference
T2 - 7th ACM International Conference on Computing Frontiers, CF'10
Y2 - 17 May 2010 through 19 May 2010
ER -