TY - JOUR
T1 - Mosaic Pages
T2 - Big TLB Reach with Small Pages
AU - Han, Jaehyun
AU - Gosakan, Krishnan
AU - Kuszmaul, William
AU - Mubarek, Ibrahim N.
AU - Mukherjee, Nirjhar
AU - Sriram, Karthik
AU - Tagliavini, Guido
AU - West, Evan
AU - Bender, Michael A.
AU - Bhattacharjee, Abhishek
AU - Conway, Alex
AU - Farach-Colton, Martin
AU - Gandhi, Jayneel
AU - Johnson, Rob
AU - Kannan, Sudarsun
AU - Porter, Donald E.
N1 - Publisher Copyright:
© 1981-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - This article introduces mosaic pages, which increase translation lookaside buffer (TLB) reach by compressing multiple, discrete translations into one TLB entry. Mosaic leverages virtual contiguity for locality, but does not use physical contiguity. Mosaic relies on recent advances in hashing theory to constrain memory mappings, in order to realize this physical address compression without reducing memory utilization or increasing swapping. Mosaic reduces TLB misses in several workloads by 6%-81%. Our results show that Mosaic's constraints on memory mappings do not harm performance: in our experiments, we never see conflicts before memory is 98% full, at which point a traditional design would also likely swap. Timing and area analyses on a commercial 28-nm CMOS process indicate that the hashing required on the critical path can run at a maximum frequency of 4 GHz, indicating that a Mosaic TLB is unlikely to affect clock frequency.
AB - This article introduces mosaic pages, which increase translation lookaside buffer (TLB) reach by compressing multiple, discrete translations into one TLB entry. Mosaic leverages virtual contiguity for locality, but does not use physical contiguity. Mosaic relies on recent advances in hashing theory to constrain memory mappings, in order to realize this physical address compression without reducing memory utilization or increasing swapping. Mosaic reduces TLB misses in several workloads by 6%-81%. Our results show that Mosaic's constraints on memory mappings do not harm performance: in our experiments, we never see conflicts before memory is 98% full, at which point a traditional design would also likely swap. Timing and area analyses on a commercial 28-nm CMOS process indicate that the hashing required on the critical path can run at a maximum frequency of 4 GHz, indicating that a Mosaic TLB is unlikely to affect clock frequency.
UR - http://www.scopus.com/inward/record.url?scp=85195417918&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85195417918&partnerID=8YFLogxK
U2 - 10.1109/MM.2024.3409181
DO - 10.1109/MM.2024.3409181
M3 - Article
AN - SCOPUS:85195417918
SN - 0272-1732
VL - 44
SP - 52
EP - 59
JO - IEEE Micro
JF - IEEE Micro
IS - 4
ER -