TY - GEN
T1 - Generating High-Performance Number Theoretic Transform Implementations for Vector Architectures
AU - Zhang, Naifeng
AU - Ebel, Austin
AU - Neda, Negar
AU - Brinich, Patrick
AU - Reynwar, Benedict
AU - Schmidt, Andrew G.
AU - Franusich, Mike
AU - Johnson, Jeremy
AU - Reagen, Brandon
AU - Franchetti, Franz
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Fully homomorphic encryption (FHE) offers the ability to perform computations directly on encrypted data by encoding numerical vectors onto mathematical structures. However, the adoption of FHE is hindered by substantial overheads that make it impractical for many applications. Number theoretic transforms (NTTs) are a key optimization technique for FHE by accelerating vector convolutions. Towards practical usage of FHE, we propose to use SPIRAL, a code generator renowned for generating efficient linear transform implementations, to generate high-performance NTT on vector architectures. We identify suitable NTT algorithms and translate the dataflow graphs of those algorithms into SPIRAL's internal mathematical representations. We then implement the entire workflow required for generating efficient vectorized NTT code. In this work, we target the Ring Processing Unit (RPU), a multitile long vector accelerator designed for FHE computations. On average, the SPIRAL-generated NTT kernel achieves a 1.7x speedup over naive implementations on RPU, showcasing the effectiveness of our approach towards maximizing performance for NTT computations on vector architectures.
AB - Fully homomorphic encryption (FHE) offers the ability to perform computations directly on encrypted data by encoding numerical vectors onto mathematical structures. However, the adoption of FHE is hindered by substantial overheads that make it impractical for many applications. Number theoretic transforms (NTTs) are a key optimization technique for FHE by accelerating vector convolutions. Towards practical usage of FHE, we propose to use SPIRAL, a code generator renowned for generating efficient linear transform implementations, to generate high-performance NTT on vector architectures. We identify suitable NTT algorithms and translate the dataflow graphs of those algorithms into SPIRAL's internal mathematical representations. We then implement the entire workflow required for generating efficient vectorized NTT code. In this work, we target the Ring Processing Unit (RPU), a multitile long vector accelerator designed for FHE computations. On average, the SPIRAL-generated NTT kernel achieves a 1.7x speedup over naive implementations on RPU, showcasing the effectiveness of our approach towards maximizing performance for NTT computations on vector architectures.
KW - Fully homomorphic encryption
KW - SPIRAL
KW - code generation
KW - number theoretic transform
KW - vectorization
UR - http://www.scopus.com/inward/record.url?scp=85182594953&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85182594953&partnerID=8YFLogxK
U2 - 10.1109/HPEC58863.2023.10363559
DO - 10.1109/HPEC58863.2023.10363559
M3 - Conference contribution
AN - SCOPUS:85182594953
T3 - 2023 IEEE High Performance Extreme Computing Conference, HPEC 2023
BT - 2023 IEEE High Performance Extreme Computing Conference, HPEC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE High Performance Extreme Computing Conference, HPEC 2023
Y2 - 25 September 2023 through 29 September 2023
ER -