TY - GEN
T1 - Orion
T2 - 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2025
AU - Ebel, Austin
AU - Garimella, Karthik
AU - Reagen, Brandon
N1 - Publisher Copyright:
© 2025 ACM.
PY - 2025/3/30
Y1 - 2025/3/30
N2 - Fully Homomorphic Encryption (FHE) has the potential to substantially improve privacy and security by enabling computation directly on encrypted data. This is especially true with deep learning, as today, many popular user services are powered by neural networks in the cloud. Beyond its well-known high computational costs, one of the major challenges facing wide-scale deployment of FHE-secured neural inference is effectively mapping these networks to FHE primitives. FHE poses many programming challenges including packing large vectors, managing accumulated noise, and translating arbitrary and general-purpose programs to the limited instruction set provided by FHE. These challenges make building large FHE neural networks intractable using the tools available today. In this paper we address these challenges with Orion, a fully-automated framework for private neural inference using FHE. Orion accepts deep neural networks written in PyTorch and translates them into efficient FHE programs. We achieve this by proposing a novel single-shot multiplexed packing strategy for arbitrary convolutions and through a new, efficient technique to automate bootstrap placement and scale management. We evaluate Orion on common benchmarks used by the FHE deep learning community and outperform state-of-the-art by 2.38 × on ResNet-20, the largest network they report. Orion's techniques enable processing much deeper and larger networks. We demonstrate this by evaluating ResNet-50 on ImageNet and present the first high-resolution FHE object detection experiments using a YOLO-v1 model with 139 million parameters. Orion is open-source for all to use at: \hrefhttps://github.com/baahl-nyu/orion https://github.com/baahl-nyu/orion.
AB - Fully Homomorphic Encryption (FHE) has the potential to substantially improve privacy and security by enabling computation directly on encrypted data. This is especially true with deep learning, as today, many popular user services are powered by neural networks in the cloud. Beyond its well-known high computational costs, one of the major challenges facing wide-scale deployment of FHE-secured neural inference is effectively mapping these networks to FHE primitives. FHE poses many programming challenges including packing large vectors, managing accumulated noise, and translating arbitrary and general-purpose programs to the limited instruction set provided by FHE. These challenges make building large FHE neural networks intractable using the tools available today. In this paper we address these challenges with Orion, a fully-automated framework for private neural inference using FHE. Orion accepts deep neural networks written in PyTorch and translates them into efficient FHE programs. We achieve this by proposing a novel single-shot multiplexed packing strategy for arbitrary convolutions and through a new, efficient technique to automate bootstrap placement and scale management. We evaluate Orion on common benchmarks used by the FHE deep learning community and outperform state-of-the-art by 2.38 × on ResNet-20, the largest network they report. Orion's techniques enable processing much deeper and larger networks. We demonstrate this by evaluating ResNet-50 on ImageNet and present the first high-resolution FHE object detection experiments using a YOLO-v1 model with 139 million parameters. Orion is open-source for all to use at: \hrefhttps://github.com/baahl-nyu/orion https://github.com/baahl-nyu/orion.
KW - compilers
KW - cryptography
KW - fully homomorphic encryption
KW - privacy-preserving machine learning
UR - http://www.scopus.com/inward/record.url?scp=105002564330&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=105002564330&partnerID=8YFLogxK
U2 - 10.1145/3676641.3716008
DO - 10.1145/3676641.3716008
M3 - Conference contribution
AN - SCOPUS:105002564330
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 734
EP - 749
BT - ASPLOS 2025 - Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
PB - Association for Computing Machinery
Y2 - 30 March 2025 through 3 April 2025
ER -