TY - GEN
T1 - Circa: Stochastic ReLUs for Private Deep Learning
T2 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
AU - Ghodsi, Zahra
AU - Jha, Nandan Kumar
AU - Reagen, Brandon
AU - Garg, Siddharth
N1 - Funding Information:
This work was supported in part by the Applications Driving Architectures (ADA) Research Center, a JUMP Center co-sponsored by the Semiconductor Research Corporation (SRC) and the Defense Advanced Research Projects Agency (DARPA), by the DARPA Data Protection in Virtual Environments (DPRIVE) program (contract HR0011-21-9-0003), and by National Science Foundation (NSF) grants 1646671 and 1801495. The views, opinions, and/or findings expressed are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.
Publisher Copyright:
© 2021 Neural Information Processing Systems Foundation. All rights reserved.
PY - 2021
Y1 - 2021
N2 - The simultaneous rise of machine learning as a service and concerns over user privacy have increasingly motivated the need for private inference (PI). While recent work demonstrates PI is possible using cryptographic primitives, the computational overheads render it impractical. State-of-the-art deep networks are inadequate in this context because the source of slowdown in PI stems from the ReLU operations, whereas optimizations for plaintext inference focus on reducing FLOPs. In this paper, we rethink ReLU computations and propose optimizations for PI tailored to the properties of neural networks. Specifically, we reformulate ReLU as an approximate sign test and introduce a novel truncation method for the sign test that significantly reduces the cost per ReLU. These optimizations result in a specific type of stochastic ReLU. The key observation is that the stochastic fault behavior is well suited to the fault-tolerant properties of neural network inference. Thus, we provide significant savings without impacting accuracy. We collectively call these optimizations Circa and demonstrate improvements of up to 4.7× in storage and 3× in runtime over baseline implementations; we further show that Circa can be used on top of recent PI optimizations to obtain an additional 1.8× speedup.
AB - The simultaneous rise of machine learning as a service and concerns over user privacy have increasingly motivated the need for private inference (PI). While recent work demonstrates PI is possible using cryptographic primitives, the computational overheads render it impractical. State-of-the-art deep networks are inadequate in this context because the source of slowdown in PI stems from the ReLU operations, whereas optimizations for plaintext inference focus on reducing FLOPs. In this paper, we rethink ReLU computations and propose optimizations for PI tailored to the properties of neural networks. Specifically, we reformulate ReLU as an approximate sign test and introduce a novel truncation method for the sign test that significantly reduces the cost per ReLU. These optimizations result in a specific type of stochastic ReLU. The key observation is that the stochastic fault behavior is well suited to the fault-tolerant properties of neural network inference. Thus, we provide significant savings without impacting accuracy. We collectively call these optimizations Circa and demonstrate improvements of up to 4.7× in storage and 3× in runtime over baseline implementations; we further show that Circa can be used on top of recent PI optimizations to obtain an additional 1.8× speedup.
UR - http://www.scopus.com/inward/record.url?scp=85124788453&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85124788453&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85124788453
T3 - Advances in Neural Information Processing Systems
SP - 2241
EP - 2252
BT - Advances in Neural Information Processing Systems 34 - 35th Conference on Neural Information Processing Systems, NeurIPS 2021
A2 - Ranzato, Marc'Aurelio
A2 - Beygelzimer, Alina
A2 - Dauphin, Yann
A2 - Liang, Percy S.
A2 - Wortman Vaughan, Jenn
PB - Neural Information Processing Systems Foundation
Y2 - 6 December 2021 through 14 December 2021
ER -