TY - JOUR
T1 - CryptoNAS: Private inference on a ReLU budget
T2 - 34th Conference on Neural Information Processing Systems, NeurIPS 2020
AU - Ghodsi, Zahra
AU - Veldanda, Akshaj
AU - Reagen, Brandon
AU - Garg, Siddharth
N1 - Funding Information:
This project was funded in part by NSF grants #1801495 and #1646671.
Publisher Copyright:
© 2020 Neural information processing systems foundation. All rights reserved.
PY - 2020
Y1 - 2020
N2 - Machine learning as a service has given rise to privacy concerns surrounding clients’ data and providers’ models and has catalyzed research in private inference (PI): methods to process inferences without disclosing inputs. Recently, researchers have adapted cryptographic techniques to show PI is possible; however, all solutions increase inference latency beyond practical limits. This paper makes the observation that existing models are ill-suited for PI and proposes a novel NAS method, named CryptoNAS, for finding and tailoring models to the needs of PI. The key insight is that in PI, operator latency costs are inverted: non-linear operations (e.g., ReLU) dominate latency, while linear layers become effectively free. We develop the idea of a ReLU budget as a proxy for inference latency and use CryptoNAS to build models that maximize accuracy within a given budget. CryptoNAS improves accuracy by 3.4% and latency by 2.4× over the state-of-the-art.
AB - Machine learning as a service has given rise to privacy concerns surrounding clients’ data and providers’ models and has catalyzed research in private inference (PI): methods to process inferences without disclosing inputs. Recently, researchers have adapted cryptographic techniques to show PI is possible; however, all solutions increase inference latency beyond practical limits. This paper makes the observation that existing models are ill-suited for PI and proposes a novel NAS method, named CryptoNAS, for finding and tailoring models to the needs of PI. The key insight is that in PI, operator latency costs are inverted: non-linear operations (e.g., ReLU) dominate latency, while linear layers become effectively free. We develop the idea of a ReLU budget as a proxy for inference latency and use CryptoNAS to build models that maximize accuracy within a given budget. CryptoNAS improves accuracy by 3.4% and latency by 2.4× over the state-of-the-art.
UR - http://www.scopus.com/inward/record.url?scp=85102548032&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102548032&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85102548032
VL - 2020-December
JO - Advances in Neural Information Processing Systems
JF - Advances in Neural Information Processing Systems
SN - 1049-5258
Y2 - 6 December 2020 through 12 December 2020
ER -