PriViT: Vision Transformers for Private Inference

Naren Dhyani, Jianqiao Cambridge Mo, Patrick Yubeaton, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde

Research output: Contribution to journal › Article › peer-review

Abstract

The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications. However, ViTs are ill-suited for private inference using secure multi-party computation (MPC) protocols, due to the large number of non-polynomial operations (self-attention, feed-forward rectifiers, layer normalization). We develop PriViT, a gradient-based algorithm to selectively Taylorize nonlinearities in ViTs while maintaining their prediction accuracy. Our algorithm is conceptually very simple, easy to implement, and achieves improved performance over existing MPC-friendly transformer architectures in terms of the latency-accuracy Pareto frontier.
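
As a rough illustration of what "Taylorizing" a nonlinearity can mean, the sketch below replaces GELU with its second-order Taylor polynomial around zero, GELU(x) ≈ x/2 + x²/√(2π), which uses only additions and multiplications and is therefore cheaper to evaluate under MPC. The function names and the scalar gate c are hypothetical and chosen for illustration; the abstract does not specify which polynomial or which selection mechanism PriViT uses, so this should not be read as the authors' implementation.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_taylor2(x: float) -> float:
    """Second-order Taylor expansion of GELU around 0 (polynomial only)."""
    return 0.5 * x + x * x / math.sqrt(2.0 * math.pi)


def gated_activation(x: float, c: float) -> float:
    """Blend the two activations with a hypothetical gate c in [0, 1].

    c = 1 keeps the original GELU; c = 0 uses the polynomial stand-in.
    In a gradient-based scheme, c would be a trainable parameter; here it
    is simply passed in by the caller (an assumption for illustration).
    """
    return c * gelu(x) + (1.0 - c) * gelu_taylor2(x)


if __name__ == "__main__":
    # Compare the exact and polynomial activations at a few points.
    for x in (-1.0, -0.1, 0.0, 0.1, 1.0):
        print(x, gelu(x), gelu_taylor2(x))
```

The polynomial tracks GELU closely near zero and drifts away for larger inputs, which is one reason a method might replace nonlinearities selectively rather than everywhere.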

Original language: English (US)
Journal: Transactions on Machine Learning Research
Volume: 2024
State: Published - 2024

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
