TY - GEN
T1 - NNSmith: Generating Diverse and Valid Test Cases for Deep-Learning Compilers
T2 - 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2023
AU - Liu, Jiawei
AU - Lin, Jinkun
AU - Ruffy, Fabian
AU - Tan, Cheng
AU - Li, Jinyang
AU - Panda, Aurojit
AU - Zhang, Lingming
N1 - Funding Information:
We thank the ASPLOS reviewers for their insightful comments. We also thank Yuanyi Zhong, Lingfan Yu and Leyuan Wang for insightful discussions in the early stages of the project, Jinjun Peng for helping open-source the project, and the NYU IT High Performance Computing group for providing computing resources. This work was partially supported by the National Science Foundation grants CCF-2131943 and CCF-2141474, a Google research award, a Meta research award, an AMD research award, and a gift from Microsoft Corporation.
Publisher Copyright:
© 2023 ACM.
PY - 2023/1/27
Y1 - 2023/1/27
AB - Deep-learning (DL) compilers such as TVM and TensorRT are increasingly being used to optimize deep neural network (DNN) models to meet performance, resource utilization and other requirements. Bugs in these compilers can result in models whose semantics differ from the original ones, producing incorrect results that corrupt the correctness of downstream applications. However, finding bugs in these compilers is challenging due to their complexity. In this work, we propose a new fuzz testing approach for finding bugs in deep-learning compilers. Our core approach consists of (i) generating diverse yet valid DNN test models that can exercise a large part of the compiler's transformation logic using light-weight operator specifications; (ii) performing gradient-based search to find model inputs that avoid any floating-point exceptional values during model execution, reducing the chance of missed bugs or false alarms; and (iii) using differential testing to identify bugs. We implemented this approach in NNSmith, which has found 72 new bugs for TVM, TensorRT, ONNXRuntime, and PyTorch to date. Of these, 58 have been confirmed and 51 have been fixed by their respective project maintainers.
KW - Compiler Testing
KW - Deep Learning Compilers
KW - Fuzzing
UR - http://www.scopus.com/inward/record.url?scp=85147732300&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85147732300&partnerID=8YFLogxK
U2 - 10.1145/3575693.3575707
DO - 10.1145/3575693.3575707
M3 - Conference contribution
AN - SCOPUS:85147732300
T3 - International Conference on Architectural Support for Programming Languages and Operating Systems - ASPLOS
SP - 530
EP - 543
BT - ASPLOS 2023 - Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
A2 - Aamodt, Tor M.
A2 - Jerger, Natalie Enright
A2 - Swift, Michael
PB - Association for Computing Machinery
Y2 - 25 March 2023 through 29 March 2023
ER -