TY - GEN
T1 - Low-Precision Arithmetic for Fast Gaussian Processes
AU - Maddox, Wesley J.
AU - Potapczynski, Andres
AU - Wilson, Andrew Gordon
N1 - Funding Information:
Acknowledgements: We would like to thank Chris De Sa and Ke Alexander Wang for helpful discussions. This research is supported by an Amazon Research Award, Facebook Research, Google Research, Capital One, NSF CAREER IIS-2145492, NSF I-DISRE 193471, NIH R01DA048764-01A1, NSF IIS-1910266, and NSF 1922658 NRT-HDR.
Publisher Copyright:
© 2022 Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022. All rights reserved.
PY - 2022
Y1 - 2022
AB - Low-precision arithmetic has had a transformative effect on the training of neural networks, reducing computation, memory, and energy requirements. However, despite its promise, low-precision arithmetic has received little attention for Gaussian process (GP) training, largely because GPs require sophisticated linear algebra routines that are unstable in low precision. We study the different failure modes that can occur when training GPs in half precision. To circumvent these failure modes, we propose a multi-faceted approach involving conjugate gradients with re-orthogonalization, mixed precision, and preconditioning. Our approach significantly improves the numerical stability and practical performance of conjugate gradients in low precision over a wide range of settings, enabling GPs to train on 1.8 million data points in 10 hours on a single GPU, without requiring any sparse approximations.
UR - http://www.scopus.com/inward/record.url?scp=85146148353&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85146148353&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85146148353
T3 - Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022
SP - 1306
EP - 1316
BT - Proceedings of the 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022
PB - Association For Uncertainty in Artificial Intelligence (AUAI)
T2 - 38th Conference on Uncertainty in Artificial Intelligence, UAI 2022
Y2 - 1 August 2022 through 5 August 2022
ER -