Scalable log determinants for Gaussian process kernel learning

Kun Dong, David Eriksson, Hannes Nickisch, David Bindel, Andrew Gordon Wilson

Research output: Contribution to journalConference article

Abstract

For applications as varied as Bayesian neural networks, determinantal point processes, elliptical graphical models, and kernel learning for Gaussian processes (GPs), one must compute a log determinant of an n × n positive definite matrix, and its derivatives - leading to prohibitive O(n3) computations. We propose novel O(n) approaches to estimating these quantities from only fast matrix vector multiplications (MVMs). These stochastic approximations are based on Chebyshev, Lanczos, and surrogate models, and converge quickly even for kernel matrices that have challenging spectra. We leverage these approximations to develop a scalable Gaussian process approach to kernel learning. We find that Lanczos is generally superior to Chebyshev for kernel learning, and that a surrogate approach can be highly efficient and accurate with popular kernels.

Original languageEnglish (US)
Pages (from-to)6328-6338
Number of pages11
JournalAdvances in Neural Information Processing Systems
Volume2017-December
StatePublished - 2017
Event31st Annual Conference on Neural Information Processing Systems, NIPS 2017 - Long Beach, United States
Duration: Dec 4 2017Dec 9 2017

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Fingerprint Dive into the research topics of 'Scalable log determinants for Gaussian process kernel learning'. Together they form a unique fingerprint.

Cite this