TY - GEN
T1 - Fast Kronecker inference in Gaussian processes with non-Gaussian likelihoods
AU - Flaxman, Seth
AU - Wilson, Andrew Gordon
AU - Neill, Daniel B.
AU - Nickisch, Hannes
AU - Smola, Alexander J.
N1 - Funding Information:
This work was partially supported by the National Science Foundation, grant HS-0953330. AGW thanks ONR grant N000141410684 and NIH grant R01GM093156.
PY - 2015
Y1 - 2015
N2 - Gaussian processes (GPs) are a flexible class of methods with state-of-the-art performance on spatial statistics applications. However, GPs require O(n^3) computations and O(n^2) storage, and popular GP kernels are typically limited to smoothing and interpolation. To address these difficulties, Kronecker methods have been used to exploit structure in the GP covariance matrix for scalability, while allowing for expressive kernel learning (Wilson et al., 2014). However, fast Kronecker methods have been confined to Gaussian likelihoods. We propose new scalable Kronecker methods for Gaussian processes with non-Gaussian likelihoods, using a Laplace approximation which involves linear conjugate gradients for inference, and a lower bound on the GP marginal likelihood for kernel learning. Our approach has near linear scaling, requiring O(Dn^{(D+1)/D}) operations and O(Dn^{2/D}) storage, for n training data-points on a dense D > 1 dimensional grid. Moreover, we introduce a log Gaussian Cox process, with highly expressive kernels, for modelling spatiotemporal count processes, and apply it to a point pattern (n = 233,088) of a decade of crime events in Chicago. Using our model, we discover spatially varying multiscale seasonal trends and produce highly accurate long-range local area forecasts.
AB - Gaussian processes (GPs) are a flexible class of methods with state-of-the-art performance on spatial statistics applications. However, GPs require O(n^3) computations and O(n^2) storage, and popular GP kernels are typically limited to smoothing and interpolation. To address these difficulties, Kronecker methods have been used to exploit structure in the GP covariance matrix for scalability, while allowing for expressive kernel learning (Wilson et al., 2014). However, fast Kronecker methods have been confined to Gaussian likelihoods. We propose new scalable Kronecker methods for Gaussian processes with non-Gaussian likelihoods, using a Laplace approximation which involves linear conjugate gradients for inference, and a lower bound on the GP marginal likelihood for kernel learning. Our approach has near linear scaling, requiring O(Dn^{(D+1)/D}) operations and O(Dn^{2/D}) storage, for n training data-points on a dense D > 1 dimensional grid. Moreover, we introduce a log Gaussian Cox process, with highly expressive kernels, for modelling spatiotemporal count processes, and apply it to a point pattern (n = 233,088) of a decade of crime events in Chicago. Using our model, we discover spatially varying multiscale seasonal trends and produce highly accurate long-range local area forecasts.
UR - http://www.scopus.com/inward/record.url?scp=84969506826&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84969506826&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84969506826
T3 - 32nd International Conference on Machine Learning, ICML 2015
SP - 607
EP - 616
BT - 32nd International Conference on Machine Learning, ICML 2015
A2 - Bach, Francis
A2 - Blei, David
PB - International Machine Learning Society (IMLS)
T2 - 32nd International Conference on Machine Learning, ICML 2015
Y2 - 6 July 2015 through 11 July 2015
ER -