Stochastic Gradient Descent for Gaussian Processes Done Right

Bibliographic Details
Published in:	arXiv.org 2024-04
Main Authors:	Lin, Jihao Andreas; Padhy, Shreyas; Antorán, Javier; Tripp, Austin; Terenin, Alexander; Szepesvári, Csaba; Hernández-Lobato, José Miguel; Janz, David
Format:	Article
Language:	English
Subjects:	Ablation; Algorithms; Conjugate gradient method; Deep learning; Gaussian process; Graph neural networks; Machine learning; Optimization; Regression

Description
Summary:	As is well known, both sampling from the posterior and computing the mean of the posterior in Gaussian process regression reduce to solving a large linear system of equations. We study the use of stochastic gradient descent for solving this linear system, and show that when "done right", by which we mean using specific insights from the optimisation and kernel communities, stochastic gradient descent is highly effective. To that end, we introduce a particularly simple stochastic dual descent algorithm, explain its design in an intuitive manner and illustrate the design choices through a series of ablation studies. Further experiments demonstrate that our new method is highly competitive. In particular, our evaluations on the UCI regression tasks and on Bayesian optimisation set our approach apart from preconditioned conjugate gradients and variational Gaussian process approximations. Moreover, our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
ISSN:	2331-8422
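
Illustrative note: the reduction mentioned in the summary can be made concrete with a small sketch. For a zero-mean Gaussian process with kernel matrix K and noise variance σ², the posterior mean at a test point is k(x*, X)(K + σ²I)⁻¹y, so the expensive step is solving (K + σ²I)α = y. The Python code below is a simplified stand-in and not the authors' stochastic dual descent algorithm: it takes randomized block-coordinate gradient steps on the quadratic objective whose minimiser solves that system. The function names, toy data, kernel, batch size, step-size rule, and noise variance are all assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    sq_dists = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dists / lengthscale**2)

def solve_by_random_coordinates(A, y, batch_size=25, num_steps=5000, seed=0):
    """Approximately solve A @ alpha = y for a symmetric positive-definite A.

    Minimises 0.5 * alpha^T A alpha - alpha^T y, whose gradient is
    A @ alpha - y, by updating one random block of coordinates per step.
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    # Assumption: step size from the largest eigenvalue of A.
    # Computing it exactly is only affordable at this toy scale.
    step = 1.0 / np.linalg.eigvalsh(A)[-1]
    alpha = np.zeros(n)
    for _ in range(num_steps):
        idx = rng.choice(n, size=batch_size, replace=False)
        # Only batch_size rows of A are touched in each step.
        grad_block = A[idx] @ alpha - y[idx]
        alpha[idx] -= step * grad_block
    return alpha

# Hypothetical toy data: noisy observations of a sine curve.
rng = np.random.default_rng(1)
x_train = np.linspace(-3.0, 3.0, 100)
y_train = np.sin(2.0 * x_train) + 0.1 * rng.standard_normal(100)
# Assumption: a deliberately large noise variance keeps the system well conditioned.
noise_variance = 1.0

A = rbf_kernel(x_train, x_train) + noise_variance * np.eye(100)
alpha_iterative = solve_by_random_coordinates(A, y_train)
alpha_direct = np.linalg.solve(A, y_train)  # reference solution

# Posterior mean at a few test inputs, using the iterative solution.
x_test = np.linspace(-3.0, 3.0, 5)
posterior_mean = rbf_kernel(x_test, x_train) @ alpha_iterative
```

On this small problem the iterative weights can be compared with `alpha_direct`. At the scale the summary has in mind, a direct solve is impractical, which is the case for iterative solvers that read only a few rows of the kernel matrix per step.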