A two-layer aggregation model with effective consistency for large-scale Gaussian process regression

Bibliographic Details
Published in: Engineering Applications of Artificial Intelligence, 2021-11, Vol. 106, Article 104449
Main Authors: Wang, Wengsheng; Zhou, Changkai
Format: Article
Language: English
Description
Summary: To scale full Gaussian process (GP) regression to large-scale datasets, aggregation models divide the dataset into independent subsets for factorized training, and then aggregate the predictions of the resulting distributed experts. Some aggregation models produce consistent predictions, which converge to the latent function as the data size approaches infinity. In practice, however, these consistent predictions become ineffective because of the limited subset size available to each expert. Oriented by the transition from theory to practice, the key idea is to replace the experts of the Generalized Product of Experts (GPoE), which focuses on global information, with Generalized Robust Bayesian Committee Machines (GRBCMs) equipped with a corrective function, thereby removing the limitation imposed by the experts' subset size. This nested two-layer structure enables the proposed Generalized Product of Generalized Robust Bayesian Committee Machine (GPoGRBCM) to provide effective predictions on large-scale datasets and to inherit the virtues of aggregation models, e.g., a slightly flawed Bayesian inference framework and distributed/parallel computing. Furthermore, we compare GPoGRBCM against state-of-the-art aggregation models on a toy example and six real-world datasets, the largest with more than 3 million training points, showing dramatic improvements in scalability, capability, controllability, and robustness.

• The local GP model reduces time consumption while keeping predictions precise.
• Theoretically, the prediction accuracy of novel aggregation models is compared.
• In the application, multiple datasets and perspectives are used to judge the model.
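The divide-train-aggregate scheme described in the abstract can be illustrated with a minimal sketch of the baseline it generalizes: each expert is an exact GP trained on its own subset, and the experts' predictive Gaussians are fused by precision weighting as in a (generalized) product of experts. This is a hypothetical illustration of the underlying aggregation idea, not the authors' GPoGRBCM implementation; the kernel hyperparameters, uniform weights, and function names are assumptions made for the example.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0, variance=1.0):
    # Squared-exponential kernel between two point sets.
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gp_expert_predict(X, y, Xs, noise=1e-2):
    # Exact GP posterior mean and variance for one expert's subset.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xs)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = rbf_kernel(Xs, Xs).diagonal() + noise - np.sum(v**2, axis=0)
    return mu, var

def gpoe_aggregate(means, variances, betas=None):
    # Generalized PoE fusion: combined precision is a weighted sum of
    # expert precisions; the mean is the precision-weighted average.
    M = len(means)
    if betas is None:
        betas = [1.0 / M] * M  # uniform weights (a common simple choice)
    prec = sum(b / v for b, v in zip(betas, variances))
    mu = sum(b * m / v for b, m, v in zip(betas, means, variances)) / prec
    return mu, 1.0 / prec

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(200)
Xs = np.linspace(-3, 3, 50)[:, None]

# Factorized training: disjoint subsets, one exact GP per expert.
preds = [gp_expert_predict(X[idx], y[idx], Xs)
         for idx in np.array_split(rng.permutation(200), 4)]
mu, var = gpoe_aggregate([m for m, _ in preds], [v for _, v in preds])
```

Each expert only inverts a small kernel matrix (here 50 x 50 instead of 200 x 200), which is the source of the scalability the abstract refers to; the two-layer GPoGRBCM replaces these plain experts with GRBCM committees to avoid the accuracy loss caused by small subsets.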
ISSN: 0952-1976, 1873-6769
DOI: 10.1016/j.engappai.2021.104449