Robust Variational Learning for Multiclass Kernel Models With Stein Refinement

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, 2022-09, Vol. 34(9), pp. 4425-4438
Main Authors: Nguyen, Khanh; Le, Trung; Nguyen, Tu Dinh; Webb, Geoffrey I.; Phung, Dinh
Format: Article
Language: English
Summary: Kernel-based models have strong generalization ability, but most, including SVM, are vulnerable to the curse of kernelization. Moreover, their predictive performance is sensitive to hyperparameter tuning, which demands high computational resources. These problems render kernel methods problematic when dealing with large-scale datasets. To this end, we first formulate the optimization problem in a kernel-based learning setting as a posterior inference problem, and then develop a rich family of Recurrent Neural Network-based variational inference techniques. Unlike existing literature, which stops at the variational distribution and uses it as a surrogate for the true posterior, we further leverage Stein Variational Gradient Descent to bring the variational distribution closer to the true posterior; we refer to this step as Stein Refinement. Putting these together, we arrive at a robust and efficient variational learning method for multiclass kernel machines with extremely accurate approximation. Moreover, our formulation enables efficient learning of kernel parameters and hyperparameters, which robustifies the proposed method against data uncertainties. Extensive experiments show that, without tuning any parameter on modest quantities of data, our method obtains accuracy comparable to LIBSVM, a well-known implementation of SVM, and outperforms other baselines, while scaling seamlessly to large-scale datasets.
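The Stein Refinement step described in the abstract relies on Stein Variational Gradient Descent (SVGD), which iteratively transports a set of particles toward the target posterior. A minimal sketch of the standard SVGD particle update is given below; the RBF kernel, bandwidth, step size, and the toy Gaussian target are illustrative assumptions for exposition, not the paper's exact configuration.

```python
import numpy as np

def svgd_update(particles, grad_log_p, bandwidth=1.0, step=0.1):
    """One Stein Variational Gradient Descent step with an RBF kernel.

    particles:  (n, d) array of samples from the current variational
                distribution.
    grad_log_p: callable mapping (n, d) particles to the (n, d) score
                grad log p(x) of the target posterior.
    """
    diffs = particles[:, None, :] - particles[None, :, :]  # x_i - x_j, shape (n, n, d)
    sq_dists = np.sum(diffs ** 2, axis=-1)                 # (n, n)
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))         # RBF kernel matrix
    # Gradient of k(x_j, x_i) w.r.t. x_j is (x_i - x_j) / h^2 * k (repulsive term).
    grad_k = diffs * (k / bandwidth ** 2)[:, :, None]
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ grad_log_p(particles) + grad_k.sum(axis=1)) / particles.shape[0]
    return particles + step * phi

# Toy target: standard normal posterior, so grad log p(x) = -x.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=0.5, size=(50, 1))  # particles start far from the target
for _ in range(500):
    x = svgd_update(x, lambda p: -p)
# The particle cloud drifts toward mean 0 and spreads toward unit variance.
```

The attractive term (kernel-weighted scores) pulls particles toward high-density regions, while the kernel-gradient term pushes them apart so the cloud covers the posterior rather than collapsing to its mode.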
ISSN: 1041-4347, 1558-2191
DOI: 10.1109/TKDE.2020.3041509