
A study of the role of data and model uncertainty in active learning



Bibliographic Details
Published in:	Computational materials science 2025-01, Vol.247, p.113512, Article 113512
Main Authors:	Li, Yahao; Jiang, Errui; Ni, Ziqi; Li, Wudi; Huang, Ming; Zhao, Fengyuan; Liu, Fengqi; Ye, Yicong; Bai, Shuxin
Format:	Article
Language:	English
Subjects:	Acquisition function; Active learning; Materials design; Model uncertainty

Description
Summary:	• The effects of model uncertainty and data uncertainty on the number of iterations required to find the best samples are explored separately for active learning strategies.
• Three kinds of acquisition functions are compared for model uncertainty; the active learning model using the ranking of predicted value strategy requires the lowest average number of iterations (1.75).
• When the uncertainty of the observations is considered, the active learning iterations of the three strategies become similar at the optimal weighting, whereas incorporating noise samples into the augmented dataset severely degrades the recommendation efficiency of active learning.
Uncertainty-based active learning strategies have demonstrated significant superiority in small-data research in the materials domain. This study explores the effects of model uncertainty and data uncertainty separately on the performance of active learning strategies, focusing on the number of iterations required to identify the optimal samples. For model uncertainty, three kinds of acquisition functions are compared: the predicted value strategy (PV), the ranking of predicted value strategy (PR) and the expected improvement strategy (EI). Among these, the active learning model using PR requires the fewest average iterations (1.75). For data uncertainty, we evaluate the number of active learning iterations using Gaussian process models that incorporate the uncertainty of the observations, and using noise samples that account for the uncertainty of the input features. The results indicate that, when the uncertainty of the observations is considered in the model, the active learning iterations of the three strategies converge to similar values at the optimal weighting (1.75 for EI, 1.21 for PV and 1.18 for PR). In contrast, incorporating noise samples into the augmented dataset alongside the original samples severely degrades the efficiency of active learning recommendations. Our findings aim to offer guidance for exploring more favorable acquisition functions and methods for active learning strategies.
ISSN:	0927-0256
DOI:	10.1016/j.commatsci.2024.113512
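
Illustration:	The summary names the acquisition functions but does not define them. As a rough sketch only, the snippet below contrasts a predicted value pick (assumed here to mean "choose the candidate with the highest predicted mean") with the standard closed-form expected improvement for a maximization problem under a Gaussian prediction. The function names, the candidate values and the maximization convention are all assumptions for illustration and are not taken from the article; the ranking of predicted value strategy is not sketched because the summary gives no definition for it.

```python
import math


def expected_improvement(mu, sigma, best_so_far):
    """Standard closed-form EI for maximization with a Gaussian prediction.

    mu, sigma: predicted mean and standard deviation for one candidate.
    best_so_far: best observed target value (assumed convention: larger is better).
    """
    if sigma <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mu - best_so_far, 0.0)
    z = (mu - best_so_far) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best_so_far) * cdf + sigma * pdf


def pick_by_predicted_value(mus):
    """Assumed reading of a PV-style rule: index of the highest predicted mean."""
    return max(range(len(mus)), key=lambda i: mus[i])


def pick_by_expected_improvement(mus, sigmas, best_so_far):
    """Index of the candidate with the largest expected improvement."""
    scores = [expected_improvement(m, s, best_so_far) for m, s in zip(mus, sigmas)]
    return max(range(len(scores)), key=lambda i: scores[i])


# Hypothetical surrogate predictions for three unlabeled candidates.
mus = [1.0, 0.9, 0.5]
sigmas = [0.01, 0.5, 0.1]
best = 0.95

pv_choice = pick_by_predicted_value(mus)
ei_choice = pick_by_expected_improvement(mus, sigmas, best)
```

With these made-up numbers the two rules disagree: the mean-only rule selects the first candidate, while expected improvement favors the second, whose mean is slightly lower but whose predictive uncertainty is much larger. This is only meant to show how model uncertainty can enter the recommendation step; it says nothing about the results reported in the article.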