Loading…
Using high-fidelity meta-models to improve performance of small dataset trained Bayesian Networks
•Bayesian Networks often cannot be used with small datasets due to accuracy concerns.•Kriging and radial-basis function meta-models are viable options for augmenting datasets.•Bayesian network accuracy increases when using meta-model generated data. Machine Learning (ML) is increasingly being used b...
Saved in:
Published in: | Expert systems with applications 2020-01, Vol.139, p.112830, Article 112830 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | •Bayesian Networks often cannot be used with small datasets due to accuracy concerns.•Kriging and radial-basis function meta-models are viable options for augmenting datasets.•Bayesian network accuracy increases when using meta-model generated data.
Machine Learning (ML) is increasingly being used by companies like Google, Amazon and Apple to help identify market trends and predict customer behavior. Continuous improvement and maturing of these ML tools will help improve decision making across a number of industries. Unfortunately, before many ML strategies can be utilized the methods often require large amounts of data. For a number of realistic situations, however, only smaller subsets of data are available (i.e. hundreds to thousands of points). This work explores this problem by investigating the feasibility of using meta-models, specifically Kriging and Radial Basis Functions, to generate data for training a BN when only small amounts of original data are available. This paper details the meta-model creation process and the results of using Particle Swarm Optimization (PSO) for tuning parameters for four network structures trained using three relatively small data sets. Additionally, a series of experiments augment these small datasets by generating ten thousand, one-hundred thousand, and a million synthetic data points using the Kriging and RBF meta-models as well as intelligently establishing prior probabilities using PSO. Results show that augmenting limited existing datasets with meta-model generated data can dramatically affect network accuracy. Overall, the exploratory results presented in this paper demonstrate the feasibility of using meta-model generated data to increase the accuracy of small sample set trained BN. Further developing this method will help underserved areas with access to only small datasets make use of the powerful predictive analytics of ML. |
---|---|
ISSN: | 0957-4174 1873-6793 |
DOI: | 10.1016/j.eswa.2019.112830 |