Improving classification performance of extreme gradient boosting on small-sized dataset to classify Turkish and Italian wines along with elemental profiling by inductively coupled plasma-mass spectrometry

Bibliographic Details
Published in: Spectroscopy Letters 2022-01, Vol. 55 (1), p. 1-12
Main Authors: Alp, Hande; Alp, Orkun
Format: Article
Language: English
Description
Summary: In this study, the classification performance of the extreme gradient boosting algorithm on a small dataset was improved by training it on a synthetic dataset generated with kernel density estimation. The concentrations of 29 elements in wine samples produced in Turkey (domestic) and Italy (imported) were determined by inductively coupled plasma-mass spectrometry, and the results were used to build the dataset. Classification of the wine samples was first assessed with extreme gradient boosting, which is prone to overfitting on small datasets and therefore gave poor classification performance. To improve performance, a synthetic dataset was created and the algorithm was trained on it instead of the original dataset. With the proposed method, the accuracy of the model improved from 76.7% to 81.7%. The precision values for Turkish and Italian wines increased from 78.4% to 84.1% and from 70.9% to 79.4%, respectively. The variable importance determined by the extreme gradient boosting algorithm showed that beryllium and cesium were significantly more important than the other elements. Tin, phosphorus, cobalt, lead, calcium, copper, zinc, and aluminum followed, completing the top 10 elements for classifying Turkish and Italian wine samples.
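
The synthetic-data step described in the summary can be sketched as follows. This is an illustration only, not the authors' implementation: the summary does not state the kernel, the bandwidth, or the number of synthetic samples, so the Gaussian kernel, Silverman's rule-of-thumb bandwidth, the function names, and the toy concentrations below are all assumptions. Sampling from a Gaussian kernel density estimate amounts to picking a measured sample at random and adding Gaussian noise scaled by the bandwidth, done separately for each class so the labels carry over.

```python
import random
import statistics


def silverman_bandwidths(rows):
    """Per-feature bandwidth from Silverman's rule of thumb (an assumed choice)."""
    n, d = len(rows), len(rows[0])
    factor = (4.0 / (d + 2)) ** (1.0 / (d + 4)) * n ** (-1.0 / (d + 4))
    # Requires at least two rows per class, otherwise stdev is undefined.
    return [factor * statistics.stdev(column) for column in zip(*rows)]


def sample_kde(rows, n_samples, rng):
    """Draw from a Gaussian KDE: pick a real row, then jitter each feature."""
    bandwidths = silverman_bandwidths(rows)
    synthetic = []
    for _ in range(n_samples):
        centre = rng.choice(rows)
        synthetic.append([rng.gauss(x, h) for x, h in zip(centre, bandwidths)])
    return synthetic


def synthesize_by_class(samples, labels, n_per_class, seed=0):
    """Fit one KDE per class and return synthetic rows with their labels."""
    rng = random.Random(seed)
    out_x, out_y = [], []
    for label in sorted(set(labels)):
        rows = [x for x, y in zip(samples, labels) if y == label]
        out_x.extend(sample_kde(rows, n_per_class, rng))
        out_y.extend([label] * n_per_class)
    return out_x, out_y


# Hypothetical two-element concentrations, NOT data from the study.
toy_x = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
         [3.0, 20.0], [3.3, 21.0], [2.8, 19.0]]
toy_y = ["TR", "TR", "TR", "IT", "IT", "IT"]

synth_x, synth_y = synthesize_by_class(toy_x, toy_y, n_per_class=100)
print(len(synth_x), synth_y.count("TR"), synth_y.count("IT"))
```

A classifier such as extreme gradient boosting would then be trained on `synth_x` and `synth_y` and evaluated on the original measurements. Note that Gaussian noise can produce negative values, which are not physical for concentrations, so a real pipeline would probably need to clip or transform the data first.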
ISSN: 0038-7010, 1532-2289
DOI: 10.1080/00387010.2021.2008977