Distinguishing raw pu-erh tea production regions through a combination of HS-SPME-GC-MS and machine learning algorithms

Bibliographic Details
Published in: Food science & technology 2023-08, Vol.185, p.115140, Article 115140
Main Authors: Xiong, Zhichao; Feng, Wanzhen; Xia, Dongzhou; Zhang, Jixin; Wei, Yuming; Li, Tiehan; Huang, Junlan; Wang, Yujie; Ning, Jingming
Format: Article
Language: English
Description
Summary: The authenticity of the geographical origin of agricultural products has received widespread attention. Tea tree variety, processing method, and origin influence the quality and price of raw pu-erh tea (RPT). This study distinguished RPT from 10 production areas using headspace solid-phase microextraction-gas chromatography-mass spectrometry (HS-SPME-GC-MS) combined with an orthogonal partial least squares–discriminant analysis (OPLS-DA) model and machine learning algorithms. Of the 35 common volatiles identified, pentanal, heptanal, naphthalene, cedrol, and 2,6-di-tert-butylbenzoquinone were selected as the key differential compounds separating the 10 production areas, based on the variable importance in projection (VIP) values of the OPLS-DA model and the coefficient weights of the linear discriminant analysis function. Of these, heptanal and 2,6-di-tert-butylbenzoquinone had their highest content in West of Bingdao and their lowest content in Nannuoshan. Using the five key compounds, the random forest algorithm discriminated the 63 RPT samples with an accuracy of 98.4%. The random forest model was shown to be reliable and valid using receiver operating characteristic curves (area under the curve = 0.7603). The results serve as a reference for differentiating the 10 production areas of RPT.
• Identification of 35 common volatile compounds from 63 raw pu-erh tea samples.
• There was a strong correlation between production sites and volatile compounds.
• Five key differential compounds were found to distinguish between the 10 regions.
• A prediction model was established for the origin of raw pu-erh tea.
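The area under the ROC curve quoted in the summary can be read as the probability that a sample from a given production area receives a higher model score than a sample from any other area. The following is a minimal sketch of that one-vs-rest calculation in plain Python, not the authors' code: the function name `one_vs_rest_auc`, the labels, and the scores are all hypothetical illustrations and are not data from the study.

```python
from itertools import product


def one_vs_rest_auc(scores, labels, positive):
    """AUC for one class against all others, via the rank formulation.

    Equals the fraction of (positive, negative) sample pairs in which the
    positive sample has the higher score; tied pairs count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical region labels and classifier scores for the class
# "West of Bingdao" (illustrative values only, not from the paper).
labels = ["West of Bingdao", "West of Bingdao",
          "Nannuoshan", "Nannuoshan", "Nannuoshan"]
scores = [0.9, 0.4, 0.6, 0.2, 0.1]

auc = one_vs_rest_auc(scores, labels, positive="West of Bingdao")
print(round(auc, 4))  # 5 of 6 pairs ranked correctly -> 0.8333
```

With 10 production areas, one such value would be computed per area and then averaged or reported separately; the summary does not say which convention was used.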
ISSN: 0023-6438, 1096-1127
DOI: 10.1016/j.lwt.2023.115140