Application of machine learning algorithms to screen potential biomarkers under cadmium exposure based on human urine metabolic profiles

Bibliographic Details
Published in: Chinese Chemical Letters 2022-12, Vol.33 (12), p.5184-5188
Main Authors: Zeng, Ting; Liang, Yanshan; Dai, Qingyuan; Tian, Jinglin; Chen, Jinyao; Lei, Bo; Yang, Zhu; Cai, Zongwei
Format: Article
Language: English
Description
Summary: Exposure to environmental cadmium increases residents' health risks. Early detection of urinary metabolic changes using high-resolution mass spectrometry and machine learning algorithms would help predict adverse health effects. Here, we applied machine learning approaches to screen potential biomarkers of cadmium exposure in 403 urine samples. In positive and negative ionization modes, 4207 and 3558 features were extracted, respectively. We compared seven machine learning algorithms and found that the extreme gradient boosting (XGBoost) and random forest (RF) classifiers showed better accuracy and predictive performance than the others. Following 5-fold cross-validation, the XGBoost classifier achieved an area under the curve (AUC) of 0.93 in both positive and negative ionization modes. The RF classifier achieved AUCs of 0.80 and 0.84 in positive and negative ionization modes, respectively. We then identified a biomarker panel based on the XGBoost and RF classifiers. Incorporating machine learning models into urine analysis by high-resolution mass spectrometry could allow a convenient assessment of cadmium exposure.

In a cohort of 403 volunteers who had been exposed to cadmium, urine metabolic detection based on high-resolution mass spectrometry was conducted, seven machine learning algorithms were compared on the LC-HRMS data set, and a biomarker panel based on the selected machine learning models was identified. The extreme gradient boosting and random forest classifiers showed better accuracy and predictive performance than the others, which provides a new reference for selecting data-driven machine learning algorithms for metabolic analysis of urine under cadmium exposure.
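The comparison described in the summary (several classifiers scored by AUC under 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the feature count and hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` is used as a stand-in for XGBoost so that the example depends on a single library.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for a metabolic feature table (assumption: 403 samples
# as in the summary; 200 features chosen only to keep the example fast).
X, y = make_classification(
    n_samples=403, n_features=200, n_informative=20, random_state=0
)

# Two of the classifier families named in the summary. The gradient boosting
# model here is scikit-learn's implementation, not XGBoost itself.
models = {
    "gradient_boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified 5-fold cross-validation keeps the class ratio similar in each fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

mean_auc = {}
for name, model in models.items():
    # scoring="roc_auc" computes the area under the ROC curve on each held-out fold.
    fold_scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
    mean_auc[name] = float(np.mean(fold_scores))
    print(f"{name}: mean AUC over 5 folds = {mean_auc[name]:.2f}")
```

On real data, the positive and negative ionization mode feature tables would each be evaluated separately in this way, which is how the summary reports its AUC values.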
ISSN: 1001-8417, 1878-5964
DOI: 10.1016/j.cclet.2022.03.020