Application of machine learning algorithms to screen potential biomarkers under cadmium exposure based on human urine metabolic profiles
Published in: | Chinese chemical letters 2022-12, Vol.33 (12), p.5184-5188 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Summary: | Exposure to environmental cadmium increases the health risk of residents. Early urine metabolic profiling using high-resolution mass spectrometry and machine learning algorithms would help predict adverse health effects. Here, we applied machine learning approaches to screen potential biomarkers of cadmium exposure in 403 urine samples. In positive and negative ionization modes, 4207 and 3558 features were extracted, respectively. We compared seven machine learning algorithms and found that the extreme gradient boosting (XGBoost) and random forest (RF) classifiers showed better accuracy and predictive performance than the others. Following 5-fold cross-validation, the area under the curve (AUC) of the XGBoost classifier was 0.93 in both positive and negative ionization modes. For the RF classifier, the AUCs were 0.80 and 0.84 in positive and negative ionization modes, respectively. We then identified a biomarker panel based on the XGBoost and RF classifiers. Incorporating machine learning models into urine analysis by high-resolution mass spectrometry could allow a convenient assessment of cadmium exposure.
Urine metabolic detection based on high-resolution mass spectrometry was conducted on a cohort of 403 volunteers who had been exposed to cadmium. Seven machine learning algorithms were compared on the LC-HRMS data set, and a biomarker panel was identified based on the selected machine learning models. The extreme gradient boosting and random forest classifiers showed better accuracy and predictive performance than the others, so this study provides a new reference for selecting data-driven machine learning algorithms for the metabolic analysis of urine under cadmium exposure. |
---|---|
ISSN: | 1001-8417 1878-5964 |
DOI: | 10.1016/j.cclet.2022.03.020 |
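
The summary reports AUC values obtained after 5-fold cross-validation. As a rough illustration of what that metric measures, the sketch below computes a cross-validated AUC in plain Python. It is not the authors' pipeline: the function names, the stratified splitting scheme, the single-feature toy data, and the mean-comparison scorer are all assumptions made for this example, and a real analysis would use thousands of mass spectrometry features and a classifier such as XGBoost or random forest.

```python
import random
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping both classes in each fold."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    # Deal indices round-robin so class proportions stay similar per fold.
    return [pos[i::k] + neg[i::k] for i in range(k)]


def cross_validated_auc(values, labels, fit, k=5, seed=0):
    """Mean AUC over k held-out folds; `fit` returns a scoring function."""
    fold_aucs = []
    for held_out in stratified_folds(labels, k, seed):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        score = fit([values[i] for i in train], [labels[i] for i in train])
        fold_aucs.append(roc_auc([labels[i] for i in held_out],
                                 [score(values[i]) for i in held_out]))
    return mean(fold_aucs)


def fit_mean_direction(train_values, train_labels):
    """Hypothetical toy scorer: orient the feature so positives score higher."""
    pos_mean = mean(v for v, y in zip(train_values, train_labels) if y == 1)
    neg_mean = mean(v for v, y in zip(train_values, train_labels) if y == 0)
    sign = 1.0 if pos_mean >= neg_mean else -1.0
    return lambda v: sign * v


# Toy, perfectly separable data (NOT the study's data): 10 samples per class.
toy_values = [float(v) for v in range(20)]
toy_labels = [0] * 10 + [1] * 10
toy_auc = cross_validated_auc(toy_values, toy_labels, fit_mean_direction)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes, which is the scale on which the reported values of 0.93 (XGBoost) and 0.80 and 0.84 (RF) should be read.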