
Construction of interpretable ensemble learning models for predicting bioaccumulation parameters of organic chemicals in fish

Bibliographic Details
Published in: Journal of Hazardous Materials, 2025-01, Vol. 482, p. 136606, Article 136606
Main Authors: Zhu, Minghua; Xiao, Zijun; Zhang, Tao; Lu, Guanghua
Format: Article
Language: English
Description
Summary: Accurate prediction of bioaccumulation parameters is essential for assessing the exposure, hazards, and risks of chemicals. However, most prediction models for bioaccumulation parameters are individual models based on a single algorithm and lack model interpretation, resulting in unsatisfactory prediction accuracy due to inherent constraints of the algorithm and weak interpretability. Ensemble learning (EL), which combines multiple algorithms, coupled with the SHapley Additive exPlanation (SHAP) method, may overcome these limitations. Herein, EL models were constructed for three bioaccumulation parameters using datasets covering 2496 chemicals. The EL models demonstrated superior prediction accuracy compared to both the individual models developed in this study and those from previous research, achieving a coefficient of determination of up to 0.861 on the validation sets. Applicability domains were characterized using a structure-activity landscape-based methodology (abbreviated as ADSAL). The optimal EL models, together with the ADSAL, were successfully used to predict bioaccumulation parameters for 4374 chemicals included in the Inventory of Existing Chemical Substances of China. Model interpretation using the SHAP method offered insight into key features influencing bioaccumulation potential, including hydrophobicity, water solubility, polarizability, ionization potential, and the weight and volume of molecules. Overall, the study provides data and models to support the sound management and risk assessment of chemicals.
• Ensemble learning (EL) and individual models were constructed for predicting bioaccumulation parameters.
• EL models exhibited higher prediction accuracy than individual models.
• Key factors impacting bioaccumulation were quantified by SHAP analysis.
• Optimal EL models with ADSAL were used to predict bioaccumulation parameters for 4374 chemicals.
• EL, together with SHAP analysis, has the potential to enhance model performance and interpretability.
ISSN: 0304-3894, 1873-3336
DOI: 10.1016/j.jhazmat.2024.136606