Discriminative models using molecular descriptors for predicting increased serum ALT levels in repeated-dose toxicity studies of rats

Bibliographic Details
Published in:	Computational toxicology 2018-05, Vol.6, p.64-70
Main Authors:	Takeshita, Jun-ichi; Nakayama, Haruka; Kitsunai, Yoko; Tanabe, Misako; Oki, Hitomi; Sasaki, Takamitsu; Yoshinari, Kouichi
Format:	Article
Language:	English
Subjects:	Discriminative models; Feature selection; Hepatotoxicity; Imbalanced data set; Molecular descriptors

Description
Summary:
• Discriminative models for predicting increased serum ALT levels in rats were developed.
• In order to develop the discriminative models, an in vivo database was used and logistic regression models were applied.
• The k-medoids method was used for selection of molecular descriptors.
• The Synthetic Minority Over-sampling Technique (SMOTE) algorithm was used for resolving imbalanced training data sets.

The demand for alternatives to animal experiment-based assessment is increasing. Alternatives for assessing repeated-dose toxicity, however, have yet to be developed. Our aim was to develop discriminative models for predicting an increase in serum ALT levels in rats, using molecular descriptors. In vivo data for rats in the training data sets were obtained using the Hazard Evaluation Support System Integrated Platform (HESS), and molecular descriptors were calculated using DRAGON 6. We developed the discriminative models based on logistic regression models; however, there were two statistical difficulties to be overcome: (i) the number of molecular descriptors was much greater than the number of compounds; (ii) the training data sets were imbalanced. In order to overcome these difficulties, the k-medoids method was employed in the case of the first difficulty, and the Synthetic Minority Over-sampling Technique (SMOTE) algorithm in the case of the second. One of the resulting models showed predictive capability, with sensitivity of 0.783, specificity of 0.745, and concordance of 0.750. Our results show that a statistical learning approach can create a discriminative model with high predictive capability using only information on the molecular descriptors of chemicals.
ISSN:	2468-1113
DOI:	10.1016/j.comtox.2017.05.002
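
The summary reports sensitivity, specificity, and concordance for one of the models. As a minimal sketch of how such figures relate to a binary classifier's predictions, the Python below computes them from true and predicted labels. It assumes that concordance means overall agreement (the fraction of compounds classified correctly), which is the usual sense in toxicology but is not defined in this record. The function name `classification_metrics` and the example labels are hypothetical and are not taken from the article.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute sensitivity, specificity, and concordance for binary labels.

    Labels are assumed to be 1 (ALT increase observed) or 0 (no increase).
    Concordance is assumed here to mean overall agreement (accuracy).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return {
        "sensitivity": tp / (tp + fn),            # positives correctly flagged
        "specificity": tn / (tn + fp),            # negatives correctly cleared
        "concordance": (tp + tn) / len(y_true),   # all correct calls
    }


# Made-up labels for illustration only (not data from the study):
# 10 positives, of which 8 are detected, and 40 negatives, of which 30 are cleared.
y_true = [1] * 10 + [0] * 40
y_pred = [1] * 8 + [0] * 2 + [0] * 30 + [1] * 10
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On an imbalanced data set, concordance is dominated by the majority class, so a model can score well on it while missing most positives. Reporting sensitivity and specificity separately, as the summary does, makes that visible.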