Loading…
On QSAR-based cardiotoxicity modeling with the expressiveness-enhanced graph learning model and dual-threshold scheme
Given the direct association with malignant ventricular arrhythmias, cardiotoxicity is a major concern in drug design. In the past decades, computational models based on the quantitative structure-activity relationship have been proposed to screen out cardiotoxic compounds and have shown promising r...
Saved in:
Published in: | Frontiers in physiology 2023-05, Vol.14, p.1156286 |
---|---|
Main Authors: | , , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Given the direct association with malignant ventricular arrhythmias, cardiotoxicity is a major concern in drug design. In the past decades, computational models based on the quantitative structure-activity relationship have been proposed to screen out cardiotoxic compounds and have shown promising results. The combination of molecular fingerprint and the machine learning model shows stable performance for a wide spectrum of problems; however, not long after the advent of the graph neural network (GNN) deep learning model and its variant (e.g., graph transformer), it has become the principal way of quantitative structure-activity relationship-based modeling for its high flexibility in feature extraction and decision rule generation. Despite all these progresses, the expressiveness (the ability of a program to identify non-isomorphic graph structures) of the GNN model is bounded by the WL isomorphism test, and a suitable thresholding scheme that relates directly to the sensitivity and credibility of a model is still an open question.
In this research, we further improved the expressiveness of the GNN model by introducing the substructure-aware bias by the graph subgraph transformer network model. Moreover, to propose the most appropriate thresholding scheme, a comprehensive comparison of the thresholding schemes was conducted.
Based on these improvements, the best model attains performance with 90.4% precision, 90.4% recall, and 90.5% F1-score with a dual-threshold scheme (active:
; non-active:
). The improved pipeline (graph subgraph transformer network model and thresholding scheme) also shows its advantages in terms of the activity cliff problem and model interpretability. |
---|---|
ISSN: | 1664-042X 1664-042X |
DOI: | 10.3389/fphys.2023.1156286 |