Best of both worlds: An expansion of the state of the art pKa model with data from three industrial partners

Bibliographic Details
Published in: Molecular Informatics 2024-10, Vol. 43 (10), p. e202400088
Main Authors: Fraczkiewicz, Robert, Quoc Nguyen, Huy, Wu, Newton, Kausch‐Busies, Nina, Grimbs, Sergio, Sommer, Kai, ter Laak, Antonius, Günther, Judith, Wagner, Björn, Reutlinger, Michael
Format: Article
Language: English
Description
Summary: In a unique collaboration between Simulations Plus and several industrial partners, we developed a new version, 11.0, of the previously published in silico pKa model, S+pKa, with considerably improved prediction accuracy. The model's training set was vastly expanded with large amounts of experimental data obtained from F. Hoffmann‐La Roche AG, Genentech Inc., and the Crop Science division of Bayer AG. The previous version, v7.0, of S+pKa was trained on data from public sources and the Pharmaceutical division of Bayer AG. The new model showed dramatic improvements in predictive accuracy when externally validated on the three new contributors' compound sets. Less expected was v11.0's improved prediction on new compounds developed at Bayer Pharma after v7.0 was released (2013–2023), even though no additional Bayer Pharma data were contributed to v11.0. We illustrate the chemical space covered by the chemistries encountered in the five domains, public and industrial, outline the model's construction, and discuss the factors contributing to its success. This work is a follow‐up to our previous “Best of Both Worlds” publication of 2015 in the Journal of Chemical Information and Modeling. That paper was met with great interest from the journal's readers, with 4771 views and 79 citations to date. In it, we described S+pKa, a novel predictive model of ionization constants built from two large combined data sets: one compiled from the scientific literature and one obtained from Bayer Pharmaceuticals AG. S+pKa has now been upgraded with three additional large data sets from F. Hoffmann‐La Roche, Genentech, and Bayer CropScience. The present work offers new insights into the chemical spaces covered by compounds from all participating companies and the public domain, and into their influence on the model's predictive accuracy.
ISSN:1868-1743
1868-1751
DOI:10.1002/minf.202400088