Loading…

High-throughput discovery of chemical structure-polarity relationships combining automation and machine learning techniques

As an essential attribute of organic compounds, polarity has a profound influence on many molecular properties such as solubility and phase transition temperature. Thin layer chromatography (TLC) represents a commonly used technique for polarity measurement. However, current TLC analysis presents se...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org 2022-02
Main Authors: Xu, Hao, Lin, Jinglong, Liu, Qianyi, Chen, Yuntian, Zhang, Jianning, Yang, Yang, Young, Michael C, Xu, Yan, Zhang, Dongxiao, Mo, Fanyang
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:As an essential attribute of organic compounds, polarity has a profound influence on many molecular properties such as solubility and phase transition temperature. Thin layer chromatography (TLC) represents a commonly used technique for polarity measurement. However, current TLC analysis presents several problems, including the need for a large number of attempts to obtain suitable conditions, as well as irreproducibility due to non-standardization. Herein, we describe an automated experiment system for TLC analysis. This system is designed to conduct TLC analysis automatically, facilitating high-throughput experimentation by collecting large experimental data under standardized conditions. Using these datasets, machine learning (ML) methods are employed to construct surrogate models correlating organic compounds' structures and their polarity using retardation factor (Rf). The trained ML models are able to predict the Rf value curve of organic compounds with high accuracy. Furthermore, the constitutive relationship between the compound and its polarity can also be discovered through these modeling methods, and the underlying mechanism is rationalized through adsorption theories. The trained ML models not only reduce the need for empirical optimization currently required for TLC analysis, but also provide general guidelines for the selection of conditions, making TLC an easily accessible tool for the broad scientific community.
ISSN:2331-8422
DOI:10.48550/arxiv.2202.05962