Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach

Machine learning (ML) models can, once trained, make reaction barrier predictions in seconds, which is orders of magnitude faster than quantum mechanical (QM) methods such as density functional theory (DFT). However, these ML models need to be trained on large datasets of typically thousands of expe...

Bibliographic Details
Published in:	Digital discovery 2023-08, Vol.2 (4), p.941-951
Main Authors:	Espley, Samuel G; Farrar, Elliot H. E; Buttar, David; Tomasi, Simone; Grayson, Matthew N
Format:	Article
Language:	English

cited_by	cdi_FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3
cites	cdi_FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3
container_end_page	951
container_issue	4
container_start_page	941
container_title	Digital discovery
container_volume	2
creator	Espley, Samuel G ; Farrar, Elliot H. E ; Buttar, David ; Tomasi, Simone ; Grayson, Matthew N
description	Machine learning (ML) models can, once trained, make reaction barrier predictions in seconds, which is orders of magnitude faster than quantum mechanical (QM) methods such as density functional theory (DFT). However, these ML models need to be trained on large datasets of typically thousands of expensive, high accuracy barriers and do not generalise well beyond the specific reaction for which they are trained. In this work, we demonstrate that transfer learning (TL) can be used to adapt pre-trained Diels-Alder barrier prediction neural networks (NNs) to make predictions for other pericyclic reactions using horizontal TL (hTL) and additionally, at higher levels of theory with diagonal TL (dTL). TL-derived predictions are possible with mean absolute errors (MAEs) below the accepted chemical accuracy threshold of 1 kcal mol⁻¹, a significant improvement on pre-TL prediction MAEs of >5 kcal mol⁻¹, and in extremely low data regimes, with as few as 33 and 39 new datapoints needed for hTL and dTL, respectively. Thus, hTL and dTL are powerful options for providing insight into reaction feasibility without the need for extensive high-throughput experimental or computational screening or large dataset generation for training bespoke ML models. Transfer learning (TL) is used to adapt existing neural networks to provide reaction barrier predictions for different reaction classes (horizontal TL) at higher levels of theory (diagonal TL) with tens of datapoints.
doi_str_mv	10.1039/d3dd00085k
format	article
fullrecord	<record><control><sourceid>rsc_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1039_D3DD00085K</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>d3dd00085k</sourcerecordid><originalsourceid>FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3</originalsourceid><addsrcrecordid>eNpNkM1LAzEUxIMoWGov3oWchdV8Z9ebtH5hxYuCt-XtJttGt8mSLIj-9UYr1tObYX7Mg0HomJIzSnh1brgxhJBSvu2hCVNcFqQqX_b_6UM0S-k1M0xrSrmaoP4B2rXzFvcWond-haOFdnTB4wZidDYm7Dzuwzs2MEJOV25j0wUGvA7RfQY_Qo_BG2wcrILPZozgU2fjrhKGIYb85wgddNAnO_u9U_R8ffU0vy2Wjzd388tl0bKyGgvdcSm1kZVQgummE0oKljUFARIYY6YUolFARUkNJ1pLzlQJoiulyq7hU3S67W1jSCnarh6i20D8qCmpv6eqF3yx-JnqPsMnWzim9o_bTcm_ACtkZbk</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach</title><source>Alma/SFX Local Collection</source><creator>Espley, Samuel G ; Farrar, Elliot H. E ; Buttar, David ; Tomasi, Simone ; Grayson, Matthew N</creator><creatorcontrib>Espley, Samuel G ; Farrar, Elliot H. E ; Buttar, David ; Tomasi, Simone ; Grayson, Matthew N</creatorcontrib><description>Machine learning (ML) models can, once trained, make reaction barrier predictions in seconds, which is orders of magnitude faster than quantum mechanical (QM) methods such as density functional theory (DFT). However, these ML models need to be trained on large datasets of typically thousands of expensive, high accuracy barriers and do not generalise well beyond the specific reaction for which they are trained. In this work, we demonstrate that transfer learning (TL) can be used to adapt pre-trained Diels-Alder barrier prediction neural networks (NNs) to make predictions for other pericyclic reactions using horizontal TL (hTL) and additionally, at higher levels of theory with diagonal TL (dTL). 
TL-derived predictions are possible with mean absolute errors (MAEs) below the accepted chemical accuracy threshold of 1 kcal mol −1 , a significant improvement on pre-TL prediction MAEs of >5 kcal mol −1 , and in extremely low data regimes, with as few as 33 and 39 new datapoints needed for hTL and dTL, respectively. Thus, hTL and dTL are powerful options for providing insight into reaction feasibility without the need for extensive high-throughput experimental or computational screening or large dataset generation for training bespoke ML models. Transfer learning (TL) is used to adapt existing neural networks to provide reaction barrier predictions for different reaction classes (horizontal TL) at higher levels of theory (diagonal TL) with tens of datapoints.</description><identifier>ISSN: 2635-098X</identifier><identifier>EISSN: 2635-098X</identifier><identifier>DOI: 10.1039/d3dd00085k</identifier><language>eng</language><ispartof>Digital discovery, 2023-08, Vol.2 (4), p.941-951</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3</citedby><cites>FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3</cites><orcidid>0000-0002-1135-9890 ; 0000-0002-9373-7639 ; 0000-0001-5466-023X ; 0000-0003-3350-2907 ; 0000-0003-2116-7929</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Espley, Samuel G</creatorcontrib><creatorcontrib>Farrar, Elliot H. 
E</creatorcontrib><creatorcontrib>Buttar, David</creatorcontrib><creatorcontrib>Tomasi, Simone</creatorcontrib><creatorcontrib>Grayson, Matthew N</creatorcontrib><title>Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach</title><title>Digital discovery</title><description>Machine learning (ML) models can, once trained, make reaction barrier predictions in seconds, which is orders of magnitude faster than quantum mechanical (QM) methods such as density functional theory (DFT). However, these ML models need to be trained on large datasets of typically thousands of expensive, high accuracy barriers and do not generalise well beyond the specific reaction for which they are trained. In this work, we demonstrate that transfer learning (TL) can be used to adapt pre-trained Diels-Alder barrier prediction neural networks (NNs) to make predictions for other pericyclic reactions using horizontal TL (hTL) and additionally, at higher levels of theory with diagonal TL (dTL). TL-derived predictions are possible with mean absolute errors (MAEs) below the accepted chemical accuracy threshold of 1 kcal mol −1 , a significant improvement on pre-TL prediction MAEs of >5 kcal mol −1 , and in extremely low data regimes, with as few as 33 and 39 new datapoints needed for hTL and dTL, respectively. Thus, hTL and dTL are powerful options for providing insight into reaction feasibility without the need for extensive high-throughput experimental or computational screening or large dataset generation for training bespoke ML models. 
Transfer learning (TL) is used to adapt existing neural networks to provide reaction barrier predictions for different reaction classes (horizontal TL) at higher levels of theory (diagonal TL) with tens of datapoints.</description><issn>2635-098X</issn><issn>2635-098X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2023</creationdate><recordtype>article</recordtype><recordid>eNpNkM1LAzEUxIMoWGov3oWchdV8Z9ebtH5hxYuCt-XtJttGt8mSLIj-9UYr1tObYX7Mg0HomJIzSnh1brgxhJBSvu2hCVNcFqQqX_b_6UM0S-k1M0xrSrmaoP4B2rXzFvcWond-haOFdnTB4wZidDYm7Dzuwzs2MEJOV25j0wUGvA7RfQY_Qo_BG2wcrILPZozgU2fjrhKGIYb85wgddNAnO_u9U_R8ffU0vy2Wjzd388tl0bKyGgvdcSm1kZVQgummE0oKljUFARIYY6YUolFARUkNJ1pLzlQJoiulyq7hU3S67W1jSCnarh6i20D8qCmpv6eqF3yx-JnqPsMnWzim9o_bTcm_ACtkZbk</recordid><startdate>20230808</startdate><enddate>20230808</enddate><creator>Espley, Samuel G</creator><creator>Farrar, Elliot H. E</creator><creator>Buttar, David</creator><creator>Tomasi, Simone</creator><creator>Grayson, Matthew N</creator><scope>AAYXX</scope><scope>CITATION</scope><orcidid>https://orcid.org/0000-0002-1135-9890</orcidid><orcidid>https://orcid.org/0000-0002-9373-7639</orcidid><orcidid>https://orcid.org/0000-0001-5466-023X</orcidid><orcidid>https://orcid.org/0000-0003-3350-2907</orcidid><orcidid>https://orcid.org/0000-0003-2116-7929</orcidid></search><sort><creationdate>20230808</creationdate><title>Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach</title><author>Espley, Samuel G ; Farrar, Elliot H. 
E ; Buttar, David ; Tomasi, Simone ; Grayson, Matthew N</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2023</creationdate><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Espley, Samuel G</creatorcontrib><creatorcontrib>Farrar, Elliot H. E</creatorcontrib><creatorcontrib>Buttar, David</creatorcontrib><creatorcontrib>Tomasi, Simone</creatorcontrib><creatorcontrib>Grayson, Matthew N</creatorcontrib><collection>CrossRef</collection><jtitle>Digital discovery</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Espley, Samuel G</au><au>Farrar, Elliot H. E</au><au>Buttar, David</au><au>Tomasi, Simone</au><au>Grayson, Matthew N</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach</atitle><jtitle>Digital discovery</jtitle><date>2023-08-08</date><risdate>2023</risdate><volume>2</volume><issue>4</issue><spage>941</spage><epage>951</epage><pages>941-951</pages><issn>2635-098X</issn><eissn>2635-098X</eissn><abstract>Machine learning (ML) models can, once trained, make reaction barrier predictions in seconds, which is orders of magnitude faster than quantum mechanical (QM) methods such as density functional theory (DFT). However, these ML models need to be trained on large datasets of typically thousands of expensive, high accuracy barriers and do not generalise well beyond the specific reaction for which they are trained. 
In this work, we demonstrate that transfer learning (TL) can be used to adapt pre-trained Diels-Alder barrier prediction neural networks (NNs) to make predictions for other pericyclic reactions using horizontal TL (hTL) and additionally, at higher levels of theory with diagonal TL (dTL). TL-derived predictions are possible with mean absolute errors (MAEs) below the accepted chemical accuracy threshold of 1 kcal mol −1 , a significant improvement on pre-TL prediction MAEs of >5 kcal mol −1 , and in extremely low data regimes, with as few as 33 and 39 new datapoints needed for hTL and dTL, respectively. Thus, hTL and dTL are powerful options for providing insight into reaction feasibility without the need for extensive high-throughput experimental or computational screening or large dataset generation for training bespoke ML models. Transfer learning (TL) is used to adapt existing neural networks to provide reaction barrier predictions for different reaction classes (horizontal TL) at higher levels of theory (diagonal TL) with tens of datapoints.</abstract><doi>10.1039/d3dd00085k</doi><tpages>11</tpages><orcidid>https://orcid.org/0000-0002-1135-9890</orcidid><orcidid>https://orcid.org/0000-0002-9373-7639</orcidid><orcidid>https://orcid.org/0000-0001-5466-023X</orcidid><orcidid>https://orcid.org/0000-0003-3350-2907</orcidid><orcidid>https://orcid.org/0000-0003-2116-7929</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2635-098X
ispartof	Digital discovery, 2023-08, Vol.2 (4), p.941-951
issn	2635-098X 2635-098X
language	eng
recordid	cdi_crossref_primary_10_1039_D3DD00085K
source	Alma/SFX Local Collection
title	Machine learning reaction barriers in low data regimes: a horizontal and diagonal transfer learning approach
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T12%3A52%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-rsc_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Machine%20learning%20reaction%20barriers%20in%20low%20data%20regimes:%20a%20horizontal%20and%20diagonal%20transfer%20learning%20approach&rft.jtitle=Digital%20discovery&rft.au=Espley,%20Samuel%20G&rft.date=2023-08-08&rft.volume=2&rft.issue=4&rft.spage=941&rft.epage=951&rft.pages=941-951&rft.issn=2635-098X&rft.eissn=2635-098X&rft_id=info:doi/10.1039/d3dd00085k&rft_dat=%3Crsc_cross%3Ed3dd00085k%3C/rsc_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c289t-7f3557d5946427bf465424641a4a5a222d844b6a1481d307753268a4f856077b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true