How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain

The EU REACH legislation, the OECD and US EPA official guidance documents, as well as the 3Rs principle (replacement, reduction, refinement of animal testing), all advocate the necessity of developing comprehensive computational methods (e.g. quantitative structure-activity relationship, read-acros...

Bibliographic Details
Published in:	Environmental science. Nano 2018, Vol.5 (2), p.408-421
Main Author:	Gajewicz, A
Format:	Article
Language:	English
Subjects:	Computation; Computer applications; Distance; Learning algorithms; Legislation; Machine learning; Modelling; Nanoparticles; Prediction models; Probability theory; Structure-activity relationships; Toxic hazards; Toxicology

cited_by	cdi_FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3
cites	cdi_FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3
container_end_page	421
container_issue	2
container_start_page	408
container_title	Environmental science. Nano
container_volume	5
creator	Gajewicz, A
description	The EU REACH legislation, the OECD and US EPA official guidance documents, as well as the 3Rs principle (replacement, reduction, refinement of animal testing), all advocate the necessity of developing comprehensive computational methods (e.g. quantitative structure-activity relationship, read-across) that would enable the predictive modeling of both chemical (e.g. nanoparticle) specific functionalities and their hazards. However, since computational (nano)toxicology continues to 'learn on the fly' and relies on the use of a vast array of innovative machine-learning algorithms, serious concerns about the reliability of in silico predictions are raised. This study aimed to give an answer to the following question: how to judge whether QSAR/read-across predictions are reliable. Here, an effective approach for graphical assessment of the limits of a model's reliable predictions (so-called applicability domain, AD) was introduced. The probability-oriented distance-based approach (AD ProbDist) was proposed as a robust and automatic method for defining the interpolation space where true and reliable predictions can be expected. Its usefulness was confirmed by using four nano-QSAR/read-across models recently reported in the literature. The results of the study showed that the AD ProbDist approach is more restrictive in terms of the chemical space that falls in the AD of a model than the range, geometrical, distance and leverage approaches. The advantages of the proposed AD ProbDist approach include (but are not limited to) the fact that it works with relatively small datasets and enables the identification of (un)reliable predictions for newly screened chemicals without experimental data. Further, to facilitate the use of the AD ProbDist approach, this study provides the developed in-house R codes.
Probability-oriented distance-based approach (AD ProbDist) for determining the nano-QSAR/read-across model's applicability domain where true and reliable predictions can be expected.
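For orientation, the leverage approach that the abstract names as one of the comparison methods can be sketched as below. This is not the AD ProbDist method and not the author's in-house R code; it is a minimal illustration that assumes the commonly used warning leverage h* = 3(p + 1)/n for a model with p descriptors and n training compounds. The function name `leverage_domain` and the toy descriptor values are hypothetical.

```python
import numpy as np


def leverage_domain(X_train, X_query):
    """Return leverages, the warning threshold, and an inside-domain mask.

    Assumption: the domain boundary is the conventional warning leverage
    h* = 3 * (p + 1) / n; other cut-offs are also used in practice.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    n, p = X_train.shape

    # Add an intercept column so the leverages correspond to a linear
    # model with a constant term.
    A = np.column_stack([np.ones(n), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])

    # (A^T A)^-1; the pseudo-inverse tolerates collinear descriptors.
    core = np.linalg.pinv(A.T @ A)

    # h_i = q_i^T (A^T A)^-1 q_i for every query compound.
    h = np.einsum("ij,jk,ik->i", Q, core, Q)

    h_star = 3.0 * (p + 1) / n
    return h, h_star, h <= h_star


if __name__ == "__main__":
    # Toy one-descriptor training set (hypothetical values).
    train = [[0], [1], [2], [3], [4], [5]]
    query = [[2.5], [20]]
    h, h_star, inside = leverage_domain(train, query)
    for value, ok in zip(h, inside):
        print(f"h = {value:.3f}, threshold = {h_star:.3f}, inside AD: {ok}")
```

A query compound whose leverage exceeds the threshold lies far from the centre of the training descriptors, so its prediction would be an extrapolation. According to the abstract, AD ProbDist draws a tighter boundary than this kind of check.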
doi_str_mv	10.1039/c7en00774d
format	article
fullrecord	<record><control><sourceid>proquest_rsc_p</sourceid><recordid>TN_cdi_rsc_primary_c7en00774d</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2010870881</sourcerecordid><originalsourceid>FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3</originalsourceid><addsrcrecordid>eNpFkd9LwzAQx4soOOZefBcCPghC3SVpm9S3MacTRPHXc0mTdM3ompq0jr36l9ttMp_u4D7cfb_fC4JzDDcYaDqWTNcAjEXqKBgQiHHIcYKPD31MT4OR90sAwJjENGGD4Gdu16i1aNmphUbrUrelduj1ffI2dlqoUEhnvUeN08rI1tjaIylqlGvUus63Wt0igWr7rSskmsZZIUtUWIe0b0VeGV-aetETK6t0deW3TGWkyE1l2g1SdiVMfRacFKLyevRXh8Hn_exjOg-fXh4ep5OnUFJgbRgDJgmVlFOS0iSiMiWURznnihGllQZMNU84SF6klHEaR0AZ64cJIzmPBB0Gl_u9vcyvrheYLW3n6v5kRgADZ8A57qnrPbUz7nSRNc6shNtkGLJtzNmUzZ53Md_18MUedl4euP830F-3wniZ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2010870881</pqid></control><display><type>article</type><title>How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain</title><source>Royal Society of Chemistry:Jisc Collections:Royal Society of Chemistry Read and Publish 2022-2024 (reading list)</source><creator>Gajewicz, A</creator><creatorcontrib>Gajewicz, A</creatorcontrib><description>The EU REACH legislation, the OECD and US EPA official guidance documents, as well as the 3Rs principle (replacement, reduction, refinement of animal testing), all advocate the necessity of developing comprehensive computational methods ( e.g. quantitative structure-activity relationship, read-across) that would enable the predictive modeling of both chemical ( e.g. nanoparticle) specific functionalities and their hazards. However, since computational (nano)toxicology continues to ' learn on the fly ' and relies on the use of a vast array of innovative machine-learning algorithms, serious concerns about the reliability of in silico predictions are raised. 
This study aimed to give an answer to the following question: how to judge whether QSAR/read-across predictions are reliable. Here, an effective approach for graphical assessment of the limits of a model's reliable predictions (so-called applicability domain, AD) was introduced. The probability-oriented distance-based approach (AD ProbDist ) was proposed as a robust and automatic method for defining the interpolation space where true and reliable predictions can be expected. Its usefulness was confirmed by using four nano-QSAR/read-across models recently reported in the literature. The results of the study showed that the AD ProbDist approach is more restrictive in terms of the chemical space that falls in the AD of a model than the range, geometrical, distance and leverage approaches. The advantages of the proposed AD ProbDist approach include (but are not limited to) the fact that it works with relatively small datasets and enables the identification of (un)reliable predictions for newly screened chemicals without experimental data. Further, to facilitate the use of the AD ProbDist approach, this study provides the developed in-house R -codes. Probability-oriented distance-based approach (AD ProbDist ) for determining the nano-QSAR/read-across model's applicability domain where true and reliable predictions can be expected.</description><identifier>ISSN: 2051-8153</identifier><identifier>EISSN: 2051-8161</identifier><identifier>DOI: 10.1039/c7en00774d</identifier><language>eng</language><publisher>Cambridge: Royal Society of Chemistry</publisher><subject>Computation ; Computer applications ; Distance ; Learning algorithms ; Legislation ; Machine learning ; Modelling ; Nanoparticles ; Prediction models ; Probability theory ; Structure-activity relationships ; Toxic hazards ; Toxicology</subject><ispartof>Environmental science. 
Nano, 2018, Vol.5 (2), p.48-421</ispartof><rights>Copyright Royal Society of Chemistry 2018</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3</citedby><cites>FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3</cites><orcidid>0000-0001-7702-210X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,4010,27900,27901,27902</link.rule.ids></links><search><creatorcontrib>Gajewicz, A</creatorcontrib><title>How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain</title><title>Environmental science. Nano</title><description>The EU REACH legislation, the OECD and US EPA official guidance documents, as well as the 3Rs principle (replacement, reduction, refinement of animal testing), all advocate the necessity of developing comprehensive computational methods ( e.g. quantitative structure-activity relationship, read-across) that would enable the predictive modeling of both chemical ( e.g. nanoparticle) specific functionalities and their hazards. However, since computational (nano)toxicology continues to ' learn on the fly ' and relies on the use of a vast array of innovative machine-learning algorithms, serious concerns about the reliability of in silico predictions are raised. This study aimed to give an answer to the following question: how to judge whether QSAR/read-across predictions are reliable. Here, an effective approach for graphical assessment of the limits of a model's reliable predictions (so-called applicability domain, AD) was introduced. 
The probability-oriented distance-based approach (AD ProbDist ) was proposed as a robust and automatic method for defining the interpolation space where true and reliable predictions can be expected. Its usefulness was confirmed by using four nano-QSAR/read-across models recently reported in the literature. The results of the study showed that the AD ProbDist approach is more restrictive in terms of the chemical space that falls in the AD of a model than the range, geometrical, distance and leverage approaches. The advantages of the proposed AD ProbDist approach include (but are not limited to) the fact that it works with relatively small datasets and enables the identification of (un)reliable predictions for newly screened chemicals without experimental data. Further, to facilitate the use of the AD ProbDist approach, this study provides the developed in-house R -codes. Probability-oriented distance-based approach (AD ProbDist ) for determining the nano-QSAR/read-across model's applicability domain where true and reliable predictions can be expected.</description><subject>Computation</subject><subject>Computer applications</subject><subject>Distance</subject><subject>Learning algorithms</subject><subject>Legislation</subject><subject>Machine learning</subject><subject>Modelling</subject><subject>Nanoparticles</subject><subject>Prediction models</subject><subject>Probability theory</subject><subject>Structure-activity relationships</subject><subject>Toxic 
hazards</subject><subject>Toxicology</subject><issn>2051-8153</issn><issn>2051-8161</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNpFkd9LwzAQx4soOOZefBcCPghC3SVpm9S3MacTRPHXc0mTdM3ompq0jr36l9ttMp_u4D7cfb_fC4JzDDcYaDqWTNcAjEXqKBgQiHHIcYKPD31MT4OR90sAwJjENGGD4Gdu16i1aNmphUbrUrelduj1ffI2dlqoUEhnvUeN08rI1tjaIylqlGvUus63Wt0igWr7rSskmsZZIUtUWIe0b0VeGV-aetETK6t0deW3TGWkyE1l2g1SdiVMfRacFKLyevRXh8Hn_exjOg-fXh4ep5OnUFJgbRgDJgmVlFOS0iSiMiWURznnihGllQZMNU84SF6klHEaR0AZ64cJIzmPBB0Gl_u9vcyvrheYLW3n6v5kRgADZ8A57qnrPbUz7nSRNc6shNtkGLJtzNmUzZ53Md_18MUedl4euP830F-3wniZ</recordid><startdate>2018</startdate><enddate>2018</enddate><creator>Gajewicz, A</creator><general>Royal Society of Chemistry</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7QH</scope><scope>7ST</scope><scope>7UA</scope><scope>C1K</scope><scope>F1W</scope><scope>H97</scope><scope>L.G</scope><scope>SOI</scope><orcidid>https://orcid.org/0000-0001-7702-210X</orcidid></search><sort><creationdate>2018</creationdate><title>How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain</title><author>Gajewicz, A</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Computation</topic><topic>Computer applications</topic><topic>Distance</topic><topic>Learning algorithms</topic><topic>Legislation</topic><topic>Machine learning</topic><topic>Modelling</topic><topic>Nanoparticles</topic><topic>Prediction models</topic><topic>Probability theory</topic><topic>Structure-activity relationships</topic><topic>Toxic 
hazards</topic><topic>Toxicology</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Gajewicz, A</creatorcontrib><collection>CrossRef</collection><collection>Aqualine</collection><collection>Environment Abstracts</collection><collection>Water Resources Abstracts</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ASFA: Aquatic Sciences and Fisheries Abstracts</collection><collection>Aquatic Science & Fisheries Abstracts (ASFA) 3: Aquatic Pollution & Environmental Quality</collection><collection>Aquatic Science & Fisheries Abstracts (ASFA) Professional</collection><collection>Environment Abstracts</collection><jtitle>Environmental science. Nano</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Gajewicz, A</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain</atitle><jtitle>Environmental science. Nano</jtitle><date>2018</date><risdate>2018</risdate><volume>5</volume><issue>2</issue><spage>48</spage><epage>421</epage><pages>48-421</pages><issn>2051-8153</issn><eissn>2051-8161</eissn><abstract>The EU REACH legislation, the OECD and US EPA official guidance documents, as well as the 3Rs principle (replacement, reduction, refinement of animal testing), all advocate the necessity of developing comprehensive computational methods ( e.g. quantitative structure-activity relationship, read-across) that would enable the predictive modeling of both chemical ( e.g. nanoparticle) specific functionalities and their hazards. However, since computational (nano)toxicology continues to ' learn on the fly ' and relies on the use of a vast array of innovative machine-learning algorithms, serious concerns about the reliability of in silico predictions are raised. 
This study aimed to give an answer to the following question: how to judge whether QSAR/read-across predictions are reliable. Here, an effective approach for graphical assessment of the limits of a model's reliable predictions (so-called applicability domain, AD) was introduced. The probability-oriented distance-based approach (AD ProbDist ) was proposed as a robust and automatic method for defining the interpolation space where true and reliable predictions can be expected. Its usefulness was confirmed by using four nano-QSAR/read-across models recently reported in the literature. The results of the study showed that the AD ProbDist approach is more restrictive in terms of the chemical space that falls in the AD of a model than the range, geometrical, distance and leverage approaches. The advantages of the proposed AD ProbDist approach include (but are not limited to) the fact that it works with relatively small datasets and enables the identification of (un)reliable predictions for newly screened chemicals without experimental data. Further, to facilitate the use of the AD ProbDist approach, this study provides the developed in-house R -codes. Probability-oriented distance-based approach (AD ProbDist ) for determining the nano-QSAR/read-across model's applicability domain where true and reliable predictions can be expected.</abstract><cop>Cambridge</cop><pub>Royal Society of Chemistry</pub><doi>10.1039/c7en00774d</doi><tpages>14</tpages><orcidid>https://orcid.org/0000-0001-7702-210X</orcidid></addata></record>
fulltext	fulltext
identifier	ISSN: 2051-8153
ispartof	Environmental science. Nano, 2018, Vol.5 (2), p.408-421
issn	2051-8153 2051-8161
language	eng
recordid	cdi_rsc_primary_c7en00774d
source	Royal Society of Chemistry:Jisc Collections:Royal Society of Chemistry Read and Publish 2022-2024 (reading list)
subjects	Computation; Computer applications; Distance; Learning algorithms; Legislation; Machine learning; Modelling; Nanoparticles; Prediction models; Probability theory; Structure-activity relationships; Toxic hazards; Toxicology
title	How to judge whether QSAR/read-across predictions can be trusted: a novel approach for establishing a model's applicability domain
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T14%3A02%3A38IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_rsc_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=How%20to%20judge%20whether%20QSAR/read-across%20predictions%20can%20be%20trusted:%20a%20novel%20approach%20for%20establishing%20a%20model's%20applicability%20domain&rft.jtitle=Environmental%20science.%20Nano&rft.au=Gajewicz,%20A&rft.date=2018&rft.volume=5&rft.issue=2&rft.spage=48&rft.epage=421&rft.pages=48-421&rft.issn=2051-8153&rft.eissn=2051-8161&rft_id=info:doi/10.1039/c7en00774d&rft_dat=%3Cproquest_rsc_p%3E2010870881%3C/proquest_rsc_p%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c307t-501263c383293643c92384b88d72dede013e8680c8f93783540377d72672b84a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2010870881&rft_id=info:pmid/&rfr_iscdi=true