Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree

We propose a new classification method called the Optimal Bit String Tree (OBSTree) to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its...

Bibliographic Details
Published in: Technometrics 2013-05, Vol.55 (2), p.161-173
Main Authors: Zhang, Ke; Hughes-Oliver, Jacqueline M.; Young, S. Stanley
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053
cites cdi_FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053
container_end_page 173
container_issue 2
container_start_page 161
container_title Technometrics
container_volume 55
creator Zhang, Ke
Hughes-Oliver, Jacqueline M.
Young, S. Stanley
description We propose a new classification method called the Optimal Bit String Tree (OBSTree) to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its optimal chromosome form the splitting variable. A new stochastic searching scheme that contains a weighted sampling scheme, simulated annealing, and a trimming procedure optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase inhibitors show that OBSTree is advantageous in accurately and effectively identifying QSAR rules and finding different classes of active compounds. Details of the algorithm, SAS code, and simulated and real datasets are available online as supplementary materials.
doi_str_mv 10.1080/00401706.2012.760489
format article
fullrecord <record><control><sourceid>jstor_pubme</sourceid><recordid>TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3714111</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>24587125</jstor_id><sourcerecordid>24587125</sourcerecordid><originalsourceid>FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053</originalsourceid><addsrcrecordid>eNp9kU1v1DAQhi0EokvhHwCKxIVLtv6MvRfQtgVaqVIPbc-W40x2vUrixXZa7b_HUdryceBkeeZ5R-_Mi9B7gpcEK3yCMcdE4mpJMaFLWWGuVi_QgggmSyope4kWE1JOzBF6E-MOY8Kokq_REWVKKo7lAjXrwXSH6GLh2-LCbbbluethiM7nenGTwmjTGKBc2-TuXToUNzYADG7YFOcmmQgpFndx-qYtFNf75PqsO3Vp0k7l24y_Ra9a00V49_geo7vv327PLsqr6x-XZ-ur0grCU9ng2taUc0uqFRhgBAuzYpQxIYyBVtaM16AkcEEkbWrLKK1X0DY1M0xgLNgx-jLP3Y91D42FIQXT6X3IpsJBe-P0353BbfXG32smCSeE5AGfHwcE_3OEmHTvooWuMwP4MWqiaCUkrZTM6Kd_0J0fQz5appioMGVSsUzxmbLBxxigfTZDsJ5i1E8x6ilGPceYZR__XORZ9JRbBj7MwC4mH373uVCS0OkSX-e-G1ofevPgQ9foZA6dD20wg3VRs_9a-AX3arZw</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1356023783</pqid></control><display><type>article</type><title>Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree</title><source>JSTOR Archival Journals and Primary Sources Collection</source><source>Taylor and Francis Science and Technology Collection</source><creator>Zhang, Ke ; Hughes-Oliver, Jacqueline M. ; Young, S. Stanley</creator><creatorcontrib>Zhang, Ke ; Hughes-Oliver, Jacqueline M. ; Young, S. Stanley</creatorcontrib><description>We propose a new classification method called the Optimal Bit String Tree (OBSTree) to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its optimal chromosome form the splitting variable. 
A new stochastic searching scheme that contains a weighted sampling scheme, simulated annealing, and a trimming procedure optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase inhibitors show that OBSTree is advantageous in accurately and effectively identifying QSAR rules and finding different classes of active compounds. Details of the algorithm, SAS code, and simulated and real datasets are available online as supplementary materials.</description><identifier>ISSN: 0040-1706</identifier><identifier>EISSN: 1537-2723</identifier><identifier>DOI: 10.1080/00401706.2012.760489</identifier><identifier>PMID: 23878407</identifier><identifier>CODEN: TCMTA2</identifier><language>eng</language><publisher>United States: Taylor &amp; Francis Group</publisher><subject>Chemical compounds ; Chromosomes ; Classification ; Drug discovery ; High throughput screening ; Prediction ; QSAR ; Simulated annealing ; Simulation ; Stochastic models</subject><ispartof>Technometrics, 2013-05, Vol.55 (2), p.161-173</ispartof><rights>Copyright Taylor &amp; Francis Group, LLC 2013</rights><rights>2013 American Statistical Association and the American Society for Quality</rights><rights>Copyright Taylor &amp; Francis Ltd. 
2013</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053</citedby><cites>FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/24587125$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/24587125$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>230,314,780,784,885,27923,27924,58237,58470</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/23878407$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Ke</creatorcontrib><creatorcontrib>Hughes-Oliver, Jacqueline M.</creatorcontrib><creatorcontrib>Young, S. Stanley</creatorcontrib><title>Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree</title><title>Technometrics</title><addtitle>Technometrics</addtitle><description>We propose a new classification method called the Optimal Bit String Tree (OBSTree) to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. A descriptor set and its optimal chromosome form the splitting variable. A new stochastic searching scheme that contains a weighted sampling scheme, simulated annealing, and a trimming procedure optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase inhibitors show that OBSTree is advantageous in accurately and effectively identifying QSAR rules and finding different classes of active compounds. 
Details of the algorithm, SAS code, and simulated and real datasets are available online as supplementary materials.</description><subject>Chemical compounds</subject><subject>Chromosomes</subject><subject>Classification</subject><subject>Drug discovery</subject><subject>High throughput screening</subject><subject>Prediction</subject><subject>QSAR</subject><subject>Simulated annealing</subject><subject>Simulation</subject><subject>Stochastic models</subject><issn>0040-1706</issn><issn>1537-2723</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2013</creationdate><recordtype>article</recordtype><recordid>eNp9kU1v1DAQhi0EokvhHwCKxIVLtv6MvRfQtgVaqVIPbc-W40x2vUrixXZa7b_HUdryceBkeeZ5R-_Mi9B7gpcEK3yCMcdE4mpJMaFLWWGuVi_QgggmSyope4kWE1JOzBF6E-MOY8Kokq_REWVKKo7lAjXrwXSH6GLh2-LCbbbluethiM7nenGTwmjTGKBc2-TuXToUNzYADG7YFOcmmQgpFndx-qYtFNf75PqsO3Vp0k7l24y_Ra9a00V49_geo7vv327PLsqr6x-XZ-ur0grCU9ng2taUc0uqFRhgBAuzYpQxIYyBVtaM16AkcEEkbWrLKK1X0DY1M0xgLNgx-jLP3Y91D42FIQXT6X3IpsJBe-P0353BbfXG32smCSeE5AGfHwcE_3OEmHTvooWuMwP4MWqiaCUkrZTM6Kd_0J0fQz5appioMGVSsUzxmbLBxxigfTZDsJ5i1E8x6ilGPceYZR__XORZ9JRbBj7MwC4mH373uVCS0OkSX-e-G1ofevPgQ9foZA6dD20wg3VRs_9a-AX3arZw</recordid><startdate>20130501</startdate><enddate>20130501</enddate><creator>Zhang, Ke</creator><creator>Hughes-Oliver, Jacqueline M.</creator><creator>Young, S. Stanley</creator><general>Taylor &amp; Francis Group</general><general>American Society for Quality and the American Statistical Association</general><general>American Society for Quality</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>5PM</scope></search><sort><creationdate>20130501</creationdate><title>Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree</title><author>Zhang, Ke ; Hughes-Oliver, Jacqueline M. ; Young, S. 
Stanley</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2013</creationdate><topic>Chemical compounds</topic><topic>Chromosomes</topic><topic>Classification</topic><topic>Drug discovery</topic><topic>High throughput screening</topic><topic>Prediction</topic><topic>QSAR</topic><topic>Simulated annealing</topic><topic>Simulation</topic><topic>Stochastic models</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Ke</creatorcontrib><creatorcontrib>Hughes-Oliver, Jacqueline M.</creatorcontrib><creatorcontrib>Young, S. Stanley</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><jtitle>Technometrics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Ke</au><au>Hughes-Oliver, Jacqueline M.</au><au>Young, S. Stanley</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree</atitle><jtitle>Technometrics</jtitle><addtitle>Technometrics</addtitle><date>2013-05-01</date><risdate>2013</risdate><volume>55</volume><issue>2</issue><spage>161</spage><epage>173</epage><pages>161-173</pages><issn>0040-1706</issn><eissn>1537-2723</eissn><coden>TCMTA2</coden><abstract>We propose a new classification method called the Optimal Bit String Tree (OBSTree) to identify quantitative structure-activity relationships (QSARs). The method introduces the concept of a chromosome to describe the presence/absence context of a combination of descriptors. 
A descriptor set and its optimal chromosome form the splitting variable. A new stochastic searching scheme that contains a weighted sampling scheme, simulated annealing, and a trimming procedure optimizes the choice of splitting variable. Simulation studies and an application to screening monoamine oxidase inhibitors show that OBSTree is advantageous in accurately and effectively identifying QSAR rules and finding different classes of active compounds. Details of the algorithm, SAS code, and simulated and real datasets are available online as supplementary materials.</abstract><cop>United States</cop><pub>Taylor &amp; Francis Group</pub><pmid>23878407</pmid><doi>10.1080/00401706.2012.760489</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0040-1706
ispartof Technometrics, 2013-05, Vol.55 (2), p.161-173
issn 0040-1706
1537-2723
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_3714111
source JSTOR Archival Journals and Primary Sources Collection; Taylor and Francis Science and Technology Collection
subjects Chemical compounds
Chromosomes
Classification
Drug discovery
High throughput screening
Prediction
QSAR
Simulated annealing
Simulation
Stochastic models
title Analysis of High-Dimensional Structure-Activity Screening Datasets Using the Optimal Bit String Tree
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T06%3A06%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Analysis%20of%20High-Dimensional%20Structure-Activity%20Screening%20Datasets%20Using%20the%20Optimal%20Bit%20String%20Tree&rft.jtitle=Technometrics&rft.au=Zhang,%20Ke&rft.date=2013-05-01&rft.volume=55&rft.issue=2&rft.spage=161&rft.epage=173&rft.pages=161-173&rft.issn=0040-1706&rft.eissn=1537-2723&rft.coden=TCMTA2&rft_id=info:doi/10.1080/00401706.2012.760489&rft_dat=%3Cjstor_pubme%3E24587125%3C/jstor_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c514t-d0bcb244c169eae3105a9323355aaef7b34be87e45172dbc322b9efdb3a350053%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1356023783&rft_id=info:pmid/23878407&rft_jstor_id=24587125&rfr_iscdi=true