Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction

Abstract. Motivation: Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold...

Bibliographic Details
Published in: Bioinformatics 2018-04, Vol.34 (7), p.1132-1140
Main Authors: de Oliveira, Saulo H P; Law, Eleanor C; Shi, Jiye; Deane, Charlotte M
Format: Article
Language: English
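Subjects: Algorithms; Animals; Caspase 12 - chemistry; Caspase 12 - metabolism; Computational Biology - methods; Humans; Original Papers; Protein Conformation; Sequence Analysis, Protein - methods; Software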
cited_by cdi_FETCH-LOGICAL-c405t-c0eb2eb12425fbf3368e4f9f0d3ba452696606d54f18ca82f692977837c156303
cites cdi_FETCH-LOGICAL-c405t-c0eb2eb12425fbf3368e4f9f0d3ba452696606d54f18ca82f692977837c156303
container_end_page 1140
container_issue 7
container_start_page 1132
container_title Bioinformatics
container_volume 34
creator de Oliveira, Saulo H P
Law, Eleanor C
Shi, Jiye
Deane, Charlotte M
description Abstract. Motivation: Most current de novo structure prediction methods randomly sample protein conformations and thus require large amounts of computational resource. Here, we consider a sequential sampling strategy, building on ideas from recent experimental work which shows that many proteins fold cotranslationally. Results: We have investigated whether a pseudo-greedy search approach, which begins sequentially from one of the termini, can improve the performance and accuracy of de novo protein structure prediction. We observed that our sequential approach converges when fewer than 20 000 decoys have been produced, fewer than commonly expected. Using our software, SAINT2, we also compared the run time and quality of models produced in a sequential fashion against a standard, non-sequential approach. Sequential prediction produces an individual decoy 1.5-2.5 times faster than non-sequential prediction. When considering the quality of the best model, sequential prediction led to a better model being produced for 31 out of 41 soluble protein validation cases and for 18 out of 24 transmembrane protein cases. Correct models (TM-Score > 0.5) were produced for 29 of these cases by the sequential mode and for only 22 by the non-sequential mode. Our comparison reveals that a sequential search strategy can be used to drastically reduce computational time of de novo protein structure prediction and improve accuracy. Availability and implementation: Data are available for download from: http://opig.stats.ox.ac.uk/resources. SAINT2 is available for download from: https://github.com/sauloho/SAINT2. Supplementary information: Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btx722
format article
fullrecord sourceid: proquest_pubme
recordid: TN_cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6030820
sourcerecordid: 1964700981
sourcetype: Open Access Repository
contributor: Valencia, Alfonso
publisher: England: Oxford University Press
rights: The Author 2017. Published by Oxford University Press. 2017
identifier: EISSN: 1460-2059; EISSN: 1367-4811; DOI: 10.1093/bioinformatics/btx722; PMID: 29136098
date: 2018-04-01
tpages: 9
genre: article
ristype: JOUR
lds50: peer_reviewed
oa: free_for_read
linktopdf: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030820/pdf/
linktohtml: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6030820/
backlink: https://www.ncbi.nlm.nih.gov/pubmed/29136098 (View this record in MEDLINE/PubMed)
collection: Oxford University Press Open Access; Medline; MEDLINE; MEDLINE (Ovid); PubMed; CrossRef; MEDLINE - Academic; PubMed Central (Full Participant titles)
fulltext fulltext
identifier ISSN: 1367-4803
ispartof Bioinformatics, 2018-04, Vol.34 (7), p.1132-1140
issn 1367-4803
1460-2059
1367-4811
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_6030820
source Oxford University Press Open Access; PubMed Central
subjects Algorithms
Animals
Caspase 12 - chemistry
Caspase 12 - metabolism
Computational Biology - methods
Humans
Original Papers
Protein Conformation
Sequence Analysis, Protein - methods
Software
title Sequential search leads to faster, more efficient fragment-based de novo protein structure prediction
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-27T16%3A56%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sequential%20search%20leads%20to%20faster,%20more%20efficient%20fragment-based%20de%20novo%20protein%20structure%20prediction&rft.jtitle=Bioinformatics&rft.au=de%20Oliveira,%20Saulo%20H%20P&rft.date=2018-04-01&rft.volume=34&rft.issue=7&rft.spage=1132&rft.epage=1140&rft.pages=1132-1140&rft.issn=1367-4803&rft.eissn=1460-2059&rft_id=info:doi/10.1093/bioinformatics/btx722&rft_dat=%3Cproquest_pubme%3E1964700981%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c405t-c0eb2eb12425fbf3368e4f9f0d3ba452696606d54f18ca82f692977837c156303%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1964700981&rft_id=info:pmid/29136098&rft_oup_id=10.1093/bioinformatics/btx722&rfr_iscdi=true
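The url field above is an OpenURL (ctx_ver=Z39.88-2004) that carries the citation as percent-encoded key/value pairs. As a minimal sketch of how those pairs can be read back, the Python below parses a shortened copy of that URL with the standard library. The function name citation_fields and the choice of which keys to keep are illustrative assumptions, not part of the record or of any resolver's API.

```python
from urllib.parse import urlsplit, parse_qs

# Shortened copy of the record's OpenURL; only some of the original
# key/value pairs are kept here to make the example easy to read.
OPENURL = (
    "http://sfxeu10.hosted.exlibrisgroup.com/loughborough"
    "?ctx_ver=Z39.88-2004"
    "&rft.genre=article"
    "&rft.jtitle=Bioinformatics"
    "&rft.au=de%20Oliveira,%20Saulo%20H%20P"
    "&rft.date=2018-04-01"
    "&rft.volume=34&rft.issue=7&rft.spage=1132&rft.epage=1140"
    "&rft.issn=1367-4803"
    "&rft_id=info:doi/10.1093/bioinformatics/btx722"
    "&rft_id=info:pmid/29136098"
)


def citation_fields(openurl: str) -> dict:
    """Collect the rft.* fields, plus DOI and PMID, from an OpenURL.

    Hypothetical helper written for this example.
    """
    # parse_qs percent-decodes values and groups repeated keys into lists.
    query = parse_qs(urlsplit(openurl).query)
    # Keep the first value of every rft.* key, dropping the "rft." prefix.
    fields = {
        key[len("rft."):]: values[0]
        for key, values in query.items()
        if key.startswith("rft.")
    }
    # rft_id repeats: one entry per identifier scheme (info:doi/, info:pmid/).
    for ident in query.get("rft_id", []):
        if ident.startswith("info:doi/"):
            fields["doi"] = ident[len("info:doi/"):]
        elif ident.startswith("info:pmid/"):
            fields["pmid"] = ident[len("info:pmid/"):]
    return fields


fields = citation_fields(OPENURL)
print(fields["jtitle"], fields["volume"], fields["issue"], fields["spage"], fields["epage"])
```

The same function should work on the full URL in the url field, since it ignores any keys it does not recognise.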