Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment
Abstract Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artifacts. In this article, we introduce Broccoli, a us...
Published in: | Molecular biology and evolution 2020-11, Vol.37 (11), p.3389-3396 |
---|---|
Main Authors: | Derelle, Romain; Philippe, Hervé; Colbourne, John K |
Format: | Article |
Language: | English |
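Subjects: | Genomics - methods; Life Sciences; Mutant Chimeric Proteins; Phylogeny; Quantitative Methods; Software |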
cited_by | cdi_FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03 |
---|---|
cites | cdi_FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03 |
container_end_page | 3396 |
container_issue | 11 |
container_start_page | 3389 |
container_title | Molecular biology and evolution |
container_volume | 37 |
creator | Derelle, Romain; Philippe, Hervé; Colbourne, John K |
description | Abstract
Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artifacts. In this article, we introduce Broccoli, a user-friendly pipeline designed to infer, with high precision, orthologous groups, and pairs of proteins using a phylogeny-based approach. Briefly, Broccoli performs ultrafast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm. Broccoli is also able to detect chimeric proteins resulting from gene-fusion events and to assign these proteins to the corresponding orthologous groups. Tested on two benchmark data sets, Broccoli outperforms current orthology pipelines. In addition, Broccoli is scalable, with runtimes similar to those of recent distance-based pipelines. Given its high level of performance and efficiency, this new pipeline represents a suitable choice for comparative genomic studies. Broccoli is freely available at https://github.com/rderelle/Broccoli. |
doi_str_mv | 10.1093/molbev/msaa159 |
format | article |
fullrecord | <record><control><sourceid>proquest_hal_p</sourceid><recordid>TN_cdi_hal_primary_oai_HAL_hal_03100139v1</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><oup_id>10.1093/molbev/msaa159</oup_id><sourcerecordid>2419087359</sourcerecordid><originalsourceid>FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03</originalsourceid><addsrcrecordid>eNqFkDFPwzAQRi0EoqWwMqKMMKS14yR22EoFFKlqGWC2HOfSGpK42ElR_j2pUsrIdKfTu6e7D6FrgscEJ3RSmiKF3aR0UpIoOUFDElHmE0aSUzTErOtDTPkAXTj3gTEJwzg-RwMaxDjgnA_R8sEapUyh772ZKVNd6WrtvW7awqyhglorT1aZt4T629hPb1rJonXgvNxYb2Xrjem41ps6p9dVCVV9ic5yWTi4OtQRen96fJvN_cXq-WU2XfiqO6f2ZRDTQEUkyzMKeYYVpyHllEWgcIgZiVWAeQRhxmSGwzRIeMQSBnFEAwY0xXSE7nrvRhZia3UpbSuM1GI-XYj9DFPSvUuTHenY257dWvPVgKtFqZ2CopAVmMaJICQJ5oxGSYeOe1RZ45yF_OgmWOzzFn3e4pB3t3BzcDdpCdkR_w3471DTbP-T_QCkXYr6</addsrcrecordid><sourcetype>Open Access Repository</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2419087359</pqid></control><display><type>article</type><title>Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment</title><source>PubMed (Medline)</source><source>Open Access: Oxford University Press Open Journals</source><source>Full-Text Journals in Chemistry (Open access)</source><creator>Derelle, Romain ; Philippe, Hervé ; Colbourne, John K</creator><contributor>Falush, Daniel</contributor><creatorcontrib>Derelle, Romain ; Philippe, Hervé ; Colbourne, John K ; Falush, Daniel</creatorcontrib><description>Abstract
Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artifacts. In this article, we introduce Broccoli, a user-friendly pipeline designed to infer, with high precision, orthologous groups, and pairs of proteins using a phylogeny-based approach. Briefly, Broccoli performs ultrafast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm. Broccoli is also able to detect chimeric proteins resulting from gene-fusion events and to assign these proteins to the corresponding orthologous groups. Tested on two benchmark data sets, Broccoli outperforms current orthology pipelines. In addition, Broccoli is scalable, with runtimes similar to those of recent distance-based pipelines. Given its high level of performance and efficiency, this new pipeline represents a suitable choice for comparative genomic studies. Broccoli is freely available at https://github.com/rderelle/Broccoli.</description><identifier>ISSN: 0737-4038</identifier><identifier>EISSN: 1537-1719</identifier><identifier>DOI: 10.1093/molbev/msaa159</identifier><identifier>PMID: 32602888</identifier><language>eng</language><publisher>United States: Oxford University Press</publisher><subject>Genomics - methods ; Life Sciences ; Mutant Chimeric Proteins ; Phylogeny ; Quantitative Methods ; Software</subject><ispartof>Molecular biology and evolution, 2020-11, Vol.37 (11), p.3389-3396</ispartof><rights>The Author(s) 2020. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. 2020</rights><rights>The Author(s) 2020. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.</rights><rights>Distributed under a Creative Commons Attribution 4.0 International License</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03</citedby><cites>FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03</cites><orcidid>0000-0002-1335-8015</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>230,314,780,784,885,1603,27923,27924</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32602888$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink><backlink>$$Uhttps://hal.science/hal-03100139$$DView record in HAL$$Hfree_for_read</backlink></links><search><contributor>Falush, Daniel</contributor><creatorcontrib>Derelle, Romain</creatorcontrib><creatorcontrib>Philippe, Hervé</creatorcontrib><creatorcontrib>Colbourne, John K</creatorcontrib><title>Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment</title><title>Molecular biology and evolution</title><addtitle>Mol Biol Evol</addtitle><description>Abstract
Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artifacts. In this article, we introduce Broccoli, a user-friendly pipeline designed to infer, with high precision, orthologous groups, and pairs of proteins using a phylogeny-based approach. Briefly, Broccoli performs ultrafast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm. Broccoli is also able to detect chimeric proteins resulting from gene-fusion events and to assign these proteins to the corresponding orthologous groups. Tested on two benchmark data sets, Broccoli outperforms current orthology pipelines. In addition, Broccoli is scalable, with runtimes similar to those of recent distance-based pipelines. Given its high level of performance and efficiency, this new pipeline represents a suitable choice for comparative genomic studies. 
Broccoli is freely available at https://github.com/rderelle/Broccoli.</description><subject>Genomics - methods</subject><subject>Life Sciences</subject><subject>Mutant Chimeric Proteins</subject><subject>Phylogeny</subject><subject>Quantitative Methods</subject><subject>Software</subject><issn>0737-4038</issn><issn>1537-1719</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>TOX</sourceid><recordid>eNqFkDFPwzAQRi0EoqWwMqKMMKS14yR22EoFFKlqGWC2HOfSGpK42ElR_j2pUsrIdKfTu6e7D6FrgscEJ3RSmiKF3aR0UpIoOUFDElHmE0aSUzTErOtDTPkAXTj3gTEJwzg-RwMaxDjgnA_R8sEapUyh772ZKVNd6WrtvW7awqyhglorT1aZt4T629hPb1rJonXgvNxYb2Xrjem41ps6p9dVCVV9ic5yWTi4OtQRen96fJvN_cXq-WU2XfiqO6f2ZRDTQEUkyzMKeYYVpyHllEWgcIgZiVWAeQRhxmSGwzRIeMQSBnFEAwY0xXSE7nrvRhZia3UpbSuM1GI-XYj9DFPSvUuTHenY257dWvPVgKtFqZ2CopAVmMaJICQJ5oxGSYeOe1RZ45yF_OgmWOzzFn3e4pB3t3BzcDdpCdkR_w3471DTbP-T_QCkXYr6</recordid><startdate>20201101</startdate><enddate>20201101</enddate><creator>Derelle, Romain</creator><creator>Philippe, Hervé</creator><creator>Colbourne, John K</creator><general>Oxford University Press</general><general>Oxford University Press (OUP)</general><scope>TOX</scope><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope><scope>1XC</scope><scope>VOOES</scope><orcidid>https://orcid.org/0000-0002-1335-8015</orcidid></search><sort><creationdate>20201101</creationdate><title>Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment</title><author>Derelle, Romain ; Philippe, Hervé ; Colbourne, John K</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Genomics - 
methods</topic><topic>Life Sciences</topic><topic>Mutant Chimeric Proteins</topic><topic>Phylogeny</topic><topic>Quantitative Methods</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Derelle, Romain</creatorcontrib><creatorcontrib>Philippe, Hervé</creatorcontrib><creatorcontrib>Colbourne, John K</creatorcontrib><collection>Open Access: Oxford University Press Open Journals</collection><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><collection>Hyper Article en Ligne (HAL)</collection><collection>Hyper Article en Ligne (HAL) (Open Access)</collection><jtitle>Molecular biology and evolution</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Derelle, Romain</au><au>Philippe, Hervé</au><au>Colbourne, John K</au><au>Falush, Daniel</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment</atitle><jtitle>Molecular biology and evolution</jtitle><addtitle>Mol Biol Evol</addtitle><date>2020-11-01</date><risdate>2020</risdate><volume>37</volume><issue>11</issue><spage>3389</spage><epage>3396</epage><pages>3389-3396</pages><issn>0737-4038</issn><eissn>1537-1719</eissn><abstract>Abstract
Orthology assignment is a key step of comparative genomic studies, for which many bioinformatic tools have been developed. However, all gene clustering pipelines are based on the analysis of protein distances, which are subject to many artifacts. In this article, we introduce Broccoli, a user-friendly pipeline designed to infer, with high precision, orthologous groups, and pairs of proteins using a phylogeny-based approach. Briefly, Broccoli performs ultrafast phylogenetic analyses on most proteins and builds a network of orthologous relationships. Orthologous groups are then identified from the network using a parameter-free machine learning algorithm. Broccoli is also able to detect chimeric proteins resulting from gene-fusion events and to assign these proteins to the corresponding orthologous groups. Tested on two benchmark data sets, Broccoli outperforms current orthology pipelines. In addition, Broccoli is scalable, with runtimes similar to those of recent distance-based pipelines. Given its high level of performance and efficiency, this new pipeline represents a suitable choice for comparative genomic studies. Broccoli is freely available at https://github.com/rderelle/Broccoli.</abstract><cop>United States</cop><pub>Oxford University Press</pub><pmid>32602888</pmid><doi>10.1093/molbev/msaa159</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0002-1335-8015</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0737-4038 |
ispartof | Molecular biology and evolution, 2020-11, Vol.37 (11), p.3389-3396 |
issn | 0737-4038; 1537-1719 |
language | eng |
recordid | cdi_hal_primary_oai_HAL_hal_03100139v1 |
source | PubMed (Medline); Open Access: Oxford University Press Open Journals; Full-Text Journals in Chemistry (Open access) |
subjects | Genomics - methods; Life Sciences; Mutant Chimeric Proteins; Phylogeny; Quantitative Methods; Software |
title | Broccoli: Combining Phylogenetic and Network Analyses for Orthology Assignment |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-12T00%3A23%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_hal_p&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Broccoli:%20Combining%20Phylogenetic%20and%20Network%20Analyses%20for%20Orthology%20Assignment&rft.jtitle=Molecular%20biology%20and%20evolution&rft.au=Derelle,%20Romain&rft.date=2020-11-01&rft.volume=37&rft.issue=11&rft.spage=3389&rft.epage=3396&rft.pages=3389-3396&rft.issn=0737-4038&rft.eissn=1537-1719&rft_id=info:doi/10.1093/molbev/msaa159&rft_dat=%3Cproquest_hal_p%3E2419087359%3C/proquest_hal_p%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c403t-a2632c51dfd3efd0c83438375ec040716c2085e4d7ad04b2985797e65327e3b03%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2419087359&rft_id=info:pmid/32602888&rft_oup_id=10.1093/molbev/msaa159&rfr_iscdi=true |
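
The abstract in this record says that orthologous groups are identified from a network of orthologous relationships using a parameter-free machine learning algorithm, but it does not name the algorithm. The sketch below is an illustration only, not Broccoli's code: it assumes label propagation as one example of a parameter-free way to group the nodes of a weighted network. The function name `label_propagation`, the protein identifiers, and the edge weights are all hypothetical.

```python
from collections import defaultdict


def label_propagation(edges, max_rounds=100):
    """Group nodes of an undirected weighted graph by label propagation.

    edges: iterable of (node_a, node_b, weight) tuples.
    Returns a dict mapping each node to a group label.
    Illustrative assumption only; not taken from the Broccoli source.
    """
    # Build a symmetric adjacency map: node -> {neighbour: weight}.
    neighbours = defaultdict(dict)
    for a, b, w in edges:
        neighbours[a][b] = w
        neighbours[b][a] = w

    # Every node starts in its own group, labelled by its own name.
    labels = {node: node for node in neighbours}

    for _ in range(max_rounds):
        changed = False
        # Visit nodes in sorted order so the result is reproducible.
        for node in sorted(neighbours):
            # Sum the edge weights pointing at each neighbouring label.
            scores = defaultdict(float)
            for other, w in neighbours[node].items():
                scores[labels[other]] += w
            # Adopt the heaviest label; break ties by the smallest label.
            best = min(scores, key=lambda lab: (-scores[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    return dict(labels)


# Hypothetical network: two tightly linked triplets joined by one weak edge.
edges = [
    ("A1", "B1", 1.0), ("B1", "C1", 1.0), ("A1", "C1", 1.0),
    ("A2", "B2", 1.0), ("B2", "C2", 1.0), ("A2", "C2", 1.0),
    ("C1", "A2", 0.1),
]
groups = label_propagation(edges)
```

Nodes are updated one at a time in a fixed order rather than all at once, which avoids the label oscillation that simultaneous updates can cause and keeps the result reproducible. Typical implementations visit nodes in random order instead.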