Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data

The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multip...

Bibliographic Details
Published in: Systematic biology 2018-05, Vol.67 (3), p.439-457
Main Authors: Wen, Dingqiao; Nakhleh, Luay
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173
cites cdi_FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173
container_end_page 457
container_issue 3
container_start_page 439
container_title Systematic biology
container_volume 67
creator Wen, Dingqiao
Nakhleh, Luay
description The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only coestimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of “intermixture,” we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.
doi_str_mv 10.1093/sysbio/syx085
format article
fullrecord <record><control><sourceid>jstor_proqu</sourceid><recordid>TN_cdi_proquest_miscellaneous_1958545141</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><jstor_id>26581969</jstor_id><oup_id>10.1093/sysbio/syx085</oup_id><sourcerecordid>26581969</sourcerecordid><originalsourceid>FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173</originalsourceid><addsrcrecordid>eNqFkMFLwzAUh4Mobk6PHpUevVTzmiZNjjJ1ChOHTvBW0vR1dnTNbFJw_70dne7o6b0HH9-P9yPkHOg1UMVu3MZlpe3GN5X8gAyBJiKUTHwcbnfBQg48GZAT55aUAggOx2QQKSplTNWQzMYWnS9X2pf1InhFX5q20h6D2eemsgusS3SBrvNggjUG8wa7s2jsKnhuK19W1rQueMOvFmuDwZ32-pQcFbpyeLabI_L-cD8fP4bTl8nT-HYaGiYTH0YqA6Zzw2UMLEMltDaJQJ4kCilnLDM8LoTSBlmkBGBhIJMyyU0WKSggYSNy1XvXje3inU9XpTNYVbpG27oUFJc85tDpRyTsUdNY5xos0nXTfdxsUqDptsS0LzHtS-z4y526zVaY_9G_re2zbbv-13XRo0vnbbNXCS5BCcV-AFKtiCM</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1958545141</pqid></control><display><type>article</type><title>Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data</title><source>JSTOR Archival Journals and Primary Sources Collection</source><source>Oxford Journals Online</source><creator>Wen, Dingqiao ; Nakhleh, Luay</creator><contributor>Kubatko, Laura</contributor><creatorcontrib>Wen, Dingqiao ; Nakhleh, Luay ; Kubatko, Laura</creatorcontrib><description>The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). 
We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only coestimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of “intermixture,” we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.</description><identifier>ISSN: 1063-5157</identifier><identifier>EISSN: 1076-836X</identifier><identifier>DOI: 10.1093/sysbio/syx085</identifier><identifier>PMID: 29088409</identifier><language>eng</language><publisher>England: Oxford University Press</publisher><subject>Bayes Theorem ; Computer Simulation ; Gene Flow ; Genetic Speciation ; Models, Genetic ; Phylogeny ; REGULAR ARTICLES ; Saccharomyces cerevisiae - classification ; Saccharomyces cerevisiae - genetics</subject><ispartof>Systematic biology, 2018-05, Vol.67 (3), p.439-457</ispartof><rights>The Author(s) 2017</rights><rights>The Author(s) 2017. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. 
For Permissions, please email: journals.permissions@oup.com 2017</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173</citedby><cites>FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.jstor.org/stable/pdf/26581969$$EPDF$$P50$$Gjstor$$H</linktopdf><linktohtml>$$Uhttps://www.jstor.org/stable/26581969$$EHTML$$P50$$Gjstor$$H</linktohtml><link.rule.ids>314,777,781,27905,27906,58219,58452</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/29088409$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><contributor>Kubatko, Laura</contributor><creatorcontrib>Wen, Dingqiao</creatorcontrib><creatorcontrib>Nakhleh, Luay</creatorcontrib><title>Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data</title><title>Systematic biology</title><addtitle>Syst Biol</addtitle><description>The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. 
We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only coestimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of “intermixture,” we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.</description><subject>Bayes Theorem</subject><subject>Computer Simulation</subject><subject>Gene Flow</subject><subject>Genetic Speciation</subject><subject>Models, Genetic</subject><subject>Phylogeny</subject><subject>REGULAR ARTICLES</subject><subject>Saccharomyces cerevisiae - classification</subject><subject>Saccharomyces cerevisiae - genetics</subject><issn>1063-5157</issn><issn>1076-836X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2018</creationdate><recordtype>article</recordtype><recordid>eNqFkMFLwzAUh4Mobk6PHpUevVTzmiZNjjJ1ChOHTvBW0vR1dnTNbFJw_70dne7o6b0HH9-P9yPkHOg1UMVu3MZlpe3GN5X8gAyBJiKUTHwcbnfBQg48GZAT55aUAggOx2QQKSplTNWQzMYWnS9X2pf1InhFX5q20h6D2eemsgusS3SBrvNggjUG8wa7s2jsKnhuK19W1rQueMOvFmuDwZ32-pQcFbpyeLabI_L-cD8fP4bTl8nT-HYaGiYTH0YqA6Zzw2UMLEMltDaJQJ4kCilnLDM8LoTSBlmkBGBhIJMyyU0WKSggYSNy1XvXje3inU9XpTNYVbpG27oUFJc85tDpRyTsUdNY5xos0nXTfdxsUqDptsS0LzHtS-z4y526zVaY_9G_re2zbbv-13XRo0vnbbNXCS5BCcV-AFKtiCM</recordid><startdate>20180501</startdate><enddate>20180501</enddate><creator>Wen, 
Dingqiao</creator><creator>Nakhleh, Luay</creator><general>Oxford University Press</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7X8</scope></search><sort><creationdate>20180501</creationdate><title>Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data</title><author>Wen, Dingqiao ; Nakhleh, Luay</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2018</creationdate><topic>Bayes Theorem</topic><topic>Computer Simulation</topic><topic>Gene Flow</topic><topic>Genetic Speciation</topic><topic>Models, Genetic</topic><topic>Phylogeny</topic><topic>REGULAR ARTICLES</topic><topic>Saccharomyces cerevisiae - classification</topic><topic>Saccharomyces cerevisiae - genetics</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Wen, Dingqiao</creatorcontrib><creatorcontrib>Nakhleh, Luay</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>MEDLINE - Academic</collection><jtitle>Systematic biology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Wen, Dingqiao</au><au>Nakhleh, Luay</au><au>Kubatko, Laura</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data</atitle><jtitle>Systematic biology</jtitle><addtitle>Syst 
Biol</addtitle><date>2018-05-01</date><risdate>2018</risdate><volume>67</volume><issue>3</issue><spage>439</spage><epage>457</epage><pages>439-457</pages><issn>1063-5157</issn><eissn>1076-836X</eissn><abstract>The multispecies network coalescent (MSNC) is a stochastic process that captures how gene trees grow within the branches of a phylogenetic network. Coupling the MSNC with a stochastic mutational process that operates along the branches of the gene trees gives rise to a generative model of how multiple loci from within and across species evolve in the presence of both incomplete lineage sorting (ILS) and reticulation (e.g., hybridization). We report on a Bayesian method for sampling the parameters of this generative model, including the species phylogeny, gene trees, divergence times, and population sizes, from DNA sequences of multiple independent loci. We demonstrate the utility of our method by analyzing simulated data and reanalyzing an empirical data set. Our results demonstrate the significance of not only coestimating species phylogenies and gene trees, but also accounting for reticulation and ILS simultaneously. In particular, we show that when gene flow occurs, our method accurately estimates the evolutionary histories, coalescence times, and divergence times. Tree inference methods, on the other hand, underestimate divergence times and overestimate coalescence times when the evolutionary history is reticulate. While the MSNC corresponds to an abstract model of “intermixture,” we study the performance of the model and method on simulated data generated under a gene flow model. We show that the method accurately infers the most recent time at which gene flow occurs. Finally, we demonstrate the application of the new method to a 106-locus yeast data set.</abstract><cop>England</cop><pub>Oxford University Press</pub><pmid>29088409</pmid><doi>10.1093/sysbio/syx085</doi><tpages>19</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1063-5157
ispartof Systematic biology, 2018-05, Vol.67 (3), p.439-457
issn 1063-5157
1076-836X
language eng
recordid cdi_proquest_miscellaneous_1958545141
source JSTOR Archival Journals and Primary Sources Collection; Oxford Journals Online
subjects Bayes Theorem
Computer Simulation
Gene Flow
Genetic Speciation
Models, Genetic
Phylogeny
REGULAR ARTICLES
Saccharomyces cerevisiae - classification
Saccharomyces cerevisiae - genetics
title Coestimating Reticulate Phylogenies and Gene Trees from Multilocus Sequence Data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-18T12%3A03%3A57IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-jstor_proqu&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Coestimating%20Reticulate%20Phylogenies%20and%20Gene%20Trees%20from%20Multilocus%20Sequence%20Data&rft.jtitle=Systematic%20biology&rft.au=Wen,%20Dingqiao&rft.date=2018-05-01&rft.volume=67&rft.issue=3&rft.spage=439&rft.epage=457&rft.pages=439-457&rft.issn=1063-5157&rft.eissn=1076-836X&rft_id=info:doi/10.1093/sysbio/syx085&rft_dat=%3Cjstor_proqu%3E26581969%3C/jstor_proqu%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c387t-29b13adc58413be96aac76e5779e0533bc54f69ace32961efc1b887dcb291f173%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1958545141&rft_id=info:pmid/29088409&rft_jstor_id=26581969&rft_oup_id=10.1093/sysbio/syx085&rfr_iscdi=true