Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes

Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon mo...

Bibliographic Details
Published in: PloS one 2021-03, Vol.16 (3), p.e0248337-e0248337
Main Authors: Lucaci, Alexander G, Wisotsky, Sadie R, Shank, Stephen D, Weaver, Steven, Kosakovsky Pond, Sergei L
Format: Article
Language: English
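Subjects: Analysis; Biology and Life Sciences; Codon; Earth Sciences; Evolution; Physical Sciences; Research and Analysis Methods; Serine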
cited_by cdi_FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3
cites cdi_FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3
container_end_page e0248337
container_issue 3
container_start_page e0248337
container_title PloS one
container_volume 16
creator Lucaci, Alexander G
Wisotsky, Sadie R
Shank, Stephen D
Weaver, Steven
Kosakovsky Pond, Sergei L
description Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single-nucleotide changes have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42,000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23% prefer models with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios, may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.
doi_str_mv 10.1371/journal.pone.0248337
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_8c1c63c09e00483a8808e2e2339075da</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A654753544</galeid><doaj_id>oai_doaj_org_article_8c1c63c09e00483a8808e2e2339075da</doaj_id><sourcerecordid>A654753544</sourcerecordid><originalsourceid>FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3</originalsourceid><addsrcrecordid>eNqNkl2L1DAYhYso7rr6D0QKguhFx6RJms5eCMuy6sDCgp8XXoRM8rbNkDY1SWX992a24zIFL6QpDelzTl9OT5Y9x2iFCcdvd27yg7Sr0Q2wQiWtCeEPslO8JmVRlYg8PNqfZE9C2CHESF1Vj7OThGKMODrNflzdRi_zrQyQdyaG8_y70RBGD1Ln0I_GGyVtHqZxdD7mjfO5GUKUQ1rgppD3k41mtFAMk7LgYlLnqpNDC-Fp9qiRNsCzw_Ms-_r-6svlx-L65sPm8uK6UBVFsSA6TYNQDVzzBmlW1hQzWjPAGCqMNOUKGJa11hojvNVcUtpgnu6mShpJzrLN7Kud3InRm17638JJI-4OnG-F9NGk8UStsKqIQmtAKCUm6zp9t4SSkDXi7M7r3ew1TtsetIIhxWMXpss3g-lE634JvmaUoDoZvD4YePdzghBFb4ICa-e8RMkQLhmvGE_oyxltZRrNDI1LjmqPi4uKUc4IozRRq39Q6dLQG5V-fmPS-ULwZiFITITb2MopBLH5_On_2ZtvS_bVEduBtLELzk7RuCEsQTqDyrsQPDT38WEk9t0Vh-6KfXfFobtJ9uI4-nvR37KSP7Oz64w</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2501257657</pqid></control><display><type>article</type><title>Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Lucaci, Alexander G ; Wisotsky, Sadie R ; Shank, Stephen D ; Weaver, Steven ; Kosakovsky Pond, Sergei L</creator><creatorcontrib>Lucaci, Alexander G ; Wisotsky, Sadie R ; Shank, Stephen D ; Weaver, Steven ; Kosakovsky Pond, Sergei L</creatorcontrib><description>Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, 
such as alignment errors. Codon models continue to posit that only single nucleotide change have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42, 000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23%-with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. 
We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.</description><identifier>ISSN: 1932-6203</identifier><identifier>EISSN: 1932-6203</identifier><identifier>DOI: 10.1371/journal.pone.0248337</identifier><identifier>PMID: 33711070</identifier><language>eng</language><publisher>United States: Public Library of Science</publisher><subject>Analysis ; Biology and Life Sciences ; Codon ; Earth Sciences ; Evolution ; Physical Sciences ; Research and Analysis Methods ; Serine</subject><ispartof>PloS one, 2021-03, Vol.16 (3), p.e0248337-e0248337</ispartof><rights>COPYRIGHT 2021 Public Library of Science</rights><rights>2021 Lucaci et al 2021 Lucaci et al</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3</citedby><cites>FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3</cites><orcidid>0000-0002-4896-6088 ; 0000-0002-6931-7191</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7954308/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7954308/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27901,27902,36990,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33711070$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Lucaci, Alexander G</creatorcontrib><creatorcontrib>Wisotsky, Sadie R</creatorcontrib><creatorcontrib>Shank, Stephen D</creatorcontrib><creatorcontrib>Weaver, 
Steven</creatorcontrib><creatorcontrib>Kosakovsky Pond, Sergei L</creatorcontrib><title>Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes</title><title>PloS one</title><addtitle>PLoS One</addtitle><description>Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single nucleotide change have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42, 000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23%-with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. 
We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.</description><subject>Analysis</subject><subject>Biology and Life Sciences</subject><subject>Codon</subject><subject>Earth Sciences</subject><subject>Evolution</subject><subject>Physical Sciences</subject><subject>Research and Analysis Methods</subject><subject>Serine</subject><issn>1932-6203</issn><issn>1932-6203</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNqNkl2L1DAYhYso7rr6D0QKguhFx6RJms5eCMuy6sDCgp8XXoRM8rbNkDY1SWX992a24zIFL6QpDelzTl9OT5Y9x2iFCcdvd27yg7Sr0Q2wQiWtCeEPslO8JmVRlYg8PNqfZE9C2CHESF1Vj7OThGKMODrNflzdRi_zrQyQdyaG8_y70RBGD1Ln0I_GGyVtHqZxdD7mjfO5GUKUQ1rgppD3k41mtFAMk7LgYlLnqpNDC-Fp9qiRNsCzw_Ms-_r-6svlx-L65sPm8uK6UBVFsSA6TYNQDVzzBmlW1hQzWjPAGCqMNOUKGJa11hojvNVcUtpgnu6mShpJzrLN7Kud3InRm17638JJI-4OnG-F9NGk8UStsKqIQmtAKCUm6zp9t4SSkDXi7M7r3ew1TtsetIIhxWMXpss3g-lE634JvmaUoDoZvD4YePdzghBFb4ICa-e8RMkQLhmvGE_oyxltZRrNDI1LjmqPi4uKUc4IozRRq39Q6dLQG5V-fmPS-ULwZiFITITb2MopBLH5_On_2ZtvS_bVEduBtLELzk7RuCEsQTqDyrsQPDT38WEk9t0Vh-6KfXfFobtJ9uI4-nvR37KSP7Oz64w</recordid><startdate>20210312</startdate><enddate>20210312</enddate><creator>Lucaci, Alexander G</creator><creator>Wisotsky, Sadie R</creator><creator>Shank, Stephen D</creator><creator>Weaver, Steven</creator><creator>Kosakovsky Pond, Sergei L</creator><general>Public Library of Science</general><general>Public Library of Science (PLoS)</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>IOV</scope><scope>ISR</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-4896-6088</orcidid><orcidid>https://orcid.org/0000-0002-6931-7191</orcidid></search><sort><creationdate>20210312</creationdate><title>Extra base hits: Widespread empirical support for 
instantaneous multiple-nucleotide changes</title><author>Lucaci, Alexander G ; Wisotsky, Sadie R ; Shank, Stephen D ; Weaver, Steven ; Kosakovsky Pond, Sergei L</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Analysis</topic><topic>Biology and Life Sciences</topic><topic>Codon</topic><topic>Earth Sciences</topic><topic>Evolution</topic><topic>Physical Sciences</topic><topic>Research and Analysis Methods</topic><topic>Serine</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Lucaci, Alexander G</creatorcontrib><creatorcontrib>Wisotsky, Sadie R</creatorcontrib><creatorcontrib>Shank, Stephen D</creatorcontrib><creatorcontrib>Weaver, Steven</creatorcontrib><creatorcontrib>Kosakovsky Pond, Sergei L</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>Opposing Viewpoints In Context</collection><collection>Science in Context</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>PloS one</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Lucaci, Alexander G</au><au>Wisotsky, Sadie R</au><au>Shank, Stephen D</au><au>Weaver, Steven</au><au>Kosakovsky Pond, Sergei L</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes</atitle><jtitle>PloS one</jtitle><addtitle>PLoS 
One</addtitle><date>2021-03-12</date><risdate>2021</risdate><volume>16</volume><issue>3</issue><spage>e0248337</spage><epage>e0248337</epage><pages>e0248337-e0248337</pages><issn>1932-6203</issn><eissn>1932-6203</eissn><abstract>Despite many attempts to introduce evolutionary models that permit substitutions to instantly alter more than one nucleotide in a codon, the prevailing wisdom remains that such changes are rare and generally negligible or are reflective of non-biological artifacts, such as alignment errors. Codon models continue to posit that only single nucleotide change have non-zero rates. Here, we develop and test a simple hierarchy of codon-substitution models with non-zero evolutionary rates for only one-nucleotide (1H), one- and two-nucleotide (2H), or any (3H) codon substitutions. Using over 42, 000 empirical alignments, we find widespread statistical support for multiple hits: 61% of alignments prefer models with 2H allowed, and 23%-with 3H allowed. Analyses of simulated data suggest that these results are not likely to be due to simple artifacts such as model misspecification or alignment errors. Further modeling reveals that synonymous codon island jumping among codons encoding serine, especially along short branches, contributes significantly to this 3H signal. While serine codons were prominently involved in multiple-hit substitutions, there were other common exchanges contributing to better model fit. It appears that a small subset of sites in most alignments have unusual evolutionary dynamics not well explained by existing model formalisms, and that commonly estimated quantities, such as dN/dS ratios may be biased by model misspecification. Our findings highlight the need for continued evaluation of assumptions underlying workhorse evolutionary models and subsequent evolutionary inference techniques. 
We provide a software implementation for evolutionary biologists to assess the potential impact of extra base hits in their data in the HyPhy package and in the Datamonkey.org server.</abstract><cop>United States</cop><pub>Public Library of Science</pub><pmid>33711070</pmid><doi>10.1371/journal.pone.0248337</doi><tpages>e0248337</tpages><orcidid>https://orcid.org/0000-0002-4896-6088</orcidid><orcidid>https://orcid.org/0000-0002-6931-7191</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1932-6203
ispartof PloS one, 2021-03, Vol.16 (3), p.e0248337-e0248337
issn 1932-6203
1932-6203
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_8c1c63c09e00483a8808e2e2339075da
source Publicly Available Content Database; PubMed Central
subjects Analysis
Biology and Life Sciences
Codon
Earth Sciences
Evolution
Physical Sciences
Research and Analysis Methods
Serine
title Extra base hits: Widespread empirical support for instantaneous multiple-nucleotide changes
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T11%3A16%3A12IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Extra%20base%20hits:%20Widespread%20empirical%20support%20for%20instantaneous%20multiple-nucleotide%20changes&rft.jtitle=PloS%20one&rft.au=Lucaci,%20Alexander%20G&rft.date=2021-03-12&rft.volume=16&rft.issue=3&rft.spage=e0248337&rft.epage=e0248337&rft.pages=e0248337-e0248337&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0248337&rft_dat=%3Cgale_doaj_%3EA654753544%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c640t-3d371008e7d7f0d528415485e11e610d47ce51a8ddd101bd7a44f174f1f6e7da3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2501257657&rft_id=info:pmid/33711070&rft_galeid=A654753544&rfr_iscdi=true