Timesweeper: accurately identifying selective sweeps using population genomic time series
Abstract Despite decades of research, identifying selective sweeps, the genomic footprints of positive selection, remains a core problem in population genetics. Of the myriad methods that have been developed to tackle this task, few are designed to leverage the potential of genomic time-series data....
Published in: | Genetics (Austin) 2023-07, Vol.224 (3) |
---|---|
Main Authors: | Whitehouse, Logan S; Schrider, Daniel R |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c461t-730375d7d9f9465669c2c8564c18d218dc8a2ec2297e510e6b2a1feba9bffb73 |
---|---|
cites | cdi_FETCH-LOGICAL-c461t-730375d7d9f9465669c2c8564c18d218dc8a2ec2297e510e6b2a1feba9bffb73 |
container_end_page | |
container_issue | 3 |
container_start_page | |
container_title | Genetics (Austin) |
container_volume | 224 |
creator | Whitehouse, Logan S; Schrider, Daniel R |
description | Abstract
Despite decades of research, identifying selective sweeps, the genomic footprints of positive selection, remains a core problem in population genetics. Of the myriad methods that have been developed to tackle this task, few are designed to leverage the potential of genomic time-series data. This is because in most population genetic studies of natural populations, only a single period of time can be sampled. Recent advancements in sequencing technology, including improvements in extracting and sequencing ancient DNA, have made repeated samplings of a population possible, allowing for more direct analysis of recent evolutionary dynamics. Serial sampling of organisms with shorter generation times has also become more feasible due to improvements in the cost and throughput of sequencing. With these advances in mind, here we present Timesweeper, a fast and accurate convolutional neural network-based tool for identifying selective sweeps in data consisting of multiple genomic samplings of a population over time. Timesweeper analyzes population genomic time-series data by first simulating training data under a demographic model appropriate for the data of interest, training a one-dimensional convolutional neural network on said simulations, and inferring which polymorphisms in this serialized data set were the direct target of a completed or ongoing selective sweep. We show that Timesweeper is accurate under multiple simulated demographic and sampling scenarios, identifies selected variants with high resolution, and estimates selection coefficients more accurately than existing methods. In sum, we show that more accurate inferences about natural selection are possible when genomic time-series data are available; such data will continue to proliferate in coming years due to both the sequencing of ancient samples and repeated samplings of extant populations with faster generation times, as well as experimentally evolved populations where time-series data are often generated. Methodological advances such as Timesweeper thus have the potential to help resolve the controversy over the role of positive selection in the genome. We provide Timesweeper as a Python package for use by the community.
Despite decades of research, detecting genomic loci responsible for adaptation remains challenging; however, improvements in DNA sequencing make it possible to measure genomic variation across multiple timepoints. Such data could aid in detecting selective sweeps (in which an adaptive mutation rapidly increases in frequency), thereby revealing loci responding to natural selection. Whitehouse and Schrider present a machine learning method called Timesweeper that accurately detects sweeps from time-series data, including unphased or un-genotyped data provided that allele frequency estimates are obtainable. |
doi_str_mv | 10.1093/genetics/iyad084 |
format | article |
contributor | Coop, G |
date | 2023-07-06 |
eissn | 1943-2631 |
oa | free_for_read |
orcid | https://orcid.org/0000-0001-5249-4151 |
pmid | 37157914 |
publisher | Oxford University Press |
rights | The Author(s) 2023. Published by Oxford University Press on behalf of The Genetics Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com |
fulltext | fulltext |
identifier | ISSN: 1943-2631 |
ispartof | Genetics (Austin), 2023-07, Vol.224 (3) |
issn | 1943-2631; 0016-6731 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_10324941 |
source | Freely Accessible Science Journals; Oxford Journals Online; Alma/SFX Local Collection |
subjects | Artificial neural networks; Demographics; Demography; DNA sequencing; Genetics; Genetics, Population; Genomics; Investigation; Metagenomics; Natural populations; Natural selection; Neural networks; Polymorphism, Genetic; Population; Population genetics; Population studies; Populations; Positive selection; Sampling; Selection, Genetic; Simulation; Time Factors; Time series |
title | Timesweeper: accurately identifying selective sweeps using population genomic time series |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-26T09%3A53%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Timesweeper:%20accurately%20identifying%20selective%20sweeps%20using%20population%20genomic%20time%20series&rft.jtitle=Genetics%20(Austin)&rft.au=Whitehouse,%20Logan%20S&rft.date=2023-07-06&rft.volume=224&rft.issue=3&rft.issn=1943-2631&rft.eissn=1943-2631&rft_id=info:doi/10.1093/genetics/iyad084&rft_dat=%3Cproquest_pubme%3E3050377257%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c461t-730375d7d9f9465669c2c8564c18d218dc8a2ec2297e510e6b2a1feba9bffb73%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3050377257&rft_id=info:pmid/37157914&rft_oup_id=10.1093/genetics/iyad084&rfr_iscdi=true |