ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language [version 1; peer review: 1 approved with reservations]
A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors coupled with low and variable coverage hamper the identification of genotypes and variants and the estimation of population genetic parameters. Metho...
Published in: | F1000 research 2022, Vol.11, p.126 |
---|---|
Main Authors: | Mas-Sandoval, Alex; Jin, Chenyu; Fracassetti, Marco; Fumagalli, Matteo |
Format: | Article |
Language: | English |
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c1491-864927baafef2aefbd0042032401270f0af10cba1d0bc7c1e413370942cf88933 |
container_end_page | |
container_issue | |
container_start_page | 126 |
container_title | F1000 research |
container_volume | 11 |
creator | Mas-Sandoval, Alex; Jin, Chenyu; Fracassetti, Marco; Fumagalli, Matteo |
description | A sound analysis of DNA sequencing data is important to extract meaningful information and infer quantities of interest. Sequencing and mapping errors, coupled with low and variable coverage, hamper the identification of genotypes and variants and the estimation of population genetic parameters. Currently available methods and implementations for estimating population genetic parameters from sequencing data are either suitable only for the analysis of genomes from model organisms, require moderate sequencing coverage, or are not easily adaptable to specific applications. To address these issues, we introduce ngsJulia, a collection of templates and functions in the Julia language to process short-read sequencing data for population genetic analysis. We further describe two implementations, ngsPool and ngsPloidy, for the analysis of pooled sequencing data and polyploid genomes, respectively. Through simulations, we illustrate the performance of these implementations in estimating various population genetic parameters, using both established and novel statistical methods. These results inform optimal experimental design and demonstrate the applicability of the methods in ngsJulia to estimate parameters of interest even from low-coverage sequencing data. ngsJulia provides users with a flexible and efficient framework for ad hoc analysis of sequencing data. ngsJulia is available from: https://github.com/mfumagalli/ngsJulia |
doi_str_mv | 10.12688/f1000research.104368.1 |
format | article |
orcidid | 0000-0002-4084-2953; 0000-0002-2962-2669; 0000-0002-1712-9404 |
rights | Copyright: © 2022 Mas-Sandoval A et al. |
fulltext | fulltext |
identifier | ISSN: 2046-1402 |
ispartof | F1000 research, 2022, Vol.11, p.126 |
issn | 2046-1402 2046-1402 |
language | eng |
recordid | cdi_crossref_primary_10_12688_f1000research_104368_1 |
source | Publicly Available Content Database; PubMed Central |
title | ngsJulia: population genetic analysis of next-generation DNA sequencing data with Julia language [version 1; peer review: 1 approved with reservations] |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-04T15%3A12%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-faculty1000_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=ngsJulia:%20population%20genetic%20analysis%20of%20next-generation%20DNA%20sequencing%20data%20with%20Julia%20language%20%5Bversion%201;%20peer%20review:%201%20approved%20with%20reservations%5D&rft.jtitle=F1000%20research&rft.au=Mas-Sandoval,%20Alex&rft.date=2022&rft.volume=11&rft.spage=126&rft.pages=126-&rft.issn=2046-1402&rft.eissn=2046-1402&rft_id=info:doi/10.12688/f1000research.104368.1&rft_dat=%3Cfaculty1000_cross%3E10_12688_f1000research_104368_1%3C/faculty1000_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c1491-864927baafef2aefbd0042032401270f0af10cba1d0bc7c1e413370942cf88933%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true |