PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data
Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require re...
Published in: | BMC bioinformatics 2020-03, Vol.21 (1), p.99-99, Article 99 |
---|---|
Main Authors: | Zhang, Jianbo ; Panthee, Dilip R |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53 |
---|---|
cites | cdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53 |
container_end_page | 99 |
container_issue | 1 |
container_start_page | 99 |
container_title | BMC bioinformatics |
container_volume | 21 |
creator | Zhang, Jianbo ; Panthee, Dilip R |
description | Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost.
We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here.
The significant SNP method detects SNP-trait associations at much lower sequencing coverage than the current methods, reducing sequencing cost by ~80% and making BSA-Seq more accessible to the research community and more applicable to species with large genomes. |
doi_str_mv | 10.1186/s12859-020-3435-8 |
format | article |
fullrecord | <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_ce662acfd3d5434aa2cca40109d1135e</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A617187228</galeid><doaj_id>oai_doaj_org_article_ce662acfd3d5434aa2cca40109d1135e</doaj_id><sourcerecordid>A617187228</sourcerecordid><originalsourceid>FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</originalsourceid><addsrcrecordid>eNptkktv1DAUhSMEoqXwA9igSGxgkWLHz7BAGioeI1UCMd1bN851xkMSt3HSMv8eT6eUDkJe-PWdY_noZNlLSk4p1fJdpKUWVUFKUjDORKEfZceUK1qUlIjHD9ZH2bMYN4RQpYl4mh2xkiZe8eNs8337cbVY4dX7HPLo-8sOcxiaHJ1DO_nrtOvaMPpp3ecujHk9dz-xySO2I7YwTAmGbht9zG8Sk9-sQ4dFi0PoMUFXMw7WD23ewATPsycOuogv7uaT7OLzp4uzr8X5ty_Ls8V5YUXFp6IUoJRWyK0TtSQ16MpRrDm1ggBUpHa1UI47yZiSSlfAa-psKakERZ1gJ9lyb9sE2JjL0fcwbk0Ab24PwtgaGCdvOzQWpSzBuoY1gjMOUFoLnFBSNZQygcnrw97rcq57bCwO0wjdgenhzeDXpg3XRhFJhCqTwZs7gzGkMOJkeh8tdh0MGOZoSqY4o5JrltDX_6CbMI8p3VtKS6appH-pFtIH_OBCetfuTM1CUkW1KkudqNP_UGk02HsbBnQ-nR8I3h4IEjPhr6mFOUazXP04ZOmetWOIcUR3nwclZtdLs--lSb00u16anebVwyDvFX-KyH4DR4rckg</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2378638161</pqid></control><display><type>article</type><title>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Zhang, Jianbo ; Panthee, Dilip R</creator><creatorcontrib>Zhang, Jianbo ; Panthee, Dilip R</creatorcontrib><description>Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. 
The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost.
We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here.
The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-020-3435-8</identifier><identifier>PMID: 32143574</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Bulked segregant analysis, BSA-Seq ; Computer applications ; Databases, Genetic ; Datasets ; DNA sequencing ; Gene mapping ; Gene sequencing ; Genetic aspects ; Genetic polymorphisms ; Genomes ; Genomics ; Genotype & phenotype ; Hypotheses ; Information management ; Methods ; Oryza - genetics ; Polymorphism, Single Nucleotide ; PyBSASeq ; QTL ; Quantitative genetics ; Quantitative Trait Loci ; Simulation ; Single nucleotide polymorphisms ; SNP-trait association ; Software ; Whole Genome Sequencing</subject><ispartof>BMC bioinformatics, 2020-03, Vol.21 (1), p.99-99, Article 99</ispartof><rights>COPYRIGHT 2020 BioMed Central Ltd.</rights><rights>2020. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s). 
2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</citedby><cites>FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</cites><orcidid>0000-0002-8736-512X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7060572/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2378638161?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25731,27901,27902,36989,36990,44566,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32143574$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Jianbo</creatorcontrib><creatorcontrib>Panthee, Dilip R</creatorcontrib><title>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost.
We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here.
The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.</description><subject>Algorithms</subject><subject>Bulked segregant analysis, BSA-Seq</subject><subject>Computer applications</subject><subject>Databases, Genetic</subject><subject>Datasets</subject><subject>DNA sequencing</subject><subject>Gene mapping</subject><subject>Gene sequencing</subject><subject>Genetic aspects</subject><subject>Genetic polymorphisms</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Genotype & phenotype</subject><subject>Hypotheses</subject><subject>Information management</subject><subject>Methods</subject><subject>Oryza - genetics</subject><subject>Polymorphism, Single Nucleotide</subject><subject>PyBSASeq</subject><subject>QTL</subject><subject>Quantitative genetics</subject><subject>Quantitative Trait Loci</subject><subject>Simulation</subject><subject>Single nucleotide polymorphisms</subject><subject>SNP-trait association</subject><subject>Software</subject><subject>Whole Genome 
Sequencing</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNptkktv1DAUhSMEoqXwA9igSGxgkWLHz7BAGioeI1UCMd1bN851xkMSt3HSMv8eT6eUDkJe-PWdY_noZNlLSk4p1fJdpKUWVUFKUjDORKEfZceUK1qUlIjHD9ZH2bMYN4RQpYl4mh2xkiZe8eNs8337cbVY4dX7HPLo-8sOcxiaHJ1DO_nrtOvaMPpp3ecujHk9dz-xySO2I7YwTAmGbht9zG8Sk9-sQ4dFi0PoMUFXMw7WD23ewATPsycOuogv7uaT7OLzp4uzr8X5ty_Ls8V5YUXFp6IUoJRWyK0TtSQ16MpRrDm1ggBUpHa1UI47yZiSSlfAa-psKakERZ1gJ9lyb9sE2JjL0fcwbk0Ab24PwtgaGCdvOzQWpSzBuoY1gjMOUFoLnFBSNZQygcnrw97rcq57bCwO0wjdgenhzeDXpg3XRhFJhCqTwZs7gzGkMOJkeh8tdh0MGOZoSqY4o5JrltDX_6CbMI8p3VtKS6appH-pFtIH_OBCetfuTM1CUkW1KkudqNP_UGk02HsbBnQ-nR8I3h4IEjPhr6mFOUazXP04ZOmetWOIcUR3nwclZtdLs--lSb00u16anebVwyDvFX-KyH4DR4rckg</recordid><startdate>20200306</startdate><enddate>20200306</enddate><creator>Zhang, Jianbo</creator><creator>Panthee, Dilip R</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PJZUB</scope><scope>PKEHL</scope><scope>PPXIY</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-8736-512X</orcidid></search><sort><creationdate>20200306</creationdate><title>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</title><author>Zhang, Jianbo ; Panthee, Dilip R</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Bulked segregant analysis, BSA-Seq</topic><topic>Computer 
applications</topic><topic>Databases, Genetic</topic><topic>Datasets</topic><topic>DNA sequencing</topic><topic>Gene mapping</topic><topic>Gene sequencing</topic><topic>Genetic aspects</topic><topic>Genetic polymorphisms</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Genotype & phenotype</topic><topic>Hypotheses</topic><topic>Information management</topic><topic>Methods</topic><topic>Oryza - genetics</topic><topic>Polymorphism, Single Nucleotide</topic><topic>PyBSASeq</topic><topic>QTL</topic><topic>Quantitative genetics</topic><topic>Quantitative Trait Loci</topic><topic>Simulation</topic><topic>Single nucleotide polymorphisms</topic><topic>SNP-trait association</topic><topic>Software</topic><topic>Whole Genome Sequencing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Jianbo</creatorcontrib><creatorcontrib>Panthee, Dilip R</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni 
Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies & Aerospace Database (1962 - current)</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Biological Science Database</collection><collection>ProQuest advanced technologies & aerospace journals</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly Available Content 
Database</collection><collection>ProQuest Health & Medical Research Collection</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Health & Nursing</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied & Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Jianbo</au><au>Panthee, Dilip R</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2020-03-06</date><risdate>2020</risdate><volume>21</volume><issue>1</issue><spage>99</spage><epage>99</epage><pages>99-99</pages><artnum>99</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost.
We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here.
The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>32143574</pmid><doi>10.1186/s12859-020-3435-8</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-8736-512X</orcidid><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2020-03, Vol.21 (1), p.99-99, Article 99 |
issn | 1471-2105 1471-2105 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_ce662acfd3d5434aa2cca40109d1135e |
source | Publicly Available Content Database; PubMed Central |
subjects | Algorithms ; Bulked segregant analysis, BSA-Seq ; Computer applications ; Databases, Genetic ; Datasets ; DNA sequencing ; Gene mapping ; Gene sequencing ; Genetic aspects ; Genetic polymorphisms ; Genomes ; Genomics ; Genotype & phenotype ; Hypotheses ; Information management ; Methods ; Oryza - genetics ; Polymorphism, Single Nucleotide ; PyBSASeq ; QTL ; Quantitative genetics ; Quantitative Trait Loci ; Simulation ; Single nucleotide polymorphisms ; SNP-trait association ; Software ; Whole Genome Sequencing |
title | PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-22T23%3A34%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PyBSASeq:%20a%20simple%20and%20effective%20algorithm%20for%20bulked%20segregant%20analysis%20with%20whole-genome%20sequencing%20data&rft.jtitle=BMC%20bioinformatics&rft.au=Zhang,%20Jianbo&rft.date=2020-03-06&rft.volume=21&rft.issue=1&rft.spage=99&rft.epage=99&rft.pages=99-99&rft.artnum=99&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-020-3435-8&rft_dat=%3Cgale_doaj_%3EA617187228%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2378638161&rft_id=info:pmid/32143574&rft_galeid=A617187228&rfr_iscdi=true |
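
The description in this record outlines the significant SNP method in two steps: each SNP is tested with Fisher's exact test, and the ratio of significant SNPs (sSNPs) to total SNPs in a chromosomal interval is then used to flag candidate regions. The following is a minimal Python sketch of that idea only; it is not the PyBSASeq code. The function names, the `(position, ref1, alt1, ref2, alt2)` input layout (allele read counts in the two bulks), the fixed non-overlapping windows, and the `alpha=0.01` cutoff are all assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Hypothetical helper: sums the hypergeometric probabilities of every table
    with the same margins that is no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def ssnp_ratio_by_window(snps, window_size, alpha=0.01):
    """Return {window_start: (n_ssnp, n_total, ratio)} for one chromosome.

    snps: iterable of (position, ref1, alt1, ref2, alt2), where ref/alt are
    allele read counts in bulk 1 and bulk 2 (assumed layout).
    window_size and alpha are illustrative parameters, not published defaults.
    """
    counts = {}
    for pos, ref1, alt1, ref2, alt2 in snps:
        start = (pos // window_size) * window_size
        total, sig = counts.get(start, (0, 0))
        p = fisher_exact_two_sided(ref1, alt1, ref2, alt2)
        counts[start] = (total + 1, sig + (p < alpha))
    return {start: (sig, total, sig / total)
            for start, (total, sig) in sorted(counts.items())}


if __name__ == "__main__":
    # Made-up read counts, for demonstration only.
    demo = [(100, 10, 0, 0, 10), (200, 5, 5, 5, 5), (1500, 6, 4, 5, 5)]
    for start, (n_sig, n_all, ratio) in ssnp_ratio_by_window(demo, 1000).items():
        print(start, n_sig, n_all, ratio)
```

Windows with a high sSNP/total SNP ratio would be the candidate regions; how a significance threshold for that ratio is chosen is not stated in this record, so the sketch stops at reporting the ratio.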