
PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data

Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require re...

Bibliographic Details
Published in: BMC bioinformatics, 2020-03, Vol.21 (1), p.99-99, Article 99
Main Authors: Zhang, Jianbo; Panthee, Dilip R
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53
cites cdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53
container_end_page 99
container_issue 1
container_start_page 99
container_title BMC bioinformatics
container_volume 21
creator Zhang, Jianbo
Panthee, Dilip R
description Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost. We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here. The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.
doi_str_mv 10.1186/s12859-020-3435-8
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_ce662acfd3d5434aa2cca40109d1135e</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A617187228</galeid><doaj_id>oai_doaj_org_article_ce662acfd3d5434aa2cca40109d1135e</doaj_id><sourcerecordid>A617187228</sourcerecordid><originalsourceid>FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</originalsourceid><addsrcrecordid>eNptkktv1DAUhSMEoqXwA9igSGxgkWLHz7BAGioeI1UCMd1bN851xkMSt3HSMv8eT6eUDkJe-PWdY_noZNlLSk4p1fJdpKUWVUFKUjDORKEfZceUK1qUlIjHD9ZH2bMYN4RQpYl4mh2xkiZe8eNs8337cbVY4dX7HPLo-8sOcxiaHJ1DO_nrtOvaMPpp3ecujHk9dz-xySO2I7YwTAmGbht9zG8Sk9-sQ4dFi0PoMUFXMw7WD23ewATPsycOuogv7uaT7OLzp4uzr8X5ty_Ls8V5YUXFp6IUoJRWyK0TtSQ16MpRrDm1ggBUpHa1UI47yZiSSlfAa-psKakERZ1gJ9lyb9sE2JjL0fcwbk0Ab24PwtgaGCdvOzQWpSzBuoY1gjMOUFoLnFBSNZQygcnrw97rcq57bCwO0wjdgenhzeDXpg3XRhFJhCqTwZs7gzGkMOJkeh8tdh0MGOZoSqY4o5JrltDX_6CbMI8p3VtKS6appH-pFtIH_OBCetfuTM1CUkW1KkudqNP_UGk02HsbBnQ-nR8I3h4IEjPhr6mFOUazXP04ZOmetWOIcUR3nwclZtdLs--lSb00u16anebVwyDvFX-KyH4DR4rckg</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2378638161</pqid></control><display><type>article</type><title>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</title><source>Publicly Available Content Database</source><source>PubMed Central</source><creator>Zhang, Jianbo ; Panthee, Dilip R</creator><creatorcontrib>Zhang, Jianbo ; Panthee, Dilip R</creatorcontrib><description>Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. 
The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost. We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here. The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/s12859-020-3435-8</identifier><identifier>PMID: 32143574</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Bulked segregant analysis, BSA-Seq ; Computer applications ; Databases, Genetic ; Datasets ; DNA sequencing ; Gene mapping ; Gene sequencing ; Genetic aspects ; Genetic polymorphisms ; Genomes ; Genomics ; Genotype &amp; phenotype ; Hypotheses ; Information management ; Methods ; Oryza - genetics ; Polymorphism, Single Nucleotide ; PyBSASeq ; QTL ; Quantitative genetics ; Quantitative Trait Loci ; Simulation ; Single nucleotide polymorphisms ; SNP-trait association ; Software ; Whole Genome Sequencing</subject><ispartof>BMC bioinformatics, 2020-03, Vol.21 (1), p.99-99, Article 
99</ispartof><rights>COPYRIGHT 2020 BioMed Central Ltd.</rights><rights>2020. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s). 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</citedby><cites>FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</cites><orcidid>0000-0002-8736-512X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7060572/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2378638161?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,25731,27901,27902,36989,36990,44566,53766,53768</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/32143574$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Zhang, Jianbo</creatorcontrib><creatorcontrib>Panthee, Dilip R</creatorcontrib><title>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. 
The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost. We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here. The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.</description><subject>Algorithms</subject><subject>Bulked segregant analysis, BSA-Seq</subject><subject>Computer applications</subject><subject>Databases, Genetic</subject><subject>Datasets</subject><subject>DNA sequencing</subject><subject>Gene mapping</subject><subject>Gene sequencing</subject><subject>Genetic aspects</subject><subject>Genetic polymorphisms</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Genotype &amp; phenotype</subject><subject>Hypotheses</subject><subject>Information management</subject><subject>Methods</subject><subject>Oryza - genetics</subject><subject>Polymorphism, Single Nucleotide</subject><subject>PyBSASeq</subject><subject>QTL</subject><subject>Quantitative genetics</subject><subject>Quantitative Trait Loci</subject><subject>Simulation</subject><subject>Single nucleotide 
polymorphisms</subject><subject>SNP-trait association</subject><subject>Software</subject><subject>Whole Genome Sequencing</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNptkktv1DAUhSMEoqXwA9igSGxgkWLHz7BAGioeI1UCMd1bN851xkMSt3HSMv8eT6eUDkJe-PWdY_noZNlLSk4p1fJdpKUWVUFKUjDORKEfZceUK1qUlIjHD9ZH2bMYN4RQpYl4mh2xkiZe8eNs8337cbVY4dX7HPLo-8sOcxiaHJ1DO_nrtOvaMPpp3ecujHk9dz-xySO2I7YwTAmGbht9zG8Sk9-sQ4dFi0PoMUFXMw7WD23ewATPsycOuogv7uaT7OLzp4uzr8X5ty_Ls8V5YUXFp6IUoJRWyK0TtSQ16MpRrDm1ggBUpHa1UI47yZiSSlfAa-psKakERZ1gJ9lyb9sE2JjL0fcwbk0Ab24PwtgaGCdvOzQWpSzBuoY1gjMOUFoLnFBSNZQygcnrw97rcq57bCwO0wjdgenhzeDXpg3XRhFJhCqTwZs7gzGkMOJkeh8tdh0MGOZoSqY4o5JrltDX_6CbMI8p3VtKS6appH-pFtIH_OBCetfuTM1CUkW1KkudqNP_UGk02HsbBnQ-nR8I3h4IEjPhr6mFOUazXP04ZOmetWOIcUR3nwclZtdLs--lSb00u16anebVwyDvFX-KyH4DR4rckg</recordid><startdate>20200306</startdate><enddate>20200306</enddate><creator>Zhang, Jianbo</creator><creator>Panthee, Dilip R</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>3V.</scope><scope>7QO</scope><scope>7SC</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8AL</scope><scope>8AO</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AEUYN</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K7-</scope><scope>K9.</scope><scope>L7M</scope><scope>LK8</scope><scope>L~C</scope><scope>L~D</scope><scope>M0N</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>P5Z</scope><scope>P62</scope><scope>P64</scope><scope>PHGZM</scope><scope>PHGZT</scope><scope>PIMPY</scope><scope>PJZUB</scope><scope>PKEHL</scope><scope>PPXIY</scope><scope>PQEST</scope><scope>PQGLB</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>Q9U</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0002-8736-512X</orcidid></search><sort><creationdate>20200306</creationdate><title>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</title><author>Zhang, Jianbo ; Panthee, Dilip R</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Algorithms</topic><topic>Bulked segregant analysis, BSA-Seq</topic><topic>Computer 
applications</topic><topic>Databases, Genetic</topic><topic>Datasets</topic><topic>DNA sequencing</topic><topic>Gene mapping</topic><topic>Gene sequencing</topic><topic>Genetic aspects</topic><topic>Genetic polymorphisms</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Genotype &amp; phenotype</topic><topic>Hypotheses</topic><topic>Information management</topic><topic>Methods</topic><topic>Oryza - genetics</topic><topic>Polymorphism, Single Nucleotide</topic><topic>PyBSASeq</topic><topic>QTL</topic><topic>Quantitative genetics</topic><topic>Quantitative Trait Loci</topic><topic>Simulation</topic><topic>Single nucleotide polymorphisms</topic><topic>SNP-trait association</topic><topic>Software</topic><topic>Whole Genome Sequencing</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Jianbo</creatorcontrib><creatorcontrib>Panthee, Dilip R</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>ProQuest Central (Corporate)</collection><collection>Biotechnology Research Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>ProQuest Health &amp; Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni 
Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest One Sustainability</collection><collection>ProQuest Central UK/Ireland</collection><collection>Advanced Technologies &amp; Aerospace Database‎ (1962 - current)</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>Computer Science Database</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Biological Sciences</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Computing Database</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Biological Science Database</collection><collection>ProQuest advanced technologies &amp; aerospace journals</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest Central (New)</collection><collection>ProQuest One Academic (New)</collection><collection>Publicly 
Available Content Database</collection><collection>ProQuest Health &amp; Medical Research Collection</collection><collection>ProQuest One Academic Middle East (New)</collection><collection>ProQuest One Health &amp; Nursing</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Applied &amp; Life Sciences</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Jianbo</au><au>Panthee, Dilip R</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2020-03-06</date><risdate>2020</risdate><volume>21</volume><issue>1</issue><spage>99</spage><epage>99</epage><pages>99-99</pages><artnum>99</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Bulked segregant analysis (BSA), coupled with next-generation sequencing, allows the rapid identification of both qualitative and quantitative trait loci (QTL), and this technique is referred to as BSA-Seq here. The current SNP index method and G-statistic method for BSA-Seq data analysis require relatively high sequencing coverage to detect significant single nucleotide polymorphism (SNP)-trait associations, which leads to high sequencing cost. We developed a simple and effective algorithm for BSA-Seq data analysis and implemented it in Python; the program was named PyBSASeq. 
Using PyBSASeq, the significant SNPs (sSNPs), SNPs likely associated with the trait, were identified via Fisher's exact test, and then the ratio of the sSNPs to total SNPs in a chromosomal interval was used to detect the genomic regions that condition the trait of interest. The results obtained this way are similar to those generated via the current methods, but with more than five times higher sensitivity. This approach was termed the significant SNP method here. The significant SNP method allows the detection of SNP-trait associations at much lower sequencing coverage than the current methods, leading to ~ 80% lower sequencing cost and making BSA-Seq more accessible to the research community and more applicable to the species with a large genome.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>32143574</pmid><doi>10.1186/s12859-020-3435-8</doi><tpages>1</tpages><orcidid>https://orcid.org/0000-0002-8736-512X</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2020-03, Vol.21 (1), p.99-99, Article 99
issn 1471-2105
1471-2105
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_ce662acfd3d5434aa2cca40109d1135e
source Publicly Available Content Database; PubMed Central
subjects Algorithms
Bulked segregant analysis, BSA-Seq
Computer applications
Databases, Genetic
Datasets
DNA sequencing
Gene mapping
Gene sequencing
Genetic aspects
Genetic polymorphisms
Genomes
Genomics
Genotype & phenotype
Hypotheses
Information management
Methods
Oryza - genetics
Polymorphism, Single Nucleotide
PyBSASeq
QTL
Quantitative genetics
Quantitative Trait Loci
Simulation
Single nucleotide polymorphisms
SNP-trait association
Software
Whole Genome Sequencing
title PyBSASeq: a simple and effective algorithm for bulked segregant analysis with whole-genome sequencing data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-22T23%3A34%3A33IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=PyBSASeq:%20a%20simple%20and%20effective%20algorithm%20for%20bulked%20segregant%20analysis%20with%20whole-genome%20sequencing%20data&rft.jtitle=BMC%20bioinformatics&rft.au=Zhang,%20Jianbo&rft.date=2020-03-06&rft.volume=21&rft.issue=1&rft.spage=99&rft.epage=99&rft.pages=99-99&rft.artnum=99&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/s12859-020-3435-8&rft_dat=%3Cgale_doaj_%3EA617187228%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c594t-25a7787e4cf5b60ba89f1eb41c50aa90bfb57f4f63376789a4b1fc2616a71f53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2378638161&rft_id=info:pmid/32143574&rft_galeid=A617187228&rfr_iscdi=true