isolateR: an R package for generating microbial libraries from Sanger sequencing data

Sanger sequencing of taxonomic marker genes (e.g., 16S/18S/ITS/rpoB/cpn60) represents the leading method for identifying a wide range of microorganisms including bacteria, archaea, and fungi. However, the manual processing of sequence data and limitations associated with conventional BLAST searches...

Bibliographic Details
Published in: Bioinformatics (Oxford, England), 2024-07, Vol.40 (7)
Main Authors: Daisley, Brendan, Vancuren, Sarah J, Brettingham, Dylan J L, Wilde, Jacob, Renwick, Simone, Macpherson, Christine, Good, David A, Botschner, Alexander J, Yen, Sandi, Hill, Janet E, Sorbara, Matthew T, Allen-Vercoe, Emma
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c244t-b8cf3a3d381e198f7342c5026710f82f6c0c040c0ce81e7170c0057089d2da7a3
container_end_page
container_issue 7
container_start_page
container_title Bioinformatics (Oxford, England)
container_volume 40
creator Daisley, Brendan
Vancuren, Sarah J
Brettingham, Dylan J L
Wilde, Jacob
Renwick, Simone
Macpherson, Christine
Good, David A
Botschner, Alexander J
Yen, Sandi
Hill, Janet E
Sorbara, Matthew T
Allen-Vercoe, Emma
description Sanger sequencing of taxonomic marker genes (e.g., 16S/18S/ITS/rpoB/cpn60) represents the leading method for identifying a wide range of microorganisms including bacteria, archaea, and fungi. However, the manual processing of sequence data and limitations associated with conventional BLAST searches impede the efficient generation of strain libraries essential for cataloging microbial diversity and discovering novel species. isolateR addresses these challenges by implementing a standardized and scalable three-step pipeline that includes: 1) automated batch processing of Sanger sequence files, 2) taxonomic classification via global alignment to type strain databases in accordance with the latest international nomenclature standards, and 3) straightforward creation of strain libraries and handling of clonal isolates, with the ability to set customizable sequence dereplication thresholds and combine data from multiple sequencing runs into a single library. The tool's user-friendly design also features interactive HTML outputs that simplify data exploration and analysis. Additionally, in silico benchmarking done on two comprehensive human gut genome catalogues (IMGG and Hadza hunter-gatherer populations) showcases the proficiency of isolateR in uncovering and cataloging the nuanced spectrum of microbial diversity, advocating for a more targeted and granular exploration within individual hosts to achieve the highest strain-level resolution possible when generating culture collections. isolateR is available at: https://github.com/bdaisley/isolateR. Supplementary data are available at Bioinformatics online.
doi_str_mv 10.1093/bioinformatics/btae448
format article
contributor Schwartz, Russell
eissn 1367-4811
pmid 38991828
publisher England: Oxford University Press
rights The Author(s) 2024. Published by Oxford University Press.
toplevel peer_reviewed; online_resources
oa free_for_read
orcidid 0000-0001-5999-5792; 0000-0002-8716-327X; 0009-0000-5330-361X; 0000-0002-2187-6277; 0000-0002-6145-9670; 0000-0003-1882-2910
linktopdf https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11254302/pdf/
linktohtml https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11254302/
backlink https://www.ncbi.nlm.nih.gov/pubmed/38991828
startdate 20240711
collection PubMed; CrossRef; MEDLINE - Academic; PubMed Central (Full Participant titles)
sourcetype Open Access Repository
sourcerecordid 3079171589
fulltext fulltext
identifier ISSN: 1367-4811
ispartof Bioinformatics (Oxford, England), 2024-07, Vol.40 (7)
issn 1367-4811
1367-4803
language eng
recordid cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_11254302
source OUP_Oxford University Press OA Journals; PubMed Central
subjects Original Paper
title isolateR: an R package for generating microbial libraries from Sanger sequencing data
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T13%3A28%3A16IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=isolateR:%20an%20R%20package%20for%20generating%20microbial%20libraries%20from%20Sanger%20sequencing%20data&rft.jtitle=Bioinformatics%20(Oxford,%20England)&rft.au=Daisley,%20Brendan&rft.date=2024-07-11&rft.volume=40&rft.issue=7&rft.issn=1367-4811&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/btae448&rft_dat=%3Cproquest_pubme%3E3079171589%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c244t-b8cf3a3d381e198f7342c5026710f82f6c0c040c0ce81e7170c0057089d2da7a3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3079171589&rft_id=info:pmid/38991828&rfr_iscdi=true