cognac: rapid generation of concatenated gene alignments for phylogenetic inference from large, bacterial whole genome sequencing datasets
Published in: | BMC bioinformatics 2021-02, Vol.22 (1), p.70-70, Article 70 |
---|---|
Main Authors: | Crawford, Ryan D ; Snitkin, Evan S |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c597t-fce6c6be95b5cf501e90c7c0e367859a4b3fd8f2391115a5de6cfddc6c05b7a93 |
---|---|
cites | cdi_FETCH-LOGICAL-c597t-fce6c6be95b5cf501e90c7c0e367859a4b3fd8f2391115a5de6cfddc6c05b7a93 |
container_end_page | 70 |
container_issue | 1 |
container_start_page | 70 |
container_title | BMC bioinformatics |
container_volume | 22 |
creator | Crawford, Ryan D ; Snitkin, Evan S |
description | The quantity of genomic data is expanding at an increasing rate. Tools for phylogenetic analysis which scale to the quantity of available data are required. To address this need, we present cognac, a user-friendly software package to rapidly generate concatenated gene alignments for phylogenetic analysis.
We illustrate that cognac is able to rapidly identify phylogenetic marker genes using a data-driven approach and efficiently generate concatenated gene alignments for very large genomic datasets. To benchmark our tool, we generated core gene alignments for eight unique genera of bacteria, including a dataset of over 11,000 genomes from the genus Escherichia producing an alignment with 1353 genes, which was constructed in less than 17 h.
We demonstrate that cognac presents an efficient method for generating concatenated gene alignments for phylogenetic analysis. We have released cognac as an R package ( https://github.com/rdcrawford/cognac ) with customizable parameters for adaptation to diverse applications. |
doi_str_mv | 10.1186/s12859-021-03981-4 |
format | article |
pmid | 33588753 |
date | 2021-02-15 |
pub | BioMed Central Ltd |
cop | England |
rights | COPYRIGHT 2021 BioMed Central Ltd. 2021. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. The Author(s) 2021 |
orcidid | https://orcid.org/0000-0001-8409-278X |
linktopdf | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7885345/pdf/ |
fulltext | fulltext |
identifier | ISSN: 1471-2105 |
ispartof | BMC bioinformatics, 2021-02, Vol.22 (1), p.70-70, Article 70 |
issn | 1471-2105 1471-2105 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_d015fd73dff04d3dabe8e331920223f2 |
source | Publicly Available Content (ProQuest); PubMed Central |
subjects | Amino acids ; Analysis ; Bacteria ; Bacteria - classification ; Bacteria - genetics ; Bacterial genetics ; Biology ; Concatenated gene tree ; Core genome ; Databases, Genetic ; Datasets ; Family Characteristics ; Gene sequencing ; Genes ; Genome, Bacterial ; Genomes ; Multiple sequence alignment ; Phylogenetics ; Phylogeny ; Prokaryotes ; Software ; Trees ; Whole Genome Sequencing |
title | cognac: rapid generation of concatenated gene alignments for phylogenetic inference from large, bacterial whole genome sequencing datasets |