DySC: software for greedy clustering of 16S rRNA reads
Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering a...
Published in: | Bioinformatics 2012-08, Vol.28 (16), p.2182-2183 |
---|---|
Main Authors: | Zejun Zheng, Stefan Kramer, Bertil Schmidt |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c452t-ae06a9ef37e9e37f5bc1c101759694066b172db88a27bac22261c7247e2c1473 |
---|---|
cites | cdi_FETCH-LOGICAL-c452t-ae06a9ef37e9e37f5bc1c101759694066b172db88a27bac22261c7247e2c1473 |
container_end_page | 2183 |
container_issue | 16 |
container_start_page | 2182 |
container_title | Bioinformatics |
container_volume | 28 |
creator | ZEJUN ZHENG ; KRAMER, Stefan ; SCHMIDT, Bertil |
description | Pyrosequencing technologies are frequently used for sequencing the 16S ribosomal RNA marker gene for profiling microbial communities. Clustering of the produced reads is an important but time-consuming task. We present Dynamic Seed-based Clustering (DySC), a new tool based on the greedy clustering approach that uses a dynamic seeding strategy. Evaluations based on the normalized mutual information (NMI) criterion show that DySC produces higher quality clusters than UCLUST and CD-HIT at a comparable runtime.
DySC, implemented in C, is available at http://code.google.com/p/dysc/ under GNU GPL license. |
doi_str_mv | 10.1093/bioinformatics/bts355 |
format | article |
publisher | Oxford: Oxford University Press |
pmid | 22730435 |
date | 2012-08-15 |
rights | 2015 INIST-CNRS |
peer_reviewed | yes |
oa | free_for_read |
fulltext | fulltext |
identifier | ISSN: 1367-4803 |
ispartof | Bioinformatics, 2012-08, Vol.28 (16), p.2182-2183 |
issn | 1367-4803 ; 1367-4811 ; 1460-2059 |
language | eng |
recordid | cdi_proquest_miscellaneous_1671554707 |
source | Oxford Open; PubMed Central |
subjects | Bioinformatics ; Biological and medical sciences ; Cadmium ; Cluster Analysis ; Clustering ; Computer programs ; Dynamics ; Fundamental and applied biological sciences. Psychology ; General aspects ; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects) ; Metagenome ; Microorganisms ; RNA, Ribosomal, 16S - genetics ; Run time (computers) ; Sequence Analysis, RNA - methods ; Software |
title | DySC: software for greedy clustering of 16S rRNA reads |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T00%3A27%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=DySC:%20software%20for%20greedy%20clustering%20of%2016S%20rRNA%20reads&rft.jtitle=Bioinformatics&rft.au=ZEJUN%20ZHENG&rft.date=2012-08-15&rft.volume=28&rft.issue=16&rft.spage=2182&rft.epage=2183&rft.pages=2182-2183&rft.issn=1367-4803&rft.eissn=1367-4811&rft_id=info:doi/10.1093/bioinformatics/bts355&rft_dat=%3Cproquest_cross%3E1032895780%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c452t-ae06a9ef37e9e37f5bc1c101759694066b172db88a27bac22261c7247e2c1473%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1032895780&rft_id=info:pmid/22730435&rfr_iscdi=true |