An alignment-free approach for eukaryotic ITS2 annotation and phylogenetic inference

Bibliographic Details
Published in: PloS one 2011-10, Vol.6 (10), p.e26638-e26638
Main Authors: Agüero-Chapin, Guillermin; Sánchez-Rodríguez, Aminael; Hidalgo-Yanes, Pedro I; Pérez-Castillo, Yunierkis; Molina-Ruiz, Reinaldo; Marchal, Kathleen; Vasconcelos, Vítor; Antunes, Agostinho
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c691t-363717d0d8b4aa961dea10ee301e1f78edb1e3c28661173f5f77a012c7a44053
cites
container_end_page e26638
container_issue 10
container_start_page e26638
container_title PloS one
container_volume 6
creator Agüero-Chapin, Guillermin
Sánchez-Rodríguez, Aminael
Hidalgo-Yanes, Pedro I
Pérez-Castillo, Yunierkis
Molina-Ruiz, Reinaldo
Marchal, Kathleen
Vasconcelos, Vítor
Antunes, Agostinho
description The ITS2 gene class shows high sequence divergence among its members, which has complicated its annotation and its use for reconstructing phylogenies at higher taxonomic levels (beyond species and genus). Several alignment strategies have been implemented to improve ITS2 annotation quality and its use for phylogenetic inference. Although alignment-based methods have been pushed to the limit of their complexity to tackle both issues, no alignment-free approach has successfully addressed both. By contrast, simple alignment-free classifiers, such as topological indices (TIs) that encode information about the sequence and structure of ITS2, may prove useful for gene prediction and for assessing the phylogenetic relationships of the ITS2 class in eukaryotes. Thus, we used the TI2BioP (Topological Indices to BioPolymers) methodology [1], [2], freely available at http://ti2biop.sourceforge.net/, to calculate two different classes of TIs. One class was derived from artificial 2D structures of ITS2 generated from DNA strings, and the other from the secondary structure inferred by RNA folding algorithms. Two alignment-free models based on Artificial Neural Networks were developed for ITS2 class prediction using the two classes of TIs described above. Both models performed similarly on the training and test sets, reaching overall classification values above 95%. Given the importance of the ITS2 region for fungal identification, a novel ITS2 genomic sequence was isolated from Petrakia sp. This sequence and the test set were used to comparatively evaluate conventional classification models based on multiple sequence alignments, such as Hidden Markov Model-based approaches, revealing the success of our models in identifying novel ITS2 members. The isolated sequence was also assessed using traditional and alignment-free techniques for phylogenetic inference to complement the taxonomy of the Petrakia sp. fungal isolate.
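To make the term "alignment-free" in the description concrete, the sketch below turns a nucleotide sequence into a fixed-length numeric vector without aligning it to anything. It uses normalized k-mer frequencies, which is a deliberately simpler stand-in and not the topological indices computed by TI2BioP; the function name `kmer_profile` and the choice of k = 2 are assumptions made for illustration only.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq: str, k: int = 2) -> list[float]:
    """Return normalized k-mer frequencies over ACGT, in lexicographic order.

    Illustrative stand-in for an alignment-free descriptor; this is NOT
    the TI2BioP topological-index calculation.
    """
    # Treat RNA input as DNA so both alphabets map to the same vector.
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Only k-mers made purely of A, C, G, T contribute to the total,
    # so windows containing ambiguity codes such as N are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# Every sequence maps to a vector of the same length (4**k entries),
# so vectors from divergent sequences can be fed to one classifier.
profile = kmer_profile("ACGTACGT")
```

Because the vector length depends only on k and not on sequence length, descriptors of this kind can be passed to a classifier such as a neural network directly, which is the general idea behind the alignment-free models the description refers to.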
doi_str_mv 10.1371/journal.pone.0026638
format article
contributor Badger, Jonathan H.
date 2011-10-26
identifier PMID: 22046320
oa free_for_read
publisher United States: Public Library of Science
rights 2011 Agüero-Chapin et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License: https://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
fulltext fulltext
identifier ISSN: 1932-6203
ispartof PloS one, 2011-10, Vol.6 (10), p.e26638-e26638
issn 1932-6203
1932-6203
language eng
recordid cdi_plos_journals_1309934676
source Open Access: PubMed Central; ProQuest - Publicly Available Content Database
subjects Accuracy
Acids
Algorithms
Alignment
Analysis
Annotations
Artificial neural networks
Ascomycota
Biology
Biopolymers
Chemistry
Classification
Cystic fibrosis
Deoxyribonucleic acid
Dictionaries
Divergence
DNA
DNA, Ribosomal Spacer
Eukaryota - genetics
Eukaryotes
Fungi
Genetic algorithms
Identification
Inference
Markov chains
Methods
Molecular Sequence Annotation
Neural networks
Neural Networks, Computer
Next-generation sequencing
Nucleic Acid Conformation
Phylogenetics
Phylogeny
Protein structure
Proteins
Ribonucleic acid
RNA
RNA Folding
Secondary structure
Simulation
Strings
Taxonomy
Test sets
title An alignment-free approach for eukaryotic ITS2 annotation and phylogenetic inference
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T02%3A50%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20alignment-free%20approach%20for%20eukaryotic%20ITS2%20annotation%20and%20phylogenetic%20inference&rft.jtitle=PloS%20one&rft.au=Ag%C3%BCero-Chapin,%20Guillermin&rft.date=2011-10-26&rft.volume=6&rft.issue=10&rft.spage=e26638&rft.epage=e26638&rft.pages=e26638-e26638&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0026638&rft_dat=%3Cgale_plos_%3EA476867033%3C/gale_plos_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c691t-363717d0d8b4aa961dea10ee301e1f78edb1e3c28661173f5f77a012c7a44053%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1309934676&rft_id=info:pmid/22046320&rft_galeid=A476867033&rfr_iscdi=true
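The link above is an OpenURL in key/encoded-value form, so the citation it carries can be read back out of the query string. The following is a minimal sketch using only Python's standard library; the helper name `openurl_fields` is hypothetical, and the example URL is a shortened copy of the record's link that keeps only a handful of its parameters.

```python
from urllib.parse import parse_qs, urlsplit


def openurl_fields(url: str) -> dict[str, str]:
    """Extract the rft.* citation fields from an OpenURL query string."""
    query = urlsplit(url).query
    # parse_qs percent-decodes values (UTF-8) and returns a list per key.
    parsed = parse_qs(query)
    # Keep the first value of each referent metadata key (prefix "rft.").
    return {key: values[0] for key, values in parsed.items()
            if key.startswith("rft.")}


# Shortened version of the record's link (most parameters omitted).
EXAMPLE = (
    "http://sfxeu10.hosted.exlibrisgroup.com/loughborough?"
    "url_ver=Z39.88-2004&rft.genre=article&rft.jtitle=PloS%20one"
    "&rft.au=Ag%C3%BCero-Chapin,%20Guillermin&rft.date=2011-10-26"
    "&rft.volume=6&rft.issue=10&rft.spage=e26638&rft.issn=1932-6203"
    "&rft_id=info:doi/10.1371/journal.pone.0026638"
)

fields = openurl_fields(EXAMPLE)
for key in sorted(fields):
    print(key, "=", fields[key])
```

Keys such as `rft_id` can repeat in the full link (DOI, PMID), which is why `parse_qs` returns lists; the sketch filters on the `rft.` prefix and therefore leaves those identifiers out.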