An alignment-free approach for eukaryotic ITS2 annotation and phylogenetic inference
The ITS2 gene class shows high sequence divergence among its members, which has complicated its annotation and its use for reconstructing phylogenies at higher taxonomic levels (beyond species and genus). Several alignment strategies have been implemented to improve the ITS2 annotation quality a...
Published in: | PloS one 2011-10, Vol.6 (10), p.e26638-e26638 |
---|---|
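Main Authors: | Agüero-Chapin, Guillermin; Sánchez-Rodríguez, Aminael; Hidalgo-Yanes, Pedro I; Pérez-Castillo, Yunierkis; Molina-Ruiz, Reinaldo; Marchal, Kathleen; Vasconcelos, Vítor; Antunes, Agostinho |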
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c691t-363717d0d8b4aa961dea10ee301e1f78edb1e3c28661173f5f77a012c7a44053 |
---|---|
cites | |
container_end_page | e26638 |
container_issue | 10 |
container_start_page | e26638 |
container_title | PloS one |
container_volume | 6 |
creator | Agüero-Chapin, Guillermin; Sánchez-Rodríguez, Aminael; Hidalgo-Yanes, Pedro I; Pérez-Castillo, Yunierkis; Molina-Ruiz, Reinaldo; Marchal, Kathleen; Vasconcelos, Vítor; Antunes, Agostinho |
description | The ITS2 gene class shows high sequence divergence among its members, which has complicated its annotation and its use for reconstructing phylogenies at higher taxonomic levels (beyond species and genus). Several alignment strategies have been implemented to improve ITS2 annotation quality and its use for phylogenetic inference. Although alignment-based methods have been pushed to the limit of their complexity to tackle both issues, no alignment-free approach has so far addressed both successfully. By contrast, simple alignment-free classifiers, such as topological indices (TIs) that encode information about the sequence and structure of ITS2, may prove useful for gene prediction and for assessing the phylogenetic relationships of the ITS2 class in eukaryotes. Thus, we used the TI2BioP (Topological Indices to BioPolymers) methodology [1], [2], freely available at http://ti2biop.sourceforge.net/, to calculate two different classes of TIs. One class was derived from artificial 2D structures of ITS2 generated from DNA strings, and the other from the secondary structure inferred by RNA folding algorithms. Two alignment-free models based on Artificial Neural Networks were developed for ITS2 class prediction using these two classes of TIs. Both models performed similarly on the training and test sets, with overall classification values above 95%. Given the importance of the ITS2 region for fungal identification, a novel ITS2 genomic sequence was isolated from Petrakia sp. This sequence and the test set were used to comparatively evaluate conventional classification models based on multiple sequence alignments, such as Hidden Markov Model-based approaches, revealing the success of our models in identifying novel ITS2 members. The isolated sequence was also assessed with traditional and alignment-free techniques for phylogenetic inference to complement the taxonomy of the Petrakia sp. fungal isolate. |
doi_str_mv | 10.1371/journal.pone.0026638 |
format | article |
fullrecord | PMID: 22046320; publisher: Public Library of Science (United States); contributor: Badger, Jonathan H.; publication date: 2011-10-26; peer reviewed; open access (free to read); rights: COPYRIGHT 2011 Public Library of Science; 2011 Agüero-Chapin et al., an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited; PubMed record: https://www.ncbi.nlm.nih.gov/pubmed/22046320; ProQuest record: https://www.proquest.com/docview/1309934676; Gale ID: A476867033; DOAJ ID: oai_doaj_org_article_d21d467703db4c5886500106cf499991 |
fulltext | fulltext |
identifier | ISSN: 1932-6203 |
ispartof | PloS one, 2011-10, Vol.6 (10), p.e26638-e26638 |
issn | 1932-6203 (ISSN and EISSN) |
language | eng |
recordid | cdi_plos_journals_1309934676 |
source | Open Access: PubMed Central; ProQuest - Publicly Available Content Database |
subjects | Accuracy; Acids; Algorithms; Alignment; Analysis; Annotations; Artificial neural networks; Ascomycota; Biology; Biopolymers; Chemistry; Classification; Cystic fibrosis; Deoxyribonucleic acid; Dictionaries; Divergence; DNA; DNA, Ribosomal Spacer; Eukaryota - genetics; Eukaryotes; Fungi; Genetic algorithms; Identification; Inference; Markov chains; Methods; Molecular Sequence Annotation; Neural networks; Neural Networks, Computer; Next-generation sequencing; Nucleic Acid Conformation; Phylogenetics; Phylogeny; Protein structure; Proteins; Ribonucleic acid; RNA; RNA Folding; Secondary structure; Simulation; Strings; Taxonomy; Test sets |
title | An alignment-free approach for eukaryotic ITS2 annotation and phylogenetic inference |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T02%3A50%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_plos_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=An%20alignment-free%20approach%20for%20eukaryotic%20ITS2%20annotation%20and%20phylogenetic%20inference&rft.jtitle=PloS%20one&rft.au=Ag%C3%BCero-Chapin,%20Guillermin&rft.date=2011-10-26&rft.volume=6&rft.issue=10&rft.spage=e26638&rft.epage=e26638&rft.pages=e26638-e26638&rft.issn=1932-6203&rft.eissn=1932-6203&rft_id=info:doi/10.1371/journal.pone.0026638&rft_dat=%3Cgale_plos_%3EA476867033%3C/gale_plos_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c691t-363717d0d8b4aa961dea10ee301e1f78edb1e3c28661173f5f77a012c7a44053%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1309934676&rft_id=info:pmid/22046320&rft_galeid=A476867033&rfr_iscdi=true |
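
The description in this record refers to alignment-free descriptors (topological indices) computed from an artificial 2D structure built from a DNA string. The record does not give the TI2BioP formulas, so the following is only a rough sketch of the general idea, under assumptions that are not taken from the article: the direction assigned to each base (`STEPS`), the function names, and the choice of spectral moments (traces of powers of the adjacency matrix) as the index are all illustrative.

```python
# Illustrative sketch only: NOT the TI2BioP implementation.
# Assumption: each base moves one step on a 2D lattice in a fixed direction.
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def lattice_graph(seq):
    """Walk a DNA string on a 2D lattice and return the adjacency matrix
    of the graph whose nodes are the distinct lattice points visited."""
    x, y = 0, 0
    index = {(0, 0): 0}  # lattice point -> node number
    edges = set()
    for base in seq.upper():
        dx, dy = STEPS[base]  # raises KeyError for symbols other than A, C, G, T
        nxt = (x + dx, y + dy)
        i = index[(x, y)]
        j = index.setdefault(nxt, len(index))
        edges.add((min(i, j), max(i, j)))
        x, y = nxt
    n = len(index)
    adj = [[0] * n for _ in range(n)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    return adj


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_moments(adj, max_order=4):
    """Return [tr(A^1), ..., tr(A^max_order)] for adjacency matrix A."""
    moments = []
    power = adj
    for _ in range(max_order):
        moments.append(sum(power[i][i] for i in range(len(adj))))
        power = matmul(power, adj)
    return moments


if __name__ == "__main__":
    # A short made-up sequence, used only to show the call pattern.
    print(spectral_moments(lattice_graph("ACGT")))
```

Each sequence is reduced to a fixed-length numeric vector without aligning it to any other sequence, which is what allows such descriptors to be fed to a classifier such as a neural network. The descriptors actually used in the article may differ from this sketch.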