
LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons

Transposable elements are abundant in eukaryotic genomes and it is believed that they have a significant impact on the evolution of gene and chromosome structure. While there are several completed eukaryotic genome projects, there are only a few high-quality genome-wide annotations of transposable ele...

Bibliographic Details
Published in: BMC bioinformatics 2008-01, Vol.9 (1), p.18-18, Article 18
Main Authors: Ellinghaus, David; Kurtz, Stefan; Willhoeft, Ute
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583
cites cdi_FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583
container_end_page 18
container_issue 1
container_start_page 18
container_title BMC bioinformatics
container_volume 9
creator Ellinghaus, David
Kurtz, Stefan
Willhoeft, Ute
description Transposable elements are abundant in eukaryotic genomes and it is believed that they have a significant impact on the evolution of gene and chromosome structure. While there are several completed eukaryotic genome projects, there are only a few high-quality genome-wide annotations of transposable elements. Therefore, there is a considerable demand for computational identification of transposable elements. LTR retrotransposons, an important subclass of transposable elements, are well suited for computational identification, as they contain long terminal repeats (LTRs). We have developed a software tool, LTRharvest, for the de novo detection of full-length LTR retrotransposons in large sequence sets. LTRharvest efficiently delivers high-quality annotations based on known LTR transposon features like length, distance, and sequence motifs. A quality validation of LTRharvest against a gold standard annotation for Saccharomyces cerevisiae and Drosophila melanogaster shows a sensitivity of up to 90% and 97% and a specificity of 100% and 72%, respectively. This is comparable to or slightly better than the annotations produced by previous software tools. The main advantages of LTRharvest over previous tools are (a) its ability to efficiently handle large datasets from finished or unfinished genome projects, (b) its flexibility in incorporating known sequence features into the prediction, and (c) its availability as open source software. LTRharvest is an efficient software tool delivering high-quality annotation of LTR retrotransposons. It can, for example, process the largest human chromosome in approximately 8 minutes on a Linux PC with 4 GB of memory. Its flexibility and small space and run-time requirements make LTRharvest a very competitive candidate for future LTR retrotransposon annotation projects. Moreover, the structured design and implementation and the availability as open source provide an excellent base for incorporating novel concepts to further improve prediction of LTR retrotransposons.
doi_str_mv 10.1186/1471-2105-9-18
format article
fullrecord <record><control><sourceid>gale_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_9c800862c79f4539bf2586a624e3fcf9</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><galeid>A175200803</galeid><doaj_id>oai_doaj_org_article_9c800862c79f4539bf2586a624e3fcf9</doaj_id><sourcerecordid>A175200803</sourcerecordid><originalsourceid>FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583</originalsourceid><addsrcrecordid>eNqFkuFr1DAYxosobk6_-lEKgjCwM0mTJvkiHGPqwYEw5zchpOmbW0bbnEnunP-9qXfMFSfSD2_z5skvyfOkKF5idIaxaN5hynFFMGKVrLB4VBzfNR7f-z8qnsV4gxDmArGnxREWWFKG-XHxbXV1ea3DDmJ6W-qxBGudcTCmPOhK28Ota3soo7fphw5QWh_KDsrR73yuCUxyfiy9LTOnDJCCT0GPceOjH-Pz4onVfYQXh3pSfP1wcXX-qVp9_rg8X6yqlmOUKsygpsRY0piGtlhIzljDG2twLoZwJnmTR1Z0mAngHCSlVnbQSYmJZaI-KZZ7buf1jdoEN-jwU3nt1O-GD2ulQ3KmByWNQEg0xHBpKatlawkTjW4IhTpvITPr_Z612bYDdCZbEXQ_g85nRnet1n6nCGF1tjQDFntA6_w_APMZ4wc1JaWmpJRUeLrQm8Mhgv--zdmowUUDfa9H8NuoOKoJbQT9r5CgmkrJWBa-3gvXOrvgRjvlZCaxWmDOSDYF1Vl19oAqfx0MzvgRrMv92YLT2YKsSXCb1nobo1p-uXwQboKPMYC98wQjNT3mv114dT-KP_LD661_AVMw7Rk</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>20349955</pqid></control><display><type>article</type><title>LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons</title><source>PubMed (Medline)</source><creator>Ellinghaus, David ; Kurtz, Stefan ; Willhoeft, Ute</creator><creatorcontrib>Ellinghaus, David ; Kurtz, Stefan ; Willhoeft, Ute</creatorcontrib><description>Transposable elements are abundant in eukaryotic genomes and it is believed that they have a significant impact on the evolution of gene and chromosome structure. While there are several completed eukaryotic genome projects, there are only few high quality genome wide annotations of transposable elements. Therefore, there is a considerable demand for computational identification of transposable elements. 
LTR retrotransposons, an important subclass of transposable elements, are well suited for computational identification, as they contain long terminal repeats (LTRs). We have developed a software tool LTRharvest for the de novo detection of full length LTR retrotransposons in large sequence sets. LTRharvest efficiently delivers high quality annotations based on known LTR transposon features like length, distance, and sequence motifs. A quality validation of LTRharvest against a gold standard annotation for Saccharomyces cerevisiae and Drosophila melanogaster shows a sensitivity of up to 90% and 97% and specificity of 100% and 72%, respectively. This is comparable or slightly better than annotations for previous software tools. The main advantage of LTRharvest over previous tools is (a) its ability to efficiently handle large datasets from finished or unfinished genome projects, (b) its flexibility in incorporating known sequence features into the prediction, and (c) its availability as open source software. LTRharvest is an efficient software tool delivering high quality annotation of LTR retrotransposons. It can, for example, process the largest human chromosome in approx. 8 minutes on a Linux PC with 4 GB of memory. Its flexibility and small space and run-time requirements make LTRharvest a very competitive candidate for future LTR retrotransposon annotation projects. 
Moreover, the structured design and implementation and the availability as open source provides an excellent base for incorporating novel concepts to further improve prediction of LTR retrotransposons.</description><identifier>ISSN: 1471-2105</identifier><identifier>EISSN: 1471-2105</identifier><identifier>DOI: 10.1186/1471-2105-9-18</identifier><identifier>PMID: 18194517</identifier><language>eng</language><publisher>England: BioMed Central Ltd</publisher><subject>Algorithms ; Base Sequence ; Chromosome Mapping - methods ; Drosophila melanogaster ; Identification and classification ; Molecular Sequence Data ; Programming Languages ; Retroelements - genetics ; Retrotransposons ; Saccharomyces ; Sequence Alignment - methods ; Sequence Analysis, DNA - methods ; Software</subject><ispartof>BMC bioinformatics, 2008-01, Vol.9 (1), p.18-18, Article 18</ispartof><rights>COPYRIGHT 2008 BioMed Central Ltd.</rights><rights>Copyright © 2008 Ellinghaus et al; licensee BioMed Central Ltd. 2008 Ellinghaus et al; licensee BioMed Central Ltd.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583</citedby><cites>FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2253517/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC2253517/$$EHTML$$P50$$Gpubmedcentral$$Hfree_for_read</linktohtml><link.rule.ids>230,314,723,776,780,881,27903,27904,53769,53771</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/18194517$$D View this record in 
MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Ellinghaus, David</creatorcontrib><creatorcontrib>Kurtz, Stefan</creatorcontrib><creatorcontrib>Willhoeft, Ute</creatorcontrib><title>LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons</title><title>BMC bioinformatics</title><addtitle>BMC Bioinformatics</addtitle><description>Transposable elements are abundant in eukaryotic genomes and it is believed that they have a significant impact on the evolution of gene and chromosome structure. While there are several completed eukaryotic genome projects, there are only few high quality genome wide annotations of transposable elements. Therefore, there is a considerable demand for computational identification of transposable elements. LTR retrotransposons, an important subclass of transposable elements, are well suited for computational identification, as they contain long terminal repeats (LTRs). We have developed a software tool LTRharvest for the de novo detection of full length LTR retrotransposons in large sequence sets. LTRharvest efficiently delivers high quality annotations based on known LTR transposon features like length, distance, and sequence motifs. A quality validation of LTRharvest against a gold standard annotation for Saccharomyces cerevisae and Drosophila melanogaster shows a sensitivity of up to 90% and 97% and specificity of 100% and 72%, respectively. This is comparable or slightly better than annotations for previous software tools. The main advantage of LTRharvest over previous tools is (a) its ability to efficiently handle large datasets from finished or unfinished genome projects, (b) its flexibility in incorporating known sequence features into the prediction, and (c) its availability as an open source software. LTRharvest is an efficient software tool delivering high quality annotation of LTR retrotransposons. It can, for example, process the largest human chromosome in approx. 
8 minutes on a Linux PC with 4 GB of memory. Its flexibility and small space and run-time requirements makes LTRharvest a very competitive candidate for future LTR retrotransposon annotation projects. Moreover, the structured design and implementation and the availability as open source provides an excellent base for incorporating novel concepts to further improve prediction of LTR retrotransposons.</description><subject>Algorithms</subject><subject>Base Sequence</subject><subject>Chromosome Mapping - methods</subject><subject>Drosophila melanogaster</subject><subject>Identification and classification</subject><subject>Molecular Sequence Data</subject><subject>Programming Languages</subject><subject>Retroelements - genetics</subject><subject>Retrotransposons</subject><subject>Saccharomyces</subject><subject>Sequence Alignment - methods</subject><subject>Sequence Analysis, DNA - methods</subject><subject>Software</subject><issn>1471-2105</issn><issn>1471-2105</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2008</creationdate><recordtype>article</recordtype><sourceid>DOA</sourceid><recordid>eNqFkuFr1DAYxosobk6_-lEKgjCwM0mTJvkiHGPqwYEw5zchpOmbW0bbnEnunP-9qXfMFSfSD2_z5skvyfOkKF5idIaxaN5hynFFMGKVrLB4VBzfNR7f-z8qnsV4gxDmArGnxREWWFKG-XHxbXV1ea3DDmJ6W-qxBGudcTCmPOhK28Ota3soo7fphw5QWh_KDsrR73yuCUxyfiy9LTOnDJCCT0GPceOjH-Pz4onVfYQXh3pSfP1wcXX-qVp9_rg8X6yqlmOUKsygpsRY0piGtlhIzljDG2twLoZwJnmTR1Z0mAngHCSlVnbQSYmJZaI-KZZ7buf1jdoEN-jwU3nt1O-GD2ulQ3KmByWNQEg0xHBpKatlawkTjW4IhTpvITPr_Z612bYDdCZbEXQ_g85nRnet1n6nCGF1tjQDFntA6_w_APMZ4wc1JaWmpJRUeLrQm8Mhgv--zdmowUUDfa9H8NuoOKoJbQT9r5CgmkrJWBa-3gvXOrvgRjvlZCaxWmDOSDYF1Vl19oAqfx0MzvgRrMv92YLT2YKsSXCb1nobo1p-uXwQboKPMYC98wQjNT3mv114dT-KP_LD661_AVMw7Rk</recordid><startdate>20080114</startdate><enddate>20080114</enddate><creator>Ellinghaus, David</creator><creator>Kurtz, Stefan</creator><creator>Willhoeft, Ute</creator><general>BioMed Central Ltd</general><general>BioMed 
Central</general><general>BMC</general><scope>CGR</scope><scope>CUY</scope><scope>CVF</scope><scope>ECM</scope><scope>EIF</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>ISR</scope><scope>7QO</scope><scope>7SS</scope><scope>8FD</scope><scope>FR3</scope><scope>M7N</scope><scope>P64</scope><scope>RC3</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20080114</creationdate><title>LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons</title><author>Ellinghaus, David ; Kurtz, Stefan ; Willhoeft, Ute</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2008</creationdate><topic>Algorithms</topic><topic>Base Sequence</topic><topic>Chromosome Mapping - methods</topic><topic>Drosophila melanogaster</topic><topic>Identification and classification</topic><topic>Molecular Sequence Data</topic><topic>Programming Languages</topic><topic>Retroelements - genetics</topic><topic>Retrotransposons</topic><topic>Saccharomyces</topic><topic>Sequence Alignment - methods</topic><topic>Sequence Analysis, DNA - methods</topic><topic>Software</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ellinghaus, David</creatorcontrib><creatorcontrib>Kurtz, Stefan</creatorcontrib><creatorcontrib>Willhoeft, Ute</creatorcontrib><collection>Medline</collection><collection>MEDLINE</collection><collection>MEDLINE (Ovid)</collection><collection>MEDLINE</collection><collection>MEDLINE</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>Gale In Context: Science</collection><collection>Biotechnology Research Abstracts</collection><collection>Entomology Abstracts (Full archive)</collection><collection>Technology Research 
Database</collection><collection>Engineering Research Database</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>Genetics Abstracts</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>BMC bioinformatics</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ellinghaus, David</au><au>Kurtz, Stefan</au><au>Willhoeft, Ute</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons</atitle><jtitle>BMC bioinformatics</jtitle><addtitle>BMC Bioinformatics</addtitle><date>2008-01-14</date><risdate>2008</risdate><volume>9</volume><issue>1</issue><spage>18</spage><epage>18</epage><pages>18-18</pages><artnum>18</artnum><issn>1471-2105</issn><eissn>1471-2105</eissn><abstract>Transposable elements are abundant in eukaryotic genomes and it is believed that they have a significant impact on the evolution of gene and chromosome structure. While there are several completed eukaryotic genome projects, there are only few high quality genome wide annotations of transposable elements. Therefore, there is a considerable demand for computational identification of transposable elements. LTR retrotransposons, an important subclass of transposable elements, are well suited for computational identification, as they contain long terminal repeats (LTRs). We have developed a software tool LTRharvest for the de novo detection of full length LTR retrotransposons in large sequence sets. LTRharvest efficiently delivers high quality annotations based on known LTR transposon features like length, distance, and sequence motifs. 
A quality validation of LTRharvest against a gold standard annotation for Saccharomyces cerevisae and Drosophila melanogaster shows a sensitivity of up to 90% and 97% and specificity of 100% and 72%, respectively. This is comparable or slightly better than annotations for previous software tools. The main advantage of LTRharvest over previous tools is (a) its ability to efficiently handle large datasets from finished or unfinished genome projects, (b) its flexibility in incorporating known sequence features into the prediction, and (c) its availability as an open source software. LTRharvest is an efficient software tool delivering high quality annotation of LTR retrotransposons. It can, for example, process the largest human chromosome in approx. 8 minutes on a Linux PC with 4 GB of memory. Its flexibility and small space and run-time requirements makes LTRharvest a very competitive candidate for future LTR retrotransposon annotation projects. Moreover, the structured design and implementation and the availability as open source provides an excellent base for incorporating novel concepts to further improve prediction of LTR retrotransposons.</abstract><cop>England</cop><pub>BioMed Central Ltd</pub><pmid>18194517</pmid><doi>10.1186/1471-2105-9-18</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1471-2105
ispartof BMC bioinformatics, 2008-01, Vol.9 (1), p.18-18, Article 18
issn 1471-2105
1471-2105
language eng
recordid cdi_doaj_primary_oai_doaj_org_article_9c800862c79f4539bf2586a624e3fcf9
source PubMed (Medline)
subjects Algorithms
Base Sequence
Chromosome Mapping - methods
Drosophila melanogaster
Identification and classification
Molecular Sequence Data
Programming Languages
Retroelements - genetics
Retrotransposons
Saccharomyces
Sequence Alignment - methods
Sequence Analysis, DNA - methods
Software
title LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-25T19%3A19%3A14IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-gale_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=LTRharvest,%20an%20efficient%20and%20flexible%20software%20for%20de%20novo%20detection%20of%20LTR%20retrotransposons&rft.jtitle=BMC%20bioinformatics&rft.au=Ellinghaus,%20David&rft.date=2008-01-14&rft.volume=9&rft.issue=1&rft.spage=18&rft.epage=18&rft.pages=18-18&rft.artnum=18&rft.issn=1471-2105&rft.eissn=1471-2105&rft_id=info:doi/10.1186/1471-2105-9-18&rft_dat=%3Cgale_doaj_%3EA175200803%3C/gale_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-b710t-15e342cf26c64b189755676fc1567c2759766fcf8d158e77e944f9ded9912f583%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=20349955&rft_id=info:pmid/18194517&rft_galeid=A175200803&rfr_iscdi=true