Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly
A major challenge to long read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average with a median error rate as low...
Published in: | Genome Biology 2021-01, Vol.22 (1), p.28-28, Article 28 |
---|---|
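Main Authors: | Holley, Guillaume; Beyter, Doruk; Ingimundardottir, Helga; Møller, Peter L; Kristmundsdottir, Snædis; Eggertsson, Hannes P; Halldorsson, Bjarni V |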
Format: | Article |
Language: | English |
Subjects: | Error correction & detection; Genomes; Genomics; Haplotypes; Method |
cited_by | cdi_FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423 |
---|---|
cites | cdi_FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423 |
container_end_page | 28 |
container_issue | 1 |
container_start_page | 28 |
container_title | Genome Biology |
container_volume | 22 |
creator | Holley, Guillaume; Beyter, Doruk; Ingimundardottir, Helga; Møller, Peter L; Kristmundsdottir, Snædis; Eggertsson, Hannes P; Halldorsson, Bjarni V |
description | A major challenge to long read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average with a median error rate as low as 0.22%. SNP calls in Ratatosk corrected reads are nearly 99% accurate and indel calling accuracy is increased by up to 37%. An assembly of Ratatosk corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than a PacBio HiFi reads assembly. |
doi_str_mv | 10.1186/s13059-020-02244-4 |
format | article |
fullrecord | <record><control><sourceid>proquest_doaj_</sourceid><recordid>TN_cdi_doaj_primary_oai_doaj_org_article_ae95bd781a544d83a00154b7906d8e9c</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><doaj_id>oai_doaj_org_article_ae95bd781a544d83a00154b7906d8e9c</doaj_id><sourcerecordid>2476561491</sourcerecordid><originalsourceid>FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423</originalsourceid><addsrcrecordid>eNpdkstrFTEUxgdR7EP_ARcScONmajI5eYwLoRT7gIIgXbiRcPKY27nmTmoyU7j_fdPeWloXIR_Jd37knHxN84HRI8a0_FIYp6JvaUfr6gBaeNXsM1DQKkl_vX6m95qDUtaUsh46-bbZ4xyqVHy_-f0TZ5xT-fOVXG9tHj0JOadMXMo5uHlME0kDiWlakRzQFxImtDEUgs4tGedAbjGPOM3EYYxjteHkCZYSNjZu3zVvBowlvH_cD5ur0-9XJ-ft5Y-zi5Pjy9ZJzubW9iAVVqmlDjYIwaGHgXPFtbReDB5cJxgqCuiUZIxBryxw6gQdOHT8sLnYYX3CtbnJ4wbz1iQczcNByiuDeR5dDAZDL6xXmqEA8JpjHYoAq3oqvQ69q6xvO9bNYjfBuzDNGeML6Mubabw2q3RrlOo7SnUFfH4E5PR3CWU2m7G4ECNOIS3FdKCkkLUFVq2f_rOu05KnOql7l9aMKqGqq9u5XE6l5DA8PYZRcx8EswuCqUEwD0EwUIs-Pm_jqeTfz_M7JMWuEA</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2478810757</pqid></control><display><type>article</type><title>Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly</title><source>Publicly Available Content (ProQuest)</source><source>PubMed Central</source><creator>Holley, Guillaume ; Beyter, Doruk ; Ingimundardottir, Helga ; Møller, Peter L ; Kristmundsdottir, Snædis ; Eggertsson, Hannes P ; Halldorsson, Bjarni V</creator><creatorcontrib>Holley, Guillaume ; Beyter, Doruk ; Ingimundardottir, Helga ; Møller, Peter L ; Kristmundsdottir, Snædis ; Eggertsson, Hannes P ; Halldorsson, Bjarni V</creatorcontrib><description>A major challenge to long read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short read data. 
We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average with a median error rate as low as 0.22 %. SNP calls in Ratatosk corrected reads are nearly 99 % accurate and indel calls accuracy is increased by up to 37 %. An assembly of Ratatosk corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and less misassemblies than a PacBio HiFi reads assembly.</description><identifier>ISSN: 1474-760X</identifier><identifier>ISSN: 1474-7596</identifier><identifier>EISSN: 1474-760X</identifier><identifier>DOI: 10.1186/s13059-020-02244-4</identifier><identifier>PMID: 33419473</identifier><language>eng</language><publisher>England: BioMed Central</publisher><subject>Error correction & detection ; Genomes ; Genomics ; Haplotypes ; Method</subject><ispartof>Genome Biology, 2021-01, Vol.22 (1), p.28-28, Article 28</ispartof><rights>2021. This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.</rights><rights>The Author(s) 
2021</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423</citedby><cites>FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.ncbi.nlm.nih.gov/pmc/articles/PMC7792008/pdf/$$EPDF$$P50$$Gpubmedcentral$$Hfree_for_read</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2478810757?pq-origsite=primo$$EHTML$$P50$$Gproquest$$Hfree_for_read</linktohtml><link.rule.ids>230,314,727,780,784,885,25752,27923,27924,37011,37012,44589,53790,53792</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33419473$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Holley, Guillaume</creatorcontrib><creatorcontrib>Beyter, Doruk</creatorcontrib><creatorcontrib>Ingimundardottir, Helga</creatorcontrib><creatorcontrib>Møller, Peter L</creatorcontrib><creatorcontrib>Kristmundsdottir, Snædis</creatorcontrib><creatorcontrib>Eggertsson, Hannes P</creatorcontrib><creatorcontrib>Halldorsson, Bjarni V</creatorcontrib><title>Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly</title><title>Genome Biology</title><addtitle>Genome Biol</addtitle><description>A major challenge to long read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average with a median error rate as low as 0.22 %. SNP calls in Ratatosk corrected reads are nearly 99 % accurate and indel calls accuracy is increased by up to 37 %. 
An assembly of Ratatosk corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and less misassemblies than a PacBio HiFi reads assembly.</description><subject>Error correction & detection</subject><subject>Genomes</subject><subject>Genomics</subject><subject>Haplotypes</subject><subject>Method</subject><issn>1474-760X</issn><issn>1474-7596</issn><issn>1474-760X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><sourceid>PIMPY</sourceid><sourceid>DOA</sourceid><recordid>eNpdkstrFTEUxgdR7EP_ARcScONmajI5eYwLoRT7gIIgXbiRcPKY27nmTmoyU7j_fdPeWloXIR_Jd37knHxN84HRI8a0_FIYp6JvaUfr6gBaeNXsM1DQKkl_vX6m95qDUtaUsh46-bbZ4xyqVHy_-f0TZ5xT-fOVXG9tHj0JOadMXMo5uHlME0kDiWlakRzQFxImtDEUgs4tGedAbjGPOM3EYYxjteHkCZYSNjZu3zVvBowlvH_cD5ur0-9XJ-ft5Y-zi5Pjy9ZJzubW9iAVVqmlDjYIwaGHgXPFtbReDB5cJxgqCuiUZIxBryxw6gQdOHT8sLnYYX3CtbnJ4wbz1iQczcNByiuDeR5dDAZDL6xXmqEA8JpjHYoAq3oqvQ69q6xvO9bNYjfBuzDNGeML6Mubabw2q3RrlOo7SnUFfH4E5PR3CWU2m7G4ECNOIS3FdKCkkLUFVq2f_rOu05KnOql7l9aMKqGqq9u5XE6l5DA8PYZRcx8EswuCqUEwD0EwUIs-Pm_jqeTfz_M7JMWuEA</recordid><startdate>20210108</startdate><enddate>20210108</enddate><creator>Holley, Guillaume</creator><creator>Beyter, Doruk</creator><creator>Ingimundardottir, Helga</creator><creator>Møller, Peter L</creator><creator>Kristmundsdottir, Snædis</creator><creator>Eggertsson, Hannes P</creator><creator>Halldorsson, Bjarni V</creator><general>BioMed 
Central</general><general>BMC</general><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7X7</scope><scope>7XB</scope><scope>88E</scope><scope>8FE</scope><scope>8FH</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>K9.</scope><scope>LK8</scope><scope>M0S</scope><scope>M1P</scope><scope>M7P</scope><scope>PIMPY</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7X8</scope><scope>5PM</scope><scope>DOA</scope></search><sort><creationdate>20210108</creationdate><title>Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly</title><author>Holley, Guillaume ; Beyter, Doruk ; Ingimundardottir, Helga ; Møller, Peter L ; Kristmundsdottir, Snædis ; Eggertsson, Hannes P ; Halldorsson, Bjarni V</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Error correction & detection</topic><topic>Genomes</topic><topic>Genomics</topic><topic>Haplotypes</topic><topic>Method</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Holley, Guillaume</creatorcontrib><creatorcontrib>Beyter, Doruk</creatorcontrib><creatorcontrib>Ingimundardottir, Helga</creatorcontrib><creatorcontrib>Møller, Peter L</creatorcontrib><creatorcontrib>Kristmundsdottir, Snædis</creatorcontrib><creatorcontrib>Eggertsson, Hannes P</creatorcontrib><creatorcontrib>Halldorsson, Bjarni V</creatorcontrib><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest 
Central (Corporate)</collection><collection>Health & Medical Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ProQuest Central (Alumni Edition)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Health & Medical Complete (Alumni)</collection><collection>ProQuest Biological Science Collection</collection><collection>Health & Medical Collection (Alumni Edition)</collection><collection>PML(ProQuest Medical Library)</collection><collection>Biological Science Database</collection><collection>Publicly Available Content (ProQuest)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>MEDLINE - Academic</collection><collection>PubMed Central (Full Participant titles)</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>Genome Biology</jtitle></facets><delivery><delcategory>Remote Search 
Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Holley, Guillaume</au><au>Beyter, Doruk</au><au>Ingimundardottir, Helga</au><au>Møller, Peter L</au><au>Kristmundsdottir, Snædis</au><au>Eggertsson, Hannes P</au><au>Halldorsson, Bjarni V</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly</atitle><jtitle>Genome Biology</jtitle><addtitle>Genome Biol</addtitle><date>2021-01-08</date><risdate>2021</risdate><volume>22</volume><issue>1</issue><spage>28</spage><epage>28</epage><pages>28-28</pages><artnum>28</artnum><issn>1474-760X</issn><issn>1474-7596</issn><eissn>1474-760X</eissn><abstract>A major challenge to long read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average with a median error rate as low as 0.22 %. SNP calls in Ratatosk corrected reads are nearly 99 % accurate and indel calls accuracy is increased by up to 37 %. An assembly of Ratatosk corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and less misassemblies than a PacBio HiFi reads assembly.</abstract><cop>England</cop><pub>BioMed Central</pub><pmid>33419473</pmid><doi>10.1186/s13059-020-02244-4</doi><tpages>1</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1474-760X |
ispartof | Genome Biology, 2021-01, Vol.22 (1), p.28-28, Article 28 |
issn | 1474-760X; 1474-7596 |
language | eng |
recordid | cdi_doaj_primary_oai_doaj_org_article_ae95bd781a544d83a00154b7906d8e9c |
source | Publicly Available Content (ProQuest); PubMed Central |
subjects | Error correction & detection; Genomes; Genomics; Haplotypes; Method |
title | Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-11T01%3A45%3A47IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_doaj_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Ratatosk:%20hybrid%20error%20correction%20of%20long%20reads%20enables%20accurate%20variant%20calling%20and%20assembly&rft.jtitle=Genome%20Biology&rft.au=Holley,%20Guillaume&rft.date=2021-01-08&rft.volume=22&rft.issue=1&rft.spage=28&rft.epage=28&rft.pages=28-28&rft.artnum=28&rft.issn=1474-760X&rft.eissn=1474-760X&rft_id=info:doi/10.1186/s13059-020-02244-4&rft_dat=%3Cproquest_doaj_%3E2476561491%3C/proquest_doaj_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c631t-b9467a631868ebe553494f337386bd5fd4c251a704ac76111497b430c50f3423%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2478810757&rft_id=info:pmid/33419473&rfr_iscdi=true |