Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly

Bibliographic Details
Published in: Genome Biology 2021-01, Vol. 22 (1), p. 28, Article 28
Main Authors: Holley, Guillaume; Beyter, Doruk; Ingimundardottir, Helga; Møller, Peter L; Kristmundsdottir, Snædis; Eggertsson, Hannes P; Halldorsson, Bjarni V
Format: Article
Language: English
Description
Summary: A major challenge of long-read sequencing data is its high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate, and indel calling accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than a PacBio HiFi reads assembly.
ISSN: 1474-760X, 1474-7596
DOI: 10.1186/s13059-020-02244-4