Quality-score guided error correction for short-read sequencing data using CUDA

Bibliographic Details
Published in: Procedia Computer Science 2010-05, Vol.1 (1), p.1129-1138
Main Authors: Shi, Haixiang, Schmidt, Bertil, Liu, Weiguo, Müller-Wittig, Wolfgang
Format: Article
Language: English
Description
Summary: Recently introduced sequencing technologies can produce massive amounts of short-read data. Detecting and correcting sequencing errors in this data is an important but time-consuming pre-processing step for de-novo genome assembly. In this paper, we demonstrate how the quality-score value associated with each base-call can be integrated into a CUDA-based parallel error correction algorithm. We show that quality-score guided error correction can improve the assembly accuracy of several datasets from the NCBI SRA (Short-Read Archive) in terms of N50-values as well as runtime. We further propose a number of improvements to our previously published CUDA-EC algorithm that improve its runtime by a factor of up to 1.88.
ISSN: 1877-0509
DOI: 10.1016/j.procs.2010.04.125