QUARTIC: QUick pArallel algoRithms for high-Throughput sequencIng data proCessing [version 2; peer review: 2 approved]

Bibliographic Details
Published in: F1000 research 2020, Vol. 9, p. 240
Main Authors: Jarlier, Frédéric; Joly, Nicolas; Fedy, Nicolas; Magalhaes, Thomas; Sirotti, Leonor; Paganiban, Paul; Martin, Firmin; McManus, Michael; Hupé, Philippe
Format: Article
Language: English
Description
Summary: Life science has entered the so-called 'big data era', in which biologists, clinicians and bioinformaticians are overwhelmed with high-throughput sequencing data. While these data offer new insights into genome structure, their ever-growing size raises major challenges for their use in daily clinical practice for care and diagnosis. We therefore implemented software that reduces the time to delivery for the alignment and sorting of high-throughput sequencing data. Our solution is implemented using the Message Passing Interface and is intended for high-performance computing architectures. The software scales linearly with the size of the data and reproduces the results of the traditional tools exactly. For example, a 300X whole genome can be aligned and sorted in less than 9 hours with 128 cores. The software offers significant speed-up through multi-core and multi-node parallelization.
ISSN: 2046-1402
DOI: 10.12688/f1000research.22954.2