SeqMatcher: efficient genome sequence matching with AVX-512 extensions

Bibliographic Details
Published in: The Journal of supercomputing 2025-01, Vol.81 (1), Article 355
Main Authors: Espinosa, Elena, Quislant, Ricardo, Larrosa, Rafael, Plata, Oscar
Format: Article
Language: English
Description
Summary: The recent emergence of long-read sequencing technologies has enabled substantial improvements in accuracy and reduced computational costs. Nonetheless, pairwise sequence alignment remains a time-consuming step in common bioinformatics pipelines, becoming a bottleneck in de novo whole-genome assembly. Speeding up this step requires heuristics and the development of memory-frugal and efficient implementations. A promising candidate for all of the above is Myers’ algorithm. However, the state-of-the-art implementations face scalability challenges when dealing with longer reads and large datasets. To address these challenges, we propose SeqMatcher, a fast and memory-frugal genomics sequence aligner. By leveraging the long registers of AVX-512, SeqMatcher reduces data movement and memory footprint. In a comprehensive performance evaluation, SeqMatcher achieves speedups of up to 12.32x for the unbanded version and 26.70x for the banded version compared to the non-vectorized implementation, along with energy footprint reductions of up to 2.59x. It also outperforms state-of-the-art implementations by factors of up to 29.21x, 17.56x, 13.47x, 9.12x, and 8.81x compared to Edlib, WFA2-lib, SeqAn, BSAlign, and QuickEd, respectively, while reducing energy consumption by up to 6.78x.
ISSN: 0920-8542, 1573-0484
DOI: 10.1007/s11227-024-06789-0
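
Illustrative note: the summary names Myers’ algorithm as the basis of SeqMatcher. The sketch below is not the authors' code and uses no AVX-512. It is a minimal scalar version of the bit-parallel edit-distance recurrence in the formulation usually attributed to Hyyrö, where a Python integer stands in for the wide register that holds one column of the dynamic-programming matrix. The function name myers_edit_distance and all variable names are hypothetical, chosen only for this example.

```python
def myers_edit_distance(pattern: str, text: str) -> int:
    """Global (Levenshtein) edit distance via a bit-parallel recurrence.

    Assumption: an unbounded Python int plays the role of one long
    register, so the whole pattern fits in a single "word".
    """
    m = len(pattern)
    if m == 0:
        return len(text)

    # peq[c] has bit i set when pattern[i] == c.
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    mask = (1 << m) - 1      # keeps every value within m bits
    high = 1 << (m - 1)      # bit for the last pattern row
    vp, vn = mask, 0         # vertical +1 / -1 deltas of the current column
    score = m                # distance between pattern and the empty text

    for ch in text:
        eq = peq.get(ch, 0)
        d0 = ((((eq & vp) + vp) ^ vp) | eq | vn) & mask
        hp = (vn | ~(d0 | vp)) & mask   # horizontal +1 deltas
        hn = d0 & vp                    # horizontal -1 deltas

        # The last row's horizontal delta updates the running score.
        if hp & high:
            score += 1
        elif hn & high:
            score -= 1

        # Shift in a +1 at row 0: the top boundary row grows by one per
        # text character in the global variant.
        hp = ((hp << 1) | 1) & mask
        hn = (hn << 1) & mask
        vp = (hn | ~(d0 | hp)) & mask
        vn = hp & d0

    return score


if __name__ == "__main__":
    print(myers_edit_distance("kitten", "sitting"))
```

Each text character costs a fixed number of word operations regardless of pattern length, which is why wider registers help: under the assumption above, more pattern rows are processed per instruction.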