Loading…

sBWT: memory efficient implementation of the hardware-acceleration-friendly Schindler transform for the fast biological sequence mapping

The Full-text index in Minute space (FM-index) derived from the Burrows-Wheeler transform (BWT) is broadly used for fast string matching in large genomes or a huge set of sequencing reads. Several graphic processing unit (GPU) accelerated aligners based on the FM-index have been proposed recently; h...

Full description

Saved in:
Bibliographic Details
Published in:Bioinformatics (Oxford, England) England), 2016-11, Vol.32 (22), p.3498-3500
Main Authors: Chang, Chia-Hua, Chou, Min-Te, Wu, Yi-Chung, Hong, Ting-Wei, Li, Yun-Lung, Yang, Chia-Hsiang, Hung, Jui-Hung
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:The Full-text index in Minute space (FM-index) derived from the Burrows-Wheeler transform (BWT) is broadly used for fast string matching in large genomes or a huge set of sequencing reads. Several graphic processing unit (GPU) accelerated aligners based on the FM-index have been proposed recently; however, the construction of the index is still handled by central processing unit (CPU), only parallelized in data level (e.g. by performing blockwise suffix sorting in GPU), or not scalable for large genomes. To fulfill the need for a more practical, hardware-parallelizable indexing and matching approach, we herein propose sBWT based on a BWT variant (i.e. Schindler transform) that can be built with highly simplified hardware-acceleration-friendly algorithms and still suffices accurate and fast string matching in repetitive references. In our tests, the implementation achieves significant speedups in indexing and searching compared with other BWT-based tools and can be applied to a variety of domains. sBWT is implemented in C ++ with CPU-only and GPU-accelerated versions. sBWT is open-source software and is available at http://jhhung.github.io/sBWT/Supplementary information: Supplementary data are available at Bioinformatics online. chyee@ntu.edu.tw or jhhung@nctu.edu.tw (also juihunghung@gmail.com).
ISSN:1367-4803
1367-4811
DOI:10.1093/bioinformatics/btw419