Loading…

Scalable Suffix Sorting on a Multicore Machine

A number of methods have been proposed for suffix sorting on internal memory of RAM and external memory of hard disks. The current best results for suffix sorting on internal or external memory are achieved by several algorithms using the induced sorting (IS) method in various ways. While these algo...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on computers 2020-09, Vol.69 (9), p.1364-1375
Main Authors: Xie, Jing Yi, Nong, Ge, Lao, Bin, Xu, Wentao
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:A number of methods have been proposed for suffix sorting on internal memory of RAM and external memory of hard disks. The current best results for suffix sorting on internal or external memory are achieved by several algorithms using the induced sorting (IS) method in various ways. While these algorithms are efficient, the internal ones are much different from those external in terms of the algorithm designs. A scalable IS method that can be applied for suffix sorting on both internal and external memory is highly desired. This article proposes a blockwise IS method to facilitate pipelined access on internal memory and sequential I/Os on external memory. The detailed algorithm of using this method for a 4-stage pipeline with multiple threads is described, where multiple threads are applied to parallelize not only the pipelined stages of consecutive blocks but also the tasks within each stage wherever possible. This algorithm is evaluated by our experiments on a set of realistic and artificial datasets to achieve better overall time and space performance than the existing best results from pSACAK, pDSS and pKS. Beside sorting suffixes on internal memory in linear time, the proposed method can be ported to external memory for sorting massive suffixes in linear I/O complexity.
ISSN:0018-9340
1557-9956
DOI:10.1109/TC.2020.2972546