Loading…

On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching

We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with \(p\) processors. Given a static text of length \(n\), we first show how to compute the suffix array interval of a given pattern of length \(m\) in \(O(\frac{m}{p}+ \lg p + \lg\lg p\c...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org 2016-06
Main Authors: Fischer, Johannes, Köppl, Dominik, Kurpicz, Florian
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We present parallel algorithms for exact and approximate pattern matching with suffix arrays, using a CREW-PRAM with \(p\) processors. Given a static text of length \(n\), we first show how to compute the suffix array interval of a given pattern of length \(m\) in \(O(\frac{m}{p}+ \lg p + \lg\lg p\cdot\lg\lg n)\) time for \(p \le m\). For approximate pattern matching with \(k\) differences or mismatches, we show how to compute all occurrences of a given pattern in \(O(\frac{m^k\sigma^k}{p}\max\left(k,\lg\lg n\right)\!+\!(1+\frac{m}{p}) \lg p\cdot \lg\lg n + \text{occ})\) time, where \(\sigma\) is the size of the alphabet and \(p \le \sigma^k m^k\). The workhorse of our algorithms is a data structure for merging suffix array intervals quickly: Given the suffix array intervals for two patterns \(P\) and \(P'\), we present a data structure for computing the interval of \(PP'\) in \(O(\lg\lg n)\) sequential time, or in \(O(1+\lg_p\lg n)\) parallel time. All our data structures are of size \(O(n)\) bits (in addition to the suffix array).
ISSN:2331-8422