Loading…

Counting suffix arrays and strings

Suffix arrays are used in various applications and research areas like data compression or computational biology. In this work, our goal is to characterise the combinatorial properties of suffix arrays and their enumeration. For a fixed alphabet size and string length, we divide the set of all strin...

Full description

Saved in:
Bibliographic Details
Published in:Theoretical computer science 2008-05, Vol.395 (2), p.220-234
Main Authors: Schürmann, Klaus-Bernd, Stoye, Jens
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Suffix arrays are used in various applications and research areas like data compression or computational biology. In this work, our goal is to characterise the combinatorial properties of suffix arrays and their enumeration. For a fixed alphabet size and string length, we divide the set of all strings into equivalence classes of strings that share the same suffix array. For each such equivalence class, we count the number of strings contained in it. We also give exact formulas for computing the number of equivalence classes. Our methods yield a lower bound for the compressibility of suffix arrays and build the foundation for the efficient generation of appropriate test data sets for suffix array based algorithms. We also show that summing up the elements of all equivalence classes forms a particular instance for some summation identities of Eulerian numbers.
ISSN:0304-3975
1879-2294
DOI:10.1016/j.tcs.2008.01.011