
On the Convergence of Upper Bound Techniques for the Average Length of Longest Common Subsequences

Bibliographic Details
Main Author:	Lueker, George S
Format:	Conference Proceeding
Language:	English
Subjects:	Codes

Description
Summary:	It has long been known [2] that the average length of the longest common subsequence of two random strings of length n over an alphabet of size k is asymptotic to γ_k n for some constant γ_k depending on k. The value of these constants remains unknown, and a number of papers have proved upper and lower bounds on them. In particular, in [6] we used a modification of methods of [3, 4] for determining lower and upper bounds on γ_k, combined with large computer computations, to obtain improved bounds on γ_2. The method of [6] involved a parameter h; empirically, increasing h increased the computation time but gave better upper bounds. Here we show, for arbitrary k, a sufficient condition for a parameterized method to produce a sequence of upper bounds approaching the true value of γ_k, and show that a generalization of the method of [6] meets this condition for all k ≥ 2. While [3, 4] do not explicitly discuss how to parameterize their method, which is based on a concept they call domination, to trade off the tightness of the bound vs. the amount of computation, we discuss a very natural parameterization of their method; for the case of alphabet size k = 2 we conjecture but do not prove that it also meets the sufficient condition and hence also yields a sequence of bounds that converges to the correct value of γ_2. For k > 2, it does not meet our sufficient condition. Thus we leave open the question of whether some method based on the undominated collations of [3, 4] gives bounds converging to the correct value for any k ≥ 2.
ISSN:	2164-0343
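
Illustration (not part of the catalogued record): the quantity γ_k in the summary can be made concrete with a small simulation. The sketch below is not the method of [6] or of [3, 4]; it is only a plain Monte Carlo estimate of E[LCS]/n for uniformly random strings, using the textbook dynamic program for LCS length. The function names lcs_length and estimate_ratio, and the parameters trials and seed, are assumptions introduced here for illustration. Because the expected LCS length is superadditive in n, the true value of E[LCS]/n at any finite n is at most γ_k, so a sample average like this one approximates γ_k from below and proves no bound.

```python
import random


def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b.

    Textbook dynamic program in O(len(a) * len(b)) time, keeping only
    the previous row of the table.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def estimate_ratio(n, k, trials, seed=0):
    """Monte Carlo estimate of E[LCS]/n for two uniform random strings
    of length n over an alphabet of size k (hypothetical helper)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        a = [rng.randrange(k) for _ in range(n)]
        b = [rng.randrange(k) for _ in range(n)]
        total += lcs_length(a, b)
    return total / (trials * n)


if __name__ == "__main__":
    # Larger n moves the estimate closer to gamma_2, at quadratic cost per trial.
    for n in (50, 200, 800):
        print(n, estimate_ratio(n, k=2, trials=20))
```

The estimate converges slowly in n, which is one reason rigorous upper and lower bounds such as those discussed in the summary are obtained by other means.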