Loading…

A New String Search Tree with Multiple Alignment

String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for...

Full description

Saved in:
Bibliographic Details
Main Authors: Sung-Hwan Kim, Kwanghwi Kim, Hwan-Gue Cho
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for string similarity search under Levenshtein distance. We present a bottom-up approach to construct the search tree, the internal nodes of which indicate a multiple-sequence alignment of its child leaf nodes, while a leaf node representing a string element. Defining an operation to merge strings and an algorithm for computing distance to merged strings, we present an efficient search strategy with our proposed tree structure. We also demonstrate the search performance with an experimental evaluation, and verify that our method outperforms the sequential search by at most a factor of 15 in terms of the search time.
DOI:10.1109/CIT.2012.103