Loading…

A New String Search Tree with Multiple Alignment

String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for...

Full description

Saved in:
Bibliographic Details
Main Authors: Sung-Hwan Kim, Kwanghwi Kim, Hwan-Gue Cho
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by
cites
container_end_page 463
container_issue
container_start_page 456
container_title
container_volume
creator Sung-Hwan Kim
Kwanghwi Kim
Hwan-Gue Cho
description String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for string similarity search under Levenshtein distance. We present a bottom-up approach to construct the search tree, the internal nodes of which indicate a multiple-sequence alignment of its child leaf nodes, while a leaf node representing a string element. Defining an operation to merge strings and an algorithm for computing distance to merged strings, we present an efficient search strategy with our proposed tree structure. We also demonstrate the search performance with an experimental evaluation, and verify that our method outperforms the sequential search by at most a factor of 15 in terms of the search time.
doi_str_mv 10.1109/CIT.2012.103
format conference_proceeding
fullrecord <record><control><sourceid>ieee_6IE</sourceid><recordid>TN_cdi_ieee_primary_6391942</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>6391942</ieee_id><sourcerecordid>6391942</sourcerecordid><originalsourceid>FETCH-LOGICAL-i90t-3e3026e0c76b64577de83e32d03bad73ad37c7bd0a33debab9b1ddcffc9e31243</originalsourceid><addsrcrecordid>eNotjE1Lw0AUAFdEUGtu3rzsH0h9u2-zL3sMwY9C1UNz8FZ2sy_tShpKEin-ewt6GhiGEeJewVIpcI_1qllqUHqpAC_ELZB1hSmL8vNSZI5KZSyhKQnpWmTT9AUA57AA0jcCKvnOJ7mZxzTs5Ib92O5lMzLLU5r38u27n9OxZ1n1aTcceJjvxFXn-4mzfy5E8_zU1K_5-uNlVVfrPDmYc2QEbRlassGagihyeXY6AgYfCX1EailE8IiRgw8uqBjbrmsdo9IGF-Lhb5uYeXsc08GPP1uLTjmj8RfN0ENr</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>A New String Search Tree with Multiple Alignment</title><source>IEEE Electronic Library (IEL) Conference Proceedings</source><creator>Sung-Hwan Kim ; Kwanghwi Kim ; Hwan-Gue Cho</creator><creatorcontrib>Sung-Hwan Kim ; Kwanghwi Kim ; Hwan-Gue Cho</creatorcontrib><description>String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for string similarity search under Levenshtein distance. We present a bottom-up approach to construct the search tree, the internal nodes of which indicate a multiple-sequence alignment of its child leaf nodes, while a leaf node representing a string element. Defining an operation to merge strings and an algorithm for computing distance to merged strings, we present an efficient search strategy with our proposed tree structure. We also demonstrate the search performance with an experimental evaluation, and verify that our method outperforms the sequential search by at most a factor of 15 in terms of the search time.</description><identifier>ISBN: 9781467348737</identifier><identifier>ISBN: 1467348732</identifier><identifier>EISBN: 076954858X</identifier><identifier>EISBN: 9780769548586</identifier><identifier>DOI: 10.1109/CIT.2012.103</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>Approximate Dictionary ; Edit Distance ; Equations ; Indexing ; Linear programming ; Merging ; Multiple Sequence Alignment ; Nearest neighbor searches ; Search problems ; String Similarity Search ; Vectors</subject><ispartof>2012 IEEE 12th International Conference on Computer and Information Technology, 2012, p.456-463</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/6391942$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,2058,27925,54920</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/6391942$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Sung-Hwan Kim</creatorcontrib><creatorcontrib>Kwanghwi Kim</creatorcontrib><creatorcontrib>Hwan-Gue Cho</creatorcontrib><title>A New String Search Tree with Multiple Alignment</title><title>2012 IEEE 12th International Conference on Computer and Information Technology</title><addtitle>cit</addtitle><description>String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for string similarity search under Levenshtein distance. We present a bottom-up approach to construct the search tree, the internal nodes of which indicate a multiple-sequence alignment of its child leaf nodes, while a leaf node representing a string element. Defining an operation to merge strings and an algorithm for computing distance to merged strings, we present an efficient search strategy with our proposed tree structure. We also demonstrate the search performance with an experimental evaluation, and verify that our method outperforms the sequential search by at most a factor of 15 in terms of the search time.</description><subject>Approximate Dictionary</subject><subject>Edit Distance</subject><subject>Equations</subject><subject>Indexing</subject><subject>Linear programming</subject><subject>Merging</subject><subject>Multiple Sequence Alignment</subject><subject>Nearest neighbor searches</subject><subject>Search problems</subject><subject>String Similarity Search</subject><subject>Vectors</subject><isbn>9781467348737</isbn><isbn>1467348732</isbn><isbn>076954858X</isbn><isbn>9780769548586</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2012</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNotjE1Lw0AUAFdEUGtu3rzsH0h9u2-zL3sMwY9C1UNz8FZ2sy_tShpKEin-ewt6GhiGEeJewVIpcI_1qllqUHqpAC_ELZB1hSmL8vNSZI5KZSyhKQnpWmTT9AUA57AA0jcCKvnOJ7mZxzTs5Ib92O5lMzLLU5r38u27n9OxZ1n1aTcceJjvxFXn-4mzfy5E8_zU1K_5-uNlVVfrPDmYc2QEbRlassGagihyeXY6AgYfCX1EailE8IiRgw8uqBjbrmsdo9IGF-Lhb5uYeXsc08GPP1uLTjmj8RfN0ENr</recordid><startdate>201210</startdate><enddate>201210</enddate><creator>Sung-Hwan Kim</creator><creator>Kwanghwi Kim</creator><creator>Hwan-Gue Cho</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>201210</creationdate><title>A New String Search Tree with Multiple Alignment</title><author>Sung-Hwan Kim ; Kwanghwi Kim ; Hwan-Gue Cho</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i90t-3e3026e0c76b64577de83e32d03bad73ad37c7bd0a33debab9b1ddcffc9e31243</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2012</creationdate><topic>Approximate Dictionary</topic><topic>Edit Distance</topic><topic>Equations</topic><topic>Indexing</topic><topic>Linear programming</topic><topic>Merging</topic><topic>Multiple Sequence Alignment</topic><topic>Nearest neighbor searches</topic><topic>Search problems</topic><topic>String Similarity Search</topic><topic>Vectors</topic><toplevel>online_resources</toplevel><creatorcontrib>Sung-Hwan Kim</creatorcontrib><creatorcontrib>Kwanghwi Kim</creatorcontrib><creatorcontrib>Hwan-Gue Cho</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE Explore</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Sung-Hwan Kim</au><au>Kwanghwi Kim</au><au>Hwan-Gue Cho</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>A New String Search Tree with Multiple Alignment</atitle><btitle>2012 IEEE 12th International Conference on Computer and Information Technology</btitle><stitle>cit</stitle><date>2012-10</date><risdate>2012</risdate><spage>456</spage><epage>463</epage><pages>456-463</pages><isbn>9781467348737</isbn><isbn>1467348732</isbn><eisbn>076954858X</eisbn><eisbn>9780769548586</eisbn><coden>IEEPAD</coden><abstract>String similarity search is a fundamental problem in computer science and can be applied to various fields of information processing. Construction of an efficient indexing structure is an important challenge to improve search performance. In this paper, we propose a new effective tree structure for string similarity search under Levenshtein distance. We present a bottom-up approach to construct the search tree, the internal nodes of which indicate a multiple-sequence alignment of its child leaf nodes, while a leaf node representing a string element. Defining an operation to merge strings and an algorithm for computing distance to merged strings, we present an efficient search strategy with our proposed tree structure. We also demonstrate the search performance with an experimental evaluation, and verify that our method outperforms the sequential search by at most a factor of 15 in terms of the search time.</abstract><pub>IEEE</pub><doi>10.1109/CIT.2012.103</doi><tpages>8</tpages></addata></record>
fulltext fulltext_linktorsrc
identifier ISBN: 9781467348737
ispartof 2012 IEEE 12th International Conference on Computer and Information Technology, 2012, p.456-463
issn
language eng
recordid cdi_ieee_primary_6391942
source IEEE Electronic Library (IEL) Conference Proceedings
subjects Approximate Dictionary
Edit Distance
Equations
Indexing
Linear programming
Merging
Multiple Sequence Alignment
Nearest neighbor searches
Search problems
String Similarity Search
Vectors
title A New String Search Tree with Multiple Alignment
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T12%3A14%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_6IE&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=A%20New%20String%20Search%20Tree%20with%20Multiple%20Alignment&rft.btitle=2012%20IEEE%2012th%20International%20Conference%20on%20Computer%20and%20Information%20Technology&rft.au=Sung-Hwan%20Kim&rft.date=2012-10&rft.spage=456&rft.epage=463&rft.pages=456-463&rft.isbn=9781467348737&rft.isbn_list=1467348732&rft.coden=IEEPAD&rft_id=info:doi/10.1109/CIT.2012.103&rft.eisbn=076954858X&rft.eisbn_list=9780769548586&rft_dat=%3Cieee_6IE%3E6391942%3C/ieee_6IE%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i90t-3e3026e0c76b64577de83e32d03bad73ad37c7bd0a33debab9b1ddcffc9e31243%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=6391942&rfr_iscdi=true