Loading…
BEVA: An Efficient Query Processing Algorithm for Error-Tolerant Autocompletion
Query autocompletion has become a standard feature in many search applications, especially for search engines. A recent trend is to support the error-tolerant autocompletion , which increases the usability significantly by matching prefixes of database strings and allowing a small number of errors....
Saved in:
Published in: | ACM transactions on database systems 2016-04, Vol.41 (1), p.1-44 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
cited_by | |
---|---|
cites | cdi_FETCH-LOGICAL-c220t-964f12eb452656c0fb782d4310ba2375b2f70d25919115ed1266ff5305b786e83 |
container_end_page | 44 |
container_issue | 1 |
container_start_page | 1 |
container_title | ACM transactions on database systems |
container_volume | 41 |
creator | Zhou, Xiaoling Qin, Jianbin Xiao, Chuan Wang, Wei Lin, Xuemin Ishikawa, Yoshiharu |
description | Query autocompletion has become a standard feature in many search applications, especially for search engines. A recent trend is to support the
error-tolerant autocompletion
, which increases the usability significantly by matching prefixes of database strings and allowing a small number of errors.
In this article, we systematically study the query processing problem for error-tolerant autocompletion with a given edit distance threshold. We propose a general framework that encompasses existing methods and characterizes different classes of algorithms and the minimum amount of information they need to maintain under different constraints. We then propose a novel evaluation strategy that achieves the minimum active node size by eliminating ancestor-descendant relationships among active nodes entirely. In addition, we characterize the essence of edit distance computation by a novel data structure named
edit vector automaton
(EVA). It enables us to compute new active nodes and their associated states efficiently by table lookups. In order to support large distance thresholds, we devise a partitioning scheme to reduce the size and construction cost of the automaton, which results in the
universal partitioned EVA
(UPEVA) to handle arbitrarily large thresholds. Our extensive evaluation demonstrates that our proposed method outperforms existing approaches in both space and time efficiencies. |
doi_str_mv | 10.1145/2877201 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1808054133</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>1808054133</sourcerecordid><originalsourceid>FETCH-LOGICAL-c220t-964f12eb452656c0fb782d4310ba2375b2f70d25919115ed1266ff5305b786e83</originalsourceid><addsrcrecordid>eNotj0tLAzEURi9SwbGKP0M30XtvcpPMspb6gIIbdRvmkUBl6tSkXfjvHWlX3-bwHQ7ADeE9kZEH9s4x0hlUJOKUscbMoEJtWUlNcgGXpXwhovG1q2D2uPpcXMF5aoYSr087h4-n1fvyRa3fnl-Xi7XqmHGvamsScWyNsBXbYWqd595owrZh7aTl5LDnyVITSeyJrU1JNMoE2uj1HO6Ov7s8_hxi2YftpnRxGJrvOB5KII8exZDWE3p7RLs8lpJjCru82Tb5NxCG_85w6tR_4YY_aA</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1808054133</pqid></control><display><type>article</type><title>BEVA: An Efficient Query Processing Algorithm for Error-Tolerant Autocompletion</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><source>BSC - Ebsco (Business Source Ultimate)</source><creator>Zhou, Xiaoling ; Qin, Jianbin ; Xiao, Chuan ; Wang, Wei ; Lin, Xuemin ; Ishikawa, Yoshiharu</creator><creatorcontrib>Zhou, Xiaoling ; Qin, Jianbin ; Xiao, Chuan ; Wang, Wei ; Lin, Xuemin ; Ishikawa, Yoshiharu</creatorcontrib><description>Query autocompletion has become a standard feature in many search applications, especially for search engines. A recent trend is to support the
error-tolerant autocompletion
, which increases the usability significantly by matching prefixes of database strings and allowing a small number of errors.
In this article, we systematically study the query processing problem for error-tolerant autocompletion with a given edit distance threshold. We propose a general framework that encompasses existing methods and characterizes different classes of algorithms and the minimum amount of information they need to maintain under different constraints. We then propose a novel evaluation strategy that achieves the minimum active node size by eliminating ancestor-descendant relationships among active nodes entirely. In addition, we characterize the essence of edit distance computation by a novel data structure named
edit vector automaton
(EVA). It enables us to compute new active nodes and their associated states efficiently by table lookups. In order to support large distance thresholds, we devise a partitioning scheme to reduce the size and construction cost of the automaton, which results in the
universal partitioned EVA
(UPEVA) to handle arbitrarily large thresholds. Our extensive evaluation demonstrates that our proposed method outperforms existing approaches in both space and time efficiencies.</description><identifier>ISSN: 0362-5915</identifier><identifier>EISSN: 1557-4644</identifier><identifier>DOI: 10.1145/2877201</identifier><language>eng</language><subject>Algorithms ; Data structures ; Partitioning ; Query processing ; Search engines ; Searching ; Thresholds</subject><ispartof>ACM transactions on database systems, 2016-04, Vol.41 (1), p.1-44</ispartof><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c220t-964f12eb452656c0fb782d4310ba2375b2f70d25919115ed1266ff5305b786e83</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Zhou, Xiaoling</creatorcontrib><creatorcontrib>Qin, Jianbin</creatorcontrib><creatorcontrib>Xiao, Chuan</creatorcontrib><creatorcontrib>Wang, Wei</creatorcontrib><creatorcontrib>Lin, Xuemin</creatorcontrib><creatorcontrib>Ishikawa, Yoshiharu</creatorcontrib><title>BEVA: An Efficient Query Processing Algorithm for Error-Tolerant Autocompletion</title><title>ACM transactions on database systems</title><description>Query autocompletion has become a standard feature in many search applications, especially for search engines. A recent trend is to support the
error-tolerant autocompletion
, which increases the usability significantly by matching prefixes of database strings and allowing a small number of errors.
In this article, we systematically study the query processing problem for error-tolerant autocompletion with a given edit distance threshold. We propose a general framework that encompasses existing methods and characterizes different classes of algorithms and the minimum amount of information they need to maintain under different constraints. We then propose a novel evaluation strategy that achieves the minimum active node size by eliminating ancestor-descendant relationships among active nodes entirely. In addition, we characterize the essence of edit distance computation by a novel data structure named
edit vector automaton
(EVA). It enables us to compute new active nodes and their associated states efficiently by table lookups. In order to support large distance thresholds, we devise a partitioning scheme to reduce the size and construction cost of the automaton, which results in the
universal partitioned EVA
(UPEVA) to handle arbitrarily large thresholds. Our extensive evaluation demonstrates that our proposed method outperforms existing approaches in both space and time efficiencies.</description><subject>Algorithms</subject><subject>Data structures</subject><subject>Partitioning</subject><subject>Query processing</subject><subject>Search engines</subject><subject>Searching</subject><subject>Thresholds</subject><issn>0362-5915</issn><issn>1557-4644</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNotj0tLAzEURi9SwbGKP0M30XtvcpPMspb6gIIbdRvmkUBl6tSkXfjvHWlX3-bwHQ7ADeE9kZEH9s4x0hlUJOKUscbMoEJtWUlNcgGXpXwhovG1q2D2uPpcXMF5aoYSr087h4-n1fvyRa3fnl-Xi7XqmHGvamsScWyNsBXbYWqd595owrZh7aTl5LDnyVITSeyJrU1JNMoE2uj1HO6Ov7s8_hxi2YftpnRxGJrvOB5KII8exZDWE3p7RLs8lpJjCru82Tb5NxCG_85w6tR_4YY_aA</recordid><startdate>20160401</startdate><enddate>20160401</enddate><creator>Zhou, Xiaoling</creator><creator>Qin, Jianbin</creator><creator>Xiao, Chuan</creator><creator>Wang, Wei</creator><creator>Lin, Xuemin</creator><creator>Ishikawa, Yoshiharu</creator><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20160401</creationdate><title>BEVA</title><author>Zhou, Xiaoling ; Qin, Jianbin ; Xiao, Chuan ; Wang, Wei ; Lin, Xuemin ; Ishikawa, Yoshiharu</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c220t-964f12eb452656c0fb782d4310ba2375b2f70d25919115ed1266ff5305b786e83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Algorithms</topic><topic>Data structures</topic><topic>Partitioning</topic><topic>Query processing</topic><topic>Search engines</topic><topic>Searching</topic><topic>Thresholds</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhou, Xiaoling</creatorcontrib><creatorcontrib>Qin, Jianbin</creatorcontrib><creatorcontrib>Xiao, Chuan</creatorcontrib><creatorcontrib>Wang, Wei</creatorcontrib><creatorcontrib>Lin, Xuemin</creatorcontrib><creatorcontrib>Ishikawa, Yoshiharu</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>ACM transactions on database systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhou, Xiaoling</au><au>Qin, Jianbin</au><au>Xiao, Chuan</au><au>Wang, Wei</au><au>Lin, Xuemin</au><au>Ishikawa, Yoshiharu</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>BEVA: An Efficient Query Processing Algorithm for Error-Tolerant Autocompletion</atitle><jtitle>ACM transactions on database systems</jtitle><date>2016-04-01</date><risdate>2016</risdate><volume>41</volume><issue>1</issue><spage>1</spage><epage>44</epage><pages>1-44</pages><issn>0362-5915</issn><eissn>1557-4644</eissn><abstract>Query autocompletion has become a standard feature in many search applications, especially for search engines. A recent trend is to support the
error-tolerant autocompletion
, which increases the usability significantly by matching prefixes of database strings and allowing a small number of errors.
In this article, we systematically study the query processing problem for error-tolerant autocompletion with a given edit distance threshold. We propose a general framework that encompasses existing methods and characterizes different classes of algorithms and the minimum amount of information they need to maintain under different constraints. We then propose a novel evaluation strategy that achieves the minimum active node size by eliminating ancestor-descendant relationships among active nodes entirely. In addition, we characterize the essence of edit distance computation by a novel data structure named
edit vector automaton
(EVA). It enables us to compute new active nodes and their associated states efficiently by table lookups. In order to support large distance thresholds, we devise a partitioning scheme to reduce the size and construction cost of the automaton, which results in the
universal partitioned EVA
(UPEVA) to handle arbitrarily large thresholds. Our extensive evaluation demonstrates that our proposed method outperforms existing approaches in both space and time efficiencies.</abstract><doi>10.1145/2877201</doi><tpages>44</tpages></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0362-5915 |
ispartof | ACM transactions on database systems, 2016-04, Vol.41 (1), p.1-44 |
issn | 0362-5915 1557-4644 |
language | eng |
recordid | cdi_proquest_miscellaneous_1808054133 |
source | Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list); BSC - Ebsco (Business Source Ultimate) |
subjects | Algorithms Data structures Partitioning Query processing Search engines Searching Thresholds |
title | BEVA: An Efficient Query Processing Algorithm for Error-Tolerant Autocompletion |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T23%3A00%3A21IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=BEVA:%20An%20Efficient%20Query%20Processing%20Algorithm%20for%20Error-Tolerant%20Autocompletion&rft.jtitle=ACM%20transactions%20on%20database%20systems&rft.au=Zhou,%20Xiaoling&rft.date=2016-04-01&rft.volume=41&rft.issue=1&rft.spage=1&rft.epage=44&rft.pages=1-44&rft.issn=0362-5915&rft.eissn=1557-4644&rft_id=info:doi/10.1145/2877201&rft_dat=%3Cproquest_cross%3E1808054133%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c220t-964f12eb452656c0fb782d4310ba2375b2f70d25919115ed1266ff5305b786e83%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1808054133&rft_id=info:pmid/&rfr_iscdi=true |