Loading…

String Matching Over Compressed Text on Handheld Devices Using Tagged Sub-Optimal Code (TSC)

This paper presents Tagged Sub-optimal code (TSC), a new coding technique to speed up string matching over compressed databases on personal digital assistants (PDA). TSC is a variable-length sub-optimal code that supports minimal prefix property. It always determines its codeword boundary without tr...

Full description

Saved in:
Bibliographic Details
Published in:Real-time systems 2005-03, Vol.29 (2-3), p.227-246
Main Authors: Bellaachia, Abdelghani, AL Rassan, Iehab
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
cited_by
cites cdi_FETCH-LOGICAL-c256t-280ac5c9749e06b35fbee41278d4a25a2d6795ae7099ff160317e5fdc17d23453
container_end_page 246
container_issue 2-3
container_start_page 227
container_title Real-time systems
container_volume 29
creator Bellaachia, Abdelghani
AL Rassan, Iehab
description This paper presents Tagged Sub-optimal code (TSC), a new coding technique to speed up string matching over compressed databases on personal digital assistants (PDA). TSC is a variable-length sub-optimal code that supports minimal prefix property. It always determines its codeword boundary without traversing a tree or lookup table. TSC technique may be beneficial in many types of applications: speeding up string matching over compressed text, and speeding decoding process. This paper also presents two algorithms for string matching over compressed text using TSC (SCTT) and the Byte Pair Encoding (BPE) technique (SCTB). indent Several experiments were conducted to compare the performance of TSC, Byte Pair Encoding (BPE), and Huffman code. Several PDA databases with different record sizes were used: the well-known Calgary dataset and a set of small-sized PDA databases. Experimental results show that SCTT is almost twice as fast as the Huffman-based algorithm. SCTT has also the same performance in search time as the search in uncompressed databases and is faster than the SCTB algorithm. For frequently updated PDA databases such as phone books, to-do list, and memos, SCTT is the recommended method regardless of the size of the average record length, since the time required to compress the updated records using BPE poses significant delays compared to TSC.
doi_str_mv 10.1007/s11241-005-6886-9
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_29077471</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>29077471</sourcerecordid><originalsourceid>FETCH-LOGICAL-c256t-280ac5c9749e06b35fbee41278d4a25a2d6795ae7099ff160317e5fdc17d23453</originalsourceid><addsrcrecordid>eNpd0E1Lw0AQBuBFFKzVH-BtQRA9rO5nNnuU-FGh0kPTm7Bsk0mbkiZxNyn6702oJ08zh2eGmReha0YfGKX6MTDGJSOUKhLFcUTMCZowpQVhIhanaEIN5ySSUpyjixB2dIBMmwn6XHa-rDf4w3XZdmwWB_A4afathxAgxyl8d7ip8czV-RaqHD_Docwg4FUYeeo2m0Et-zVZtF25d9UwnAO-S5fJ_SU6K1wV4OqvTtHq9SVNZmS-eHtPnuYk4yrqCI-py1RmtDRAo7VQxRpAMq7jXDquHM8jbZQDTY0pChZRwTSoIs-YzrmQSkzR7XFv65uvHkJn92XIoKpcDU0fLDdUa6nZAG_-wV3T-3q4zXKujDDGcDEodlSZb0LwUNjWD5_5H8uoHdO2x7TtEKId07ZG_AIe4nA0</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2259399923</pqid></control><display><type>article</type><title>String Matching Over Compressed Text on Handheld Devices Using Tagged Sub-Optimal Code (TSC)</title><source>Springer Link</source><creator>Bellaachia, Abdelghani ; AL Rassan, Iehab</creator><creatorcontrib>Bellaachia, Abdelghani ; AL Rassan, Iehab</creatorcontrib><description>This paper presents Tagged Sub-optimal code (TSC), a new coding technique to speed up string matching over compressed databases on personal digital assistants (PDA). TSC is a variable-length sub-optimal code that supports minimal prefix property. It always determines its codeword boundary without traversing a tree or lookup table. TSC technique may be beneficial in many types of applications: speeding up string matching over compressed text, and speeding decoding process. This paper also presents two algorithms for string matching over compressed text using TSC (SCTT) and the Byte Pair Encoding (BPE) technique (SCTB). indent Several experiments were conducted to compare the performance of TSC, Byte Pair Encoding (BPE), and Huffman code. Several PDA databases with different record sizes were used: the well-known Calgary dataset and a set of small-sized PDA databases. Experimental results show that SCTT is almost twice as fast as the Huffman-based algorithm. SCTT has also the same performance in search time as the search in uncompressed databases and is faster than the SCTB algorithm. For frequently updated PDA databases such as phone books, to-do list, and memos, SCTT is the recommended method regardless of the size of the average record length, since the time required to compress the updated records using BPE poses significant delays compared to TSC.</description><identifier>ISSN: 0922-6443</identifier><identifier>EISSN: 1573-1383</identifier><identifier>DOI: 10.1007/s11241-005-6886-9</identifier><language>eng</language><publisher>New York: Springer Nature B.V</publisher><subject>Algorithms ; Decoding ; Huffman codes ; Lookup tables ; Personal digital assistants ; String matching ; Time compression</subject><ispartof>Real-time systems, 2005-03, Vol.29 (2-3), p.227-246</ispartof><rights>Real-Time Systems is a copyright of Springer, (2005). All Rights Reserved.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c256t-280ac5c9749e06b35fbee41278d4a25a2d6795ae7099ff160317e5fdc17d23453</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Bellaachia, Abdelghani</creatorcontrib><creatorcontrib>AL Rassan, Iehab</creatorcontrib><title>String Matching Over Compressed Text on Handheld Devices Using Tagged Sub-Optimal Code (TSC)</title><title>Real-time systems</title><description>This paper presents Tagged Sub-optimal code (TSC), a new coding technique to speed up string matching over compressed databases on personal digital assistants (PDA). TSC is a variable-length sub-optimal code that supports minimal prefix property. It always determines its codeword boundary without traversing a tree or lookup table. TSC technique may be beneficial in many types of applications: speeding up string matching over compressed text, and speeding decoding process. This paper also presents two algorithms for string matching over compressed text using TSC (SCTT) and the Byte Pair Encoding (BPE) technique (SCTB). indent Several experiments were conducted to compare the performance of TSC, Byte Pair Encoding (BPE), and Huffman code. Several PDA databases with different record sizes were used: the well-known Calgary dataset and a set of small-sized PDA databases. Experimental results show that SCTT is almost twice as fast as the Huffman-based algorithm. SCTT has also the same performance in search time as the search in uncompressed databases and is faster than the SCTB algorithm. For frequently updated PDA databases such as phone books, to-do list, and memos, SCTT is the recommended method regardless of the size of the average record length, since the time required to compress the updated records using BPE poses significant delays compared to TSC.</description><subject>Algorithms</subject><subject>Decoding</subject><subject>Huffman codes</subject><subject>Lookup tables</subject><subject>Personal digital assistants</subject><subject>String matching</subject><subject>Time compression</subject><issn>0922-6443</issn><issn>1573-1383</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2005</creationdate><recordtype>article</recordtype><recordid>eNpd0E1Lw0AQBuBFFKzVH-BtQRA9rO5nNnuU-FGh0kPTm7Bsk0mbkiZxNyn6702oJ08zh2eGmReha0YfGKX6MTDGJSOUKhLFcUTMCZowpQVhIhanaEIN5ySSUpyjixB2dIBMmwn6XHa-rDf4w3XZdmwWB_A4afathxAgxyl8d7ip8czV-RaqHD_Docwg4FUYeeo2m0Et-zVZtF25d9UwnAO-S5fJ_SU6K1wV4OqvTtHq9SVNZmS-eHtPnuYk4yrqCI-py1RmtDRAo7VQxRpAMq7jXDquHM8jbZQDTY0pChZRwTSoIs-YzrmQSkzR7XFv65uvHkJn92XIoKpcDU0fLDdUa6nZAG_-wV3T-3q4zXKujDDGcDEodlSZb0LwUNjWD5_5H8uoHdO2x7TtEKId07ZG_AIe4nA0</recordid><startdate>20050301</startdate><enddate>20050301</enddate><creator>Bellaachia, Abdelghani</creator><creator>AL Rassan, Iehab</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>8FE</scope><scope>8FG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>BENPR</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>HCIFZ</scope><scope>P5Z</scope><scope>P62</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PRINS</scope><scope>7QF</scope><scope>7SC</scope><scope>8FD</scope><scope>JG9</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20050301</creationdate><title>String Matching Over Compressed Text on Handheld Devices Using Tagged Sub-Optimal Code (TSC)</title><author>Bellaachia, Abdelghani ; AL Rassan, Iehab</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c256t-280ac5c9749e06b35fbee41278d4a25a2d6795ae7099ff160317e5fdc17d23453</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Algorithms</topic><topic>Decoding</topic><topic>Huffman codes</topic><topic>Lookup tables</topic><topic>Personal digital assistants</topic><topic>String matching</topic><topic>Time compression</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Bellaachia, Abdelghani</creatorcontrib><creatorcontrib>AL Rassan, Iehab</creatorcontrib><collection>CrossRef</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central</collection><collection>Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest Central</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>SciTech Premium Collection</collection><collection>ProQuest advanced technologies &amp; aerospace journals</collection><collection>ProQuest Advanced Technologies &amp; Aerospace Collection</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central China</collection><collection>Aluminium Industry Abstracts</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Materials Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>Real-time systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Bellaachia, Abdelghani</au><au>AL Rassan, Iehab</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>String Matching Over Compressed Text on Handheld Devices Using Tagged Sub-Optimal Code (TSC)</atitle><jtitle>Real-time systems</jtitle><date>2005-03-01</date><risdate>2005</risdate><volume>29</volume><issue>2-3</issue><spage>227</spage><epage>246</epage><pages>227-246</pages><issn>0922-6443</issn><eissn>1573-1383</eissn><abstract>This paper presents Tagged Sub-optimal code (TSC), a new coding technique to speed up string matching over compressed databases on personal digital assistants (PDA). TSC is a variable-length sub-optimal code that supports minimal prefix property. It always determines its codeword boundary without traversing a tree or lookup table. TSC technique may be beneficial in many types of applications: speeding up string matching over compressed text, and speeding decoding process. This paper also presents two algorithms for string matching over compressed text using TSC (SCTT) and the Byte Pair Encoding (BPE) technique (SCTB). indent Several experiments were conducted to compare the performance of TSC, Byte Pair Encoding (BPE), and Huffman code. Several PDA databases with different record sizes were used: the well-known Calgary dataset and a set of small-sized PDA databases. Experimental results show that SCTT is almost twice as fast as the Huffman-based algorithm. SCTT has also the same performance in search time as the search in uncompressed databases and is faster than the SCTB algorithm. For frequently updated PDA databases such as phone books, to-do list, and memos, SCTT is the recommended method regardless of the size of the average record length, since the time required to compress the updated records using BPE poses significant delays compared to TSC.</abstract><cop>New York</cop><pub>Springer Nature B.V</pub><doi>10.1007/s11241-005-6886-9</doi><tpages>20</tpages></addata></record>
fulltext fulltext
identifier ISSN: 0922-6443
ispartof Real-time systems, 2005-03, Vol.29 (2-3), p.227-246
issn 0922-6443
1573-1383
language eng
recordid cdi_proquest_miscellaneous_29077471
source Springer Link
subjects Algorithms
Decoding
Huffman codes
Lookup tables
Personal digital assistants
String matching
Time compression
title String Matching Over Compressed Text on Handheld Devices Using Tagged Sub-Optimal Code (TSC)
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-01T06%3A58%3A43IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=String%20Matching%20Over%20Compressed%20Text%20on%20Handheld%20Devices%20Using%20Tagged%20Sub-Optimal%20Code%20(TSC)&rft.jtitle=Real-time%20systems&rft.au=Bellaachia,%20Abdelghani&rft.date=2005-03-01&rft.volume=29&rft.issue=2-3&rft.spage=227&rft.epage=246&rft.pages=227-246&rft.issn=0922-6443&rft.eissn=1573-1383&rft_id=info:doi/10.1007/s11241-005-6886-9&rft_dat=%3Cproquest_cross%3E29077471%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c256t-280ac5c9749e06b35fbee41278d4a25a2d6795ae7099ff160317e5fdc17d23453%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2259399923&rft_id=info:pmid/&rfr_iscdi=true