Loading…
Compressed suffix arrays and suffix trees with applications to text indexing and string matching
The proliferation of online text, such as found on the World Wide Web and in online databases, motivates the need for space-efficient text indexing methods that support fast string searching. We model this scenario as follows: Consider a text $T$ consisting of $n$ symbols drawn from a fixed alphabet...
Saved in:
Published in: | SIAM journal on computing 2005-01, Vol.35 (2), p.378-407 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
cited_by | cdi_FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83 |
---|---|
cites | cdi_FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83 |
container_end_page | 407 |
container_issue | 2 |
container_start_page | 378 |
container_title | SIAM journal on computing |
container_volume | 35 |
creator | GROSSI, Roberto VITTER, Jeffrey Scott |
description | The proliferation of online text, such as found on the World Wide Web and in online databases, motivates the need for space-efficient text indexing methods that support fast string searching. We model this scenario as follows: Consider a text $T$ consisting of $n$ symbols drawn from a fixed alphabet $\Sigma$. The text $T$ can be represented in $n \lg |\Sigma|$ bits by encoding each symbol with $\lg |\Sigma|$ bits. The goal is to support fast online queries for searching any string pattern $P$ of $m$ symbols, with $T$ being fully scanned only once, namely, when the index is created at preprocessing time. The text indexing schemes published in the literature are greedy in terms of space usage: they require $\Omega(n \lg n)$ additional bits of space in the worst case. For example, in the standard unit cost RAM, suffix trees and suffix arrays need $\Omega(n)$ memory words, each of $\Omega(\lg n)$ bits. These indexes are larger than the text itself by a multiplicative factor of $\Omega(\smash{\lg_{|\Sigma|} n})$, which is significant when $\Sigma$ is of constant size, such as in \textsc{ascii} or \textsc{unicode}. On the other hand, these indexes support fast searching, either in $O(m \lg |\Sigma|)$ time or in $O(m +\lg n)$ time, plus an output-sensitive cost $O(\mathit{occ})$ for listing the $\mathit{occ}$ pattern occurrences. We present a new text index that is based upon compressed representations of suffix arrays and suffix trees. It achieves a fast $\smash{O(m /\lg_{|\Sigma|} n + \lg_{|\Sigma|}^\epsilon n)}$ search time in the worst case, for any constant $0 < \epsilon \leq 1$, using at most $\smash{\bigl(\epsilon^{-1} + O(1)\bigr) \, n \lg |\Sigma|}$ bits of storage. Our result thus presents for the first time an efficient index whose size is provably linear in the size of the text in the worst case, and for many scenarios, the space is actually sublinear in practice. As a concrete example, the compressed suffix array for a typical 100 MB \textsc{ascii} file can require 30--40 MB or less, while the raw suffix array requires 500 MB. Our theoretical bounds improve \emph{both} time and space of previous indexing schemes. Listing the pattern occurrences introduces a sublogarithmic slowdown factor in the output-sensitive cost, giving $O(\mathit{occ} \, \smash{\lg_{|\Sigma|}^\epsilon n})$ time as a result. When the patterns are sufficiently long, we can use auxiliary data structures in $O(n \lg |\Sigma|)$ bits to obtain a total search bound of $O(m /\lg_{|\Sigm |
doi_str_mv | 10.1137/S0097539702402354 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_918726607</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2574241991</sourcerecordid><originalsourceid>FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83</originalsourceid><addsrcrecordid>eNplUMtKw0AUHUTBWv0Ad4PgMjo380qWUnxBwYW6jjczEzulTeLMFNu_N6FFF67u4Z4XHEIugd0AcH37ylipJS81ywXLuRRHZAKslJkGgGMyGels5E_JWYxLxkAI4BPyMevWfXAxOkvjpmn8lmIIuIsU299PCs5F-u3TgmLfr7zB5Ls20tTR5LaJ-ta6rW8_954URrjGZBYDOCcnDa6iuzjcKXl_uH-bPWXzl8fn2d08M1zIlGlXNqByJkAUpWWoQPHC2TwXArW0lmlVOGe4K-taFgalEsZZRNnUNi_qgk_J1T63D93XxsVULbtNaIfKqoRC50oxPYhgLzKhizG4puqDX2PYVcCqccfq346D5_oQjNHgqgnYGh__jFopLbXgP_DDc7M</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>918726607</pqid></control><display><type>article</type><title>Compressed suffix arrays and suffix trees with applications to text indexing and string matching</title><source>EBSCOhost Business Source Ultimate</source><source>SIAM Journals Archive</source><source>ABI/INFORM Global</source><creator>GROSSI, Roberto ; VITTER, Jeffrey Scott</creator><creatorcontrib>GROSSI, Roberto ; VITTER, Jeffrey Scott</creatorcontrib><description>The proliferation of online text, such as found on the World Wide Web and in online databases, motivates the need for space-efficient text indexing methods that support fast string searching. We model this scenario as follows: Consider a text $T$ consisting of $n$ symbols drawn from a fixed alphabet $\Sigma$. The text $T$ can be represented in $n \lg |\Sigma|$ bits by encoding each symbol with $\lg |\Sigma|$ bits. The goal is to support fast online queries for searching any string pattern $P$ of $m$ symbols, with $T$ being fully scanned only once, namely, when the index is created at preprocessing time. The text indexing schemes published in the literature are greedy in terms of space usage: they require $\Omega(n \lg n)$ additional bits of space in the worst case. For example, in the standard unit cost RAM, suffix trees and suffix arrays need $\Omega(n)$ memory words, each of $\Omega(\lg n)$ bits. These indexes are larger than the text itself by a multiplicative factor of $\Omega(\smash{\lg_{|\Sigma|} n})$, which is significant when $\Sigma$ is of constant size, such as in \textsc{ascii} or \textsc{unicode}. On the other hand, these indexes support fast searching, either in $O(m \lg |\Sigma|)$ time or in $O(m +\lg n)$ time, plus an output-sensitive cost $O(\mathit{occ})$ for listing the $\mathit{occ}$ pattern occurrences. We present a new text index that is based upon compressed representations of suffix arrays and suffix trees. It achieves a fast $\smash{O(m /\lg_{|\Sigma|} n + \lg_{|\Sigma|}^\epsilon n)}$ search time in the worst case, for any constant $0 < \epsilon \leq 1$, using at most $\smash{\bigl(\epsilon^{-1} + O(1)\bigr) \, n \lg |\Sigma|}$ bits of storage. Our result thus presents for the first time an efficient index whose size is provably linear in the size of the text in the worst case, and for many scenarios, the space is actually sublinear in practice. As a concrete example, the compressed suffix array for a typical 100 MB \textsc{ascii} file can require 30--40 MB or less, while the raw suffix array requires 500 MB. Our theoretical bounds improve \emph{both} time and space of previous indexing schemes. Listing the pattern occurrences introduces a sublogarithmic slowdown factor in the output-sensitive cost, giving $O(\mathit{occ} \, \smash{\lg_{|\Sigma|}^\epsilon n})$ time as a result. When the patterns are sufficiently long, we can use auxiliary data structures in $O(n \lg |\Sigma|)$ bits to obtain a total search bound of $O(m /\lg_{|\Sigma|} n + \mathit{occ})$ time, which is optimal.</description><identifier>ISSN: 0097-5397</identifier><identifier>EISSN: 1095-7111</identifier><identifier>DOI: 10.1137/S0097539702402354</identifier><language>eng</language><publisher>Philadelphia, PA: Society for Industrial and Applied Mathematics</publisher><subject>Algorithmics. Computability. Computer arithmetics ; Applied sciences ; Arrays ; Boolean ; Computer science; control theory; systems ; Exact sciences and technology ; Information systems. Data bases ; Memory organisation. Data processing ; Online data bases ; Queries ; Software ; Theoretical computing ; World Wide Web</subject><ispartof>SIAM journal on computing, 2005-01, Vol.35 (2), p.378-407</ispartof><rights>2006 INIST-CNRS</rights><rights>[Copyright] © 2005 Society for Industrial and Applied Mathematics</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83</citedby><cites>FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://www.proquest.com/docview/918726607?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,776,780,3172,11667,27901,27902,36037,44339</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=17667574$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>GROSSI, Roberto</creatorcontrib><creatorcontrib>VITTER, Jeffrey Scott</creatorcontrib><title>Compressed suffix arrays and suffix trees with applications to text indexing and string matching</title><title>SIAM journal on computing</title><description>The proliferation of online text, such as found on the World Wide Web and in online databases, motivates the need for space-efficient text indexing methods that support fast string searching. We model this scenario as follows: Consider a text $T$ consisting of $n$ symbols drawn from a fixed alphabet $\Sigma$. The text $T$ can be represented in $n \lg |\Sigma|$ bits by encoding each symbol with $\lg |\Sigma|$ bits. The goal is to support fast online queries for searching any string pattern $P$ of $m$ symbols, with $T$ being fully scanned only once, namely, when the index is created at preprocessing time. The text indexing schemes published in the literature are greedy in terms of space usage: they require $\Omega(n \lg n)$ additional bits of space in the worst case. For example, in the standard unit cost RAM, suffix trees and suffix arrays need $\Omega(n)$ memory words, each of $\Omega(\lg n)$ bits. These indexes are larger than the text itself by a multiplicative factor of $\Omega(\smash{\lg_{|\Sigma|} n})$, which is significant when $\Sigma$ is of constant size, such as in \textsc{ascii} or \textsc{unicode}. On the other hand, these indexes support fast searching, either in $O(m \lg |\Sigma|)$ time or in $O(m +\lg n)$ time, plus an output-sensitive cost $O(\mathit{occ})$ for listing the $\mathit{occ}$ pattern occurrences. We present a new text index that is based upon compressed representations of suffix arrays and suffix trees. It achieves a fast $\smash{O(m /\lg_{|\Sigma|} n + \lg_{|\Sigma|}^\epsilon n)}$ search time in the worst case, for any constant $0 < \epsilon \leq 1$, using at most $\smash{\bigl(\epsilon^{-1} + O(1)\bigr) \, n \lg |\Sigma|}$ bits of storage. Our result thus presents for the first time an efficient index whose size is provably linear in the size of the text in the worst case, and for many scenarios, the space is actually sublinear in practice. As a concrete example, the compressed suffix array for a typical 100 MB \textsc{ascii} file can require 30--40 MB or less, while the raw suffix array requires 500 MB. Our theoretical bounds improve \emph{both} time and space of previous indexing schemes. Listing the pattern occurrences introduces a sublogarithmic slowdown factor in the output-sensitive cost, giving $O(\mathit{occ} \, \smash{\lg_{|\Sigma|}^\epsilon n})$ time as a result. When the patterns are sufficiently long, we can use auxiliary data structures in $O(n \lg |\Sigma|)$ bits to obtain a total search bound of $O(m /\lg_{|\Sigma|} n + \mathit{occ})$ time, which is optimal.</description><subject>Algorithmics. Computability. Computer arithmetics</subject><subject>Applied sciences</subject><subject>Arrays</subject><subject>Boolean</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Information systems. Data bases</subject><subject>Memory organisation. Data processing</subject><subject>Online data bases</subject><subject>Queries</subject><subject>Software</subject><subject>Theoretical computing</subject><subject>World Wide Web</subject><issn>0097-5397</issn><issn>1095-7111</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2005</creationdate><recordtype>article</recordtype><sourceid>M0C</sourceid><recordid>eNplUMtKw0AUHUTBWv0Ad4PgMjo380qWUnxBwYW6jjczEzulTeLMFNu_N6FFF67u4Z4XHEIugd0AcH37ylipJS81ywXLuRRHZAKslJkGgGMyGels5E_JWYxLxkAI4BPyMevWfXAxOkvjpmn8lmIIuIsU299PCs5F-u3TgmLfr7zB5Ls20tTR5LaJ-ta6rW8_954URrjGZBYDOCcnDa6iuzjcKXl_uH-bPWXzl8fn2d08M1zIlGlXNqByJkAUpWWoQPHC2TwXArW0lmlVOGe4K-taFgalEsZZRNnUNi_qgk_J1T63D93XxsVULbtNaIfKqoRC50oxPYhgLzKhizG4puqDX2PYVcCqccfq346D5_oQjNHgqgnYGh__jFopLbXgP_DDc7M</recordid><startdate>20050101</startdate><enddate>20050101</enddate><creator>GROSSI, Roberto</creator><creator>VITTER, Jeffrey Scott</creator><general>Society for Industrial and Applied Mathematics</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7RQ</scope><scope>7WY</scope><scope>7WZ</scope><scope>7X2</scope><scope>7XB</scope><scope>87Z</scope><scope>88A</scope><scope>88F</scope><scope>88I</scope><scope>88K</scope><scope>8AL</scope><scope>8FE</scope><scope>8FG</scope><scope>8FH</scope><scope>8FK</scope><scope>8FL</scope><scope>8G5</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>ATCPS</scope><scope>AZQEC</scope><scope>BBNVY</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>BHPHI</scope><scope>CCPQU</scope><scope>D1I</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>KB.</scope><scope>L.-</scope><scope>L6V</scope><scope>LK8</scope><scope>M0C</scope><scope>M0K</scope><scope>M0N</scope><scope>M1Q</scope><scope>M2O</scope><scope>M2P</scope><scope>M2T</scope><scope>M7P</scope><scope>M7S</scope><scope>MBDVC</scope><scope>P5Z</scope><scope>P62</scope><scope>PATMY</scope><scope>PDBOC</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PTHSS</scope><scope>PYCSY</scope><scope>Q9U</scope><scope>S0W</scope><scope>U9A</scope></search><sort><creationdate>20050101</creationdate><title>Compressed suffix arrays and suffix trees with applications to text indexing and string matching</title><author>GROSSI, Roberto ; VITTER, Jeffrey Scott</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2005</creationdate><topic>Algorithmics. Computability. Computer arithmetics</topic><topic>Applied sciences</topic><topic>Arrays</topic><topic>Boolean</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Information systems. Data bases</topic><topic>Memory organisation. Data processing</topic><topic>Online data bases</topic><topic>Queries</topic><topic>Software</topic><topic>Theoretical computing</topic><topic>World Wide Web</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>GROSSI, Roberto</creatorcontrib><creatorcontrib>VITTER, Jeffrey Scott</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Career & Technical Education Database</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>Agricultural Science Collection</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Biology Database (Alumni Edition)</collection><collection>Military Database (Alumni Edition)</collection><collection>Science Database (Alumni Edition)</collection><collection>Telecommunications (Alumni Edition)</collection><collection>Computing Database (Alumni Edition)</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Natural Science Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Research Library (Alumni Edition)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>Agricultural & Environmental Science Collection</collection><collection>ProQuest Central Essentials</collection><collection>Biological Science Collection</collection><collection>ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>Natural Science Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Materials Science Collection</collection><collection>ProQuest Central Korea</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science Database</collection><collection>Materials Science Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ProQuest Engineering Collection</collection><collection>Biological Sciences</collection><collection>ABI/INFORM Global</collection><collection>Agriculture Science Database</collection><collection>Computing Database</collection><collection>Military Database</collection><collection>Research Library</collection><collection>Science Database</collection><collection>Telecommunications Database</collection><collection>Biological Science Database</collection><collection>Engineering Database</collection><collection>Research Library (Corporate)</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>Environmental Science Database</collection><collection>Materials Science Collection</collection><collection>One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>Engineering Collection</collection><collection>Environmental Science Collection</collection><collection>ProQuest Central Basic</collection><collection>DELNET Engineering & Technology Collection</collection><jtitle>SIAM journal on computing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>GROSSI, Roberto</au><au>VITTER, Jeffrey Scott</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Compressed suffix arrays and suffix trees with applications to text indexing and string matching</atitle><jtitle>SIAM journal on computing</jtitle><date>2005-01-01</date><risdate>2005</risdate><volume>35</volume><issue>2</issue><spage>378</spage><epage>407</epage><pages>378-407</pages><issn>0097-5397</issn><eissn>1095-7111</eissn><abstract>The proliferation of online text, such as found on the World Wide Web and in online databases, motivates the need for space-efficient text indexing methods that support fast string searching. We model this scenario as follows: Consider a text $T$ consisting of $n$ symbols drawn from a fixed alphabet $\Sigma$. The text $T$ can be represented in $n \lg |\Sigma|$ bits by encoding each symbol with $\lg |\Sigma|$ bits. The goal is to support fast online queries for searching any string pattern $P$ of $m$ symbols, with $T$ being fully scanned only once, namely, when the index is created at preprocessing time. The text indexing schemes published in the literature are greedy in terms of space usage: they require $\Omega(n \lg n)$ additional bits of space in the worst case. For example, in the standard unit cost RAM, suffix trees and suffix arrays need $\Omega(n)$ memory words, each of $\Omega(\lg n)$ bits. These indexes are larger than the text itself by a multiplicative factor of $\Omega(\smash{\lg_{|\Sigma|} n})$, which is significant when $\Sigma$ is of constant size, such as in \textsc{ascii} or \textsc{unicode}. On the other hand, these indexes support fast searching, either in $O(m \lg |\Sigma|)$ time or in $O(m +\lg n)$ time, plus an output-sensitive cost $O(\mathit{occ})$ for listing the $\mathit{occ}$ pattern occurrences. We present a new text index that is based upon compressed representations of suffix arrays and suffix trees. It achieves a fast $\smash{O(m /\lg_{|\Sigma|} n + \lg_{|\Sigma|}^\epsilon n)}$ search time in the worst case, for any constant $0 < \epsilon \leq 1$, using at most $\smash{\bigl(\epsilon^{-1} + O(1)\bigr) \, n \lg |\Sigma|}$ bits of storage. Our result thus presents for the first time an efficient index whose size is provably linear in the size of the text in the worst case, and for many scenarios, the space is actually sublinear in practice. As a concrete example, the compressed suffix array for a typical 100 MB \textsc{ascii} file can require 30--40 MB or less, while the raw suffix array requires 500 MB. Our theoretical bounds improve \emph{both} time and space of previous indexing schemes. Listing the pattern occurrences introduces a sublogarithmic slowdown factor in the output-sensitive cost, giving $O(\mathit{occ} \, \smash{\lg_{|\Sigma|}^\epsilon n})$ time as a result. When the patterns are sufficiently long, we can use auxiliary data structures in $O(n \lg |\Sigma|)$ bits to obtain a total search bound of $O(m /\lg_{|\Sigma|} n + \mathit{occ})$ time, which is optimal.</abstract><cop>Philadelphia, PA</cop><pub>Society for Industrial and Applied Mathematics</pub><doi>10.1137/S0097539702402354</doi><tpages>30</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0097-5397 |
ispartof | SIAM journal on computing, 2005-01, Vol.35 (2), p.378-407 |
issn | 0097-5397 1095-7111 |
language | eng |
recordid | cdi_proquest_journals_918726607 |
source | EBSCOhost Business Source Ultimate; SIAM Journals Archive; ABI/INFORM Global |
subjects | Algorithmics. Computability. Computer arithmetics Applied sciences Arrays Boolean Computer science control theory systems Exact sciences and technology Information systems. Data bases Memory organisation. Data processing Online data bases Queries Software Theoretical computing World Wide Web |
title | Compressed suffix arrays and suffix trees with applications to text indexing and string matching |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-30T23%3A00%3A25IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Compressed%20suffix%20arrays%20and%20suffix%20trees%20with%20applications%20to%20text%20indexing%20and%20string%20matching&rft.jtitle=SIAM%20journal%20on%20computing&rft.au=GROSSI,%20Roberto&rft.date=2005-01-01&rft.volume=35&rft.issue=2&rft.spage=378&rft.epage=407&rft.pages=378-407&rft.issn=0097-5397&rft.eissn=1095-7111&rft_id=info:doi/10.1137/S0097539702402354&rft_dat=%3Cproquest_cross%3E2574241991%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c345t-7e9f162041489d0a61638ed2244a75dd0768eec3e9bb58ca564cedaa5fbd28b83%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=918726607&rft_id=info:pmid/&rfr_iscdi=true |