Loading…

Spatial indexing of high-dimensional data based on relative approximation

We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly an...

Full description

Saved in:

Bibliographic Details
Published in:	The VLDB journal 2002-10, Vol.11 (2), p.93-108
Main Authors:	SAKURAI, Yasushi, YOSHIKAWA, Masatoshi, UEMURA, Shunsuke, KOJIMA, Haruhiko
Format:	Article
Language:	English
Subjects:	Applied sciences Computer science control theory systems Exact sciences and technology Information systems. Data bases Memory organisation. Data processing Software
Citations:	Items that cite this one
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

cited_by	cdi_FETCH-LOGICAL-c399t-1db8388baf5429f0ec37b1c0718a589b5c3590f166baaaa14b79418af3082c6b3
cites
container_end_page	108
container_issue	2
container_start_page	93
container_title	The VLDB journal
container_volume	11
creator	SAKURAI, Yasushi YOSHIKAWA, Masatoshi UEMURA, Shunsuke KOJIMA, Haruhiko
description	We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets.
doi_str_mv	10.1007/s00778-002-0066-9
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_29455039</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>29455039</sourcerecordid><originalsourceid>FETCH-LOGICAL-c399t-1db8388baf5429f0ec37b1c0718a589b5c3590f166baaaa14b79418af3082c6b3</originalsourceid><addsrcrecordid>eNpFUE1PAyEQJUYTa_UHeOOit1VYdhc4msaPJk08qIk3Aiy0mC27wtbUf-9s2sRJmAkz7w2Ph9A1JXeUEH6fIXFREFLCaZpCnqAZkZUsBOefp2hGp6aAOEcXOX8RAJZlPUPLt0GPQXc4xNbtQ1zj3uNNWG-KNmxdzKGPMGz1qLHR2bW4jzi5Djg_DuthSP0-bOHWx0t05nWX3dWxztHH0-P74qVYvT4vFw-rwjIpx4K2RjAhjPZ1VUpPnGXcUEs4FboW0tSW1ZJ42jRGQ9DKcFnBzDMiStsYNke3h73w9vfO5VFtQ7au63R0_S6rUlZ1TZgEID0AbepzTs6rIYHW9KsoUZNp6mCaAi_UZJqaODfH5Tpb3fmkow35nwhfYKCS_QHq0mz7</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>29455039</pqid></control><display><type>article</type><title>Spatial indexing of high-dimensional data based on relative approximation</title><source>Springer Nature</source><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>SAKURAI, Yasushi ; YOSHIKAWA, Masatoshi ; UEMURA, Shunsuke ; KOJIMA, Haruhiko</creator><creatorcontrib>SAKURAI, Yasushi ; YOSHIKAWA, Masatoshi ; UEMURA, Shunsuke ; KOJIMA, Haruhiko</creatorcontrib><description>We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets.</description><identifier>ISSN: 1066-8888</identifier><identifier>EISSN: 0949-877X</identifier><identifier>DOI: 10.1007/s00778-002-0066-9</identifier><language>eng</language><publisher>Heidelberg: Springer</publisher><subject>Applied sciences ; Computer science; control theory; systems ; Exact sciences and technology ; Information systems. Data bases ; Memory organisation. Data processing ; Software</subject><ispartof>The VLDB journal, 2002-10, Vol.11 (2), p.93-108</ispartof><rights>2003 INIST-CNRS</rights><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c399t-1db8388baf5429f0ec37b1c0718a589b5c3590f166baaaa14b79418af3082c6b3</citedby></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,776,780,27901,27902</link.rule.ids><backlink>$$Uhttp://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=13993589$$DView record in Pascal Francis$$Hfree_for_read</backlink></links><search><creatorcontrib>SAKURAI, Yasushi</creatorcontrib><creatorcontrib>YOSHIKAWA, Masatoshi</creatorcontrib><creatorcontrib>UEMURA, Shunsuke</creatorcontrib><creatorcontrib>KOJIMA, Haruhiko</creatorcontrib><title>Spatial indexing of high-dimensional data based on relative approximation</title><title>The VLDB journal</title><description>We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets.</description><subject>Applied sciences</subject><subject>Computer science; control theory; systems</subject><subject>Exact sciences and technology</subject><subject>Information systems. Data bases</subject><subject>Memory organisation. Data processing</subject><subject>Software</subject><issn>1066-8888</issn><issn>0949-877X</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2002</creationdate><recordtype>article</recordtype><recordid>eNpFUE1PAyEQJUYTa_UHeOOit1VYdhc4msaPJk08qIk3Aiy0mC27wtbUf-9s2sRJmAkz7w2Ph9A1JXeUEH6fIXFREFLCaZpCnqAZkZUsBOefp2hGp6aAOEcXOX8RAJZlPUPLt0GPQXc4xNbtQ1zj3uNNWG-KNmxdzKGPMGz1qLHR2bW4jzi5Djg_DuthSP0-bOHWx0t05nWX3dWxztHH0-P74qVYvT4vFw-rwjIpx4K2RjAhjPZ1VUpPnGXcUEs4FboW0tSW1ZJ42jRGQ9DKcFnBzDMiStsYNke3h73w9vfO5VFtQ7au63R0_S6rUlZ1TZgEID0AbepzTs6rIYHW9KsoUZNp6mCaAi_UZJqaODfH5Tpb3fmkow35nwhfYKCS_QHq0mz7</recordid><startdate>20021001</startdate><enddate>20021001</enddate><creator>SAKURAI, Yasushi</creator><creator>YOSHIKAWA, Masatoshi</creator><creator>UEMURA, Shunsuke</creator><creator>KOJIMA, Haruhiko</creator><general>Springer</general><scope>IQODW</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope></search><sort><creationdate>20021001</creationdate><title>Spatial indexing of high-dimensional data based on relative approximation</title><author>SAKURAI, Yasushi ; YOSHIKAWA, Masatoshi ; UEMURA, Shunsuke ; KOJIMA, Haruhiko</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c399t-1db8388baf5429f0ec37b1c0718a589b5c3590f166baaaa14b79418af3082c6b3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2002</creationdate><topic>Applied sciences</topic><topic>Computer science; control theory; systems</topic><topic>Exact sciences and technology</topic><topic>Information systems. Data bases</topic><topic>Memory organisation. Data processing</topic><topic>Software</topic><toplevel>online_resources</toplevel><creatorcontrib>SAKURAI, Yasushi</creatorcontrib><creatorcontrib>YOSHIKAWA, Masatoshi</creatorcontrib><creatorcontrib>UEMURA, Shunsuke</creatorcontrib><creatorcontrib>KOJIMA, Haruhiko</creatorcontrib><collection>Pascal-Francis</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>The VLDB journal</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>SAKURAI, Yasushi</au><au>YOSHIKAWA, Masatoshi</au><au>UEMURA, Shunsuke</au><au>KOJIMA, Haruhiko</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Spatial indexing of high-dimensional data based on relative approximation</atitle><jtitle>The VLDB journal</jtitle><date>2002-10-01</date><risdate>2002</risdate><volume>11</volume><issue>2</issue><spage>93</spage><epage>108</epage><pages>93-108</pages><issn>1066-8888</issn><eissn>0949-877X</eissn><abstract>We propose a novel index structure, the A-tree (approximation tree), for similarity searches in high-dimensional data. The basic idea of the A-tree is the introduction of virtual bounding rectangles (VBRs) which contain and approximate MBRs or data objects. VBRs can be represented quite compactly and thus affect the tree configuration both quantitatively and qualitatively. First, since tree nodes can contain a large number of VBR entries, fanout becomes large, which increases search speed. More importantly, we have a free hand in arranging MBRs and VBRs in the tree nodes. Each A-tree node contains an MBR and its children VBRs. Therefore, by fetching an A-tree node, we can obtain information on the exact position of a parent MBR and the approximate position of its children. We have performed experiments using both synthetic and real data sets. For the real data sets, the A-tree outperforms the SR-tree and the VA-file in all dimensionalities up to 64 dimensions, which is the highest dimension in our experiments. Additionally, we propose a cost model for the A-tree. We verify the validity of the cost model for synthetic and real data sets.</abstract><cop>Heidelberg</cop><pub>Springer</pub><doi>10.1007/s00778-002-0066-9</doi><tpages>16</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1066-8888
ispartof	The VLDB journal, 2002-10, Vol.11 (2), p.93-108
issn	1066-8888 0949-877X
language	eng
recordid	cdi_proquest_miscellaneous_29455039
source	Springer Nature; Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
subjects	Applied sciences Computer science control theory systems Exact sciences and technology Information systems. Data bases Memory organisation. Data processing Software
title	Spatial indexing of high-dimensional data based on relative approximation
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-10T02%3A42%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Spatial%20indexing%20of%20high-dimensional%20data%20based%20on%20relative%20approximation&rft.jtitle=The%20VLDB%20journal&rft.au=SAKURAI,%20Yasushi&rft.date=2002-10-01&rft.volume=11&rft.issue=2&rft.spage=93&rft.epage=108&rft.pages=93-108&rft.issn=1066-8888&rft.eissn=0949-877X&rft_id=info:doi/10.1007/s00778-002-0066-9&rft_dat=%3Cproquest_cross%3E29455039%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c399t-1db8388baf5429f0ec37b1c0718a589b5c3590f166baaaa14b79418af3082c6b3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=29455039&rft_id=info:pmid/&rfr_iscdi=true