
Improving the permutation-based proximity searching algorithm using zones and partial information

Bibliographic Details
Published in: Pattern Recognition Letters, 2017-08, Vol. 95, p. 29-36
Main Authors: Figueroa, Karina, Paredes, Rodrigo, Camarena-Ibarrola, Antonio, Tejeda-Villela, Héctor
Format: Article
Language: English
Description
Summary:
•We enrich the permutant-based index with zones.
•Partialization of the index: we propose using only part of the index.
•We show experimental results and analysis over real and synthetic databases.
•We propose compressing the information used in the index (zones and permutants).

Similarity searching is a very useful task in several disciplines, such as pattern recognition, machine learning, and decision theory. To solve this task, we can use an index to speed up the search. Among current indices, the permutant-based searching approach has already proven efficient for high-dimensional data; however, until now it had not been adapted to low-dimensional data, where it seemed useless. We propose several ways to adapt the permutant searching approach to low-dimensional data: using zones, varying the distribution of the radii, trying different distance measures, and using partial distance computation. After many experiments, we arrived at conclusions about the optimal parameter values using a synthetic database of vectors, and then used these learned values on real databases, obtaining excellent results for k-nearest neighbor queries on both high- and low-dimensional data.
ISSN: 0167-8655, 1872-7344
DOI: 10.1016/j.patrec.2017.04.012
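
To make the idea in the summary concrete, here is a minimal Python sketch of a permutant-based index enriched with zones. It is an illustration under stated assumptions, not the authors' implementation: the function names (signature, dissimilarity, build_index, knn_search), the use of Euclidean distance, the definition of a zone as a concentric ring bounded by fixed radii around each permutant, and the simple sum of the Spearman footrule and the zone differences are all hypothetical choices made for this example. The parameter fraction stands for the share of the database whose real distances are computed.

```python
import math
import random


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def signature(obj, permutants, radii):
    """Return (rank, zones) for obj.

    rank[i]  = position of permutant i when permutants are sorted by
               distance to obj (the inverse permutation).
    zones[i] = number of radii that the distance to permutant i exceeds,
               i.e. the index of the ring obj falls in (assumed definition).
    """
    dists = [euclidean(obj, p) for p in permutants]
    order = sorted(range(len(permutants)), key=lambda i: dists[i])
    rank = [0] * len(permutants)
    for pos, i in enumerate(order):
        rank[i] = pos
    zones = [sum(d > r for r in radii) for d in dists]
    return rank, zones


def dissimilarity(sig_a, sig_b):
    """Spearman footrule on ranks plus absolute zone differences.

    Adding the two terms with equal weight is an assumption of this sketch.
    """
    rank_a, zones_a = sig_a
    rank_b, zones_b = sig_b
    footrule = sum(abs(x - y) for x, y in zip(rank_a, rank_b))
    zone_diff = sum(abs(x - y) for x, y in zip(zones_a, zones_b))
    return footrule + zone_diff


def build_index(db, permutants, radii):
    """Precompute one signature per database object."""
    return [signature(obj, permutants, radii) for obj in db]


def knn_search(query, db, index, permutants, radii, k, fraction):
    """Approximate k-NN: rank objects by signature dissimilarity, then
    compute real distances only for the most promising fraction."""
    q_sig = signature(query, permutants, radii)
    candidates = sorted(range(len(db)),
                        key=lambda i: dissimilarity(q_sig, index[i]))
    budget = max(k, int(fraction * len(db)))
    reviewed = candidates[:budget]
    reviewed.sort(key=lambda i: euclidean(query, db[i]))
    return reviewed[:k]


if __name__ == "__main__":
    random.seed(0)
    db = [(random.random(), random.random()) for _ in range(1000)]
    permutants = random.sample(db, 8)   # reference objects taken from the data
    radii = [0.25, 0.5, 0.75]           # illustrative ring boundaries
    index = build_index(db, permutants, radii)
    query = (0.3, 0.7)
    print(knn_search(query, db, index, permutants, radii, k=5, fraction=0.1))
```

With fraction set to 1.0 every object is reviewed and the search returns the exact nearest neighbors; smaller values trade accuracy for fewer distance computations.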