Loading…

Accessing very high dimensional spaces in parallel

Access methods are a fundamental tool on Information Retrieval. However, most of these methods suffer the problem known as the curse of dimensionality when they are applied to objects with very high dimensionality representation spaces, such as text documents. In this paper we introduce a new parall...

Full description

Saved in:
Bibliographic Details
Published in:The Journal of supercomputing 2017, Vol.73 (1), p.176-189
Main Authors: Artigas-Fuentes, F. J., Badía, J. M.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Access methods are a fundamental tool on Information Retrieval. However, most of these methods suffer the problem known as the curse of dimensionality when they are applied to objects with very high dimensionality representation spaces, such as text documents. In this paper we introduce a new parallel access method that uses several graphs as distributed index structure and a kNN search algorithm. Two parallel versions of the search method are presented, one based on master–slave scheme and the other based on a pipeline. A thorough experimental analysis on different datasets shows that our method can process efficiently large flows of queries, compete with other parallel algorithms and obtain at the same time very high quality results.
ISSN:0920-8542
1573-0484
DOI:10.1007/s11227-016-1673-3