Distributed high-dimensional index creation using Hadoop, HDFS and C++

Bibliographic Details
Main Authors: Gudmundsson, G. P., Amsaleg, L., Jonsson, B. P.
Format: Conference Proceeding
Language: English
Description
Summary: This paper describes an initial study in which the open-source Hadoop parallel and distributed run-time environment is used to speed up the construction phase of a large high-dimensional index. It first discusses the typical practical problems developers may run into when porting their code to Hadoop. It then presents early experimental results showing that the performance gains are substantial when indexing large data sets.
ISSN: 1949-3983, 1949-3991
DOI: 10.1109/CBMI.2012.6269848