
Parallel and Distributed Dimensionality Reduction of Hyperspectral Data on Cloud Computing Architectures


Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2016-06, Vol. 9 (6), p. 2270-2278
Main Authors: Wu, Zebin; Li, Yonglong; Plaza, Antonio; Li, Jun; Xiao, Fu; Wei, Zhihui
Format: Article
Language: English
Description
Summary: Cloud computing offers the possibility to store and process massive amounts of remotely sensed hyperspectral data in a distributed way. Dimensionality reduction is an important task in hyperspectral imaging, as hyperspectral data often contains redundancy that can be removed prior to analysis of the data in repositories. In this regard, the development of dimensionality reduction techniques in cloud computing environments can provide both efficient storage and preprocessing of the data. In this paper, we develop a parallel and distributed implementation of a widely used technique for hyperspectral dimensionality reduction: principal component analysis (PCA), based on cloud computing architectures. Our implementation utilizes Hadoop's distributed file system (HDFS) to realize distributed storage, uses Apache Spark as the computing engine, and is developed based on the map-reduce parallel model, taking full advantage of the high-throughput access and high-performance distributed computing capabilities of cloud computing environments. We first optimized the traditional PCA algorithm to be well suited for parallel and distributed computing, and then we implemented it on a real cloud computing architecture. Our experimental results, conducted using several hyperspectral datasets, reveal very high performance for the proposed distributed parallel method.
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2016.2542193
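
Illustrative sketch: the summary above describes PCA restructured for the map-reduce model, but this record does not give the authors' optimized algorithm. The Python below is therefore only a generic illustration of the idea, under stated assumptions: each partition of pixels is mapped to partial statistics (pixel count, per-band sums, sum of outer products), the partials are reduced by addition, and the covariance matrix is formed once from the combined result. All function names (`map_partition`, `combine`, `covariance`, `leading_component`, `project`) and the toy data are hypothetical, and power iteration stands in for a full eigendecomposition to keep the sketch free of dependencies.

```python
import math
from functools import reduce


def map_partition(rows):
    """Map step: pixel count, per-band sums, and sum of outer products."""
    bands = len(rows[0])
    sums = [0.0] * bands
    outer = [[0.0] * bands for _ in range(bands)]
    for x in rows:
        for i in range(bands):
            sums[i] += x[i]
            for j in range(bands):
                outer[i][j] += x[i] * x[j]
    return len(rows), sums, outer


def combine(a, b):
    """Reduce step: add two sets of partial statistics element-wise."""
    n_a, s_a, o_a = a
    n_b, s_b, o_b = b
    sums = [p + q for p, q in zip(s_a, s_b)]
    outer = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(o_a, o_b)]
    return n_a + n_b, sums, outer


def covariance(stats):
    """Population covariance from the combined statistics: E[xx^T] - mm^T."""
    n, sums, outer = stats
    mean = [s / n for s in sums]
    bands = len(sums)
    cov = [[outer[i][j] / n - mean[i] * mean[j] for j in range(bands)]
           for i in range(bands)]
    return mean, cov


def leading_component(cov, iterations=200):
    """Power iteration for the dominant eigenvector (first principal axis)."""
    bands = len(cov)
    v = [1.0 / math.sqrt(bands)] * bands
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(bands)) for i in range(bands)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # zero covariance: no preferred direction
            break
        v = [c / norm for c in w]
    return v


def project(pixel, mean, component):
    """Score of one pixel on a component after mean-centering."""
    return sum((x - m) * c for x, m, c in zip(pixel, mean, component))


# Toy data (made up): two partitions of 3-band pixels.
partitions = [
    [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0]],
    [[3.0, 6.0, 0.0], [4.0, 8.0, 0.0]],
]
stats = reduce(combine, map(map_partition, partitions))
mean, cov = covariance(stats)
pc1 = leading_component(cov)
scores = [project(x, mean, pc1) for part in partitions for x in part]
```

Because the partial statistics are combined by plain addition, the reduce step is associative and commutative, so partitions can be processed in any order and on any node. In a Spark setting the two steps would correspond roughly to `RDD.mapPartitions` followed by `RDD.reduce`, though how the paper actually maps its algorithm onto Spark is not stated in this record.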