iBig Hybrid Architecture for Energy IoT: When the Power of Indexing Meets Big Data Processing
Main Authors: | , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Nowadays, IoT data come from multiple sources and a large number of devices. To manage them, IoT frameworks rely on Big Data ecosystems hosted in the cloud for scalable storage and processing. Although these ecosystems scale well to large volumes of data, in many cases the processing is done naively. Many datasets, such as IoT energy measurement data, consist at least partially of attributes that can be indexed to avoid unnecessary and costly data scans at these scales. In this work, we propose the iBig architecture, which adds secondary indexing to Big Data processing for energy IoT datasets. Indexes are treated as metadata stored in a separate component integrated into the ecosystem. MapReduce-based and MPP (massively parallel processing)-based processing then leverage the indexes to handle only the relevant data in a given dataset. Our experimental evaluation on the Grid5000 cloud testbed demonstrates that performance gains can exceed 98% for the MapReduce-based Spark and 81% for the MPP-based Drill on energy IoT data. Furthermore, we provide comparative insights into the Spark and Drill frameworks when processing the whole dataset versus only the relevant data. |
---|---|
ISSN: | 2380-8004 |
DOI: | 10.1109/CloudCom.2017.38 |
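
The idea in the summary, a secondary index kept as separate metadata so that processing touches only the relevant data, can be illustrated with a small sketch. This is not the iBig implementation: the block layout, the `meter_id` attribute, and the function names `build_index`, `full_scan`, and `indexed_scan` are all hypothetical, chosen only to show how an index lets a query skip blocks that cannot contain matching records.

```python
from collections import defaultdict

# Hypothetical dataset: block id -> list of (meter_id, timestamp, watt_hours).
# In a real deployment these would be files or splits in distributed storage.
BLOCKS = {
    "block-0": [("meter-A", 1, 500), ("meter-B", 1, 700)],
    "block-1": [("meter-C", 1, 200), ("meter-C", 2, 300)],
    "block-2": [("meter-A", 2, 600), ("meter-D", 1, 900)],
}


def build_index(blocks):
    """Build a secondary index: meter_id -> set of block ids containing it.

    The index is plain metadata, kept apart from the data blocks themselves.
    """
    index = defaultdict(set)
    for block_id, records in blocks.items():
        for meter_id, _timestamp, _wh in records:
            index[meter_id].add(block_id)
    return index


def full_scan(blocks, meter_id):
    """Naive processing: read every block, filter afterwards."""
    total = 0
    blocks_read = 0
    for records in blocks.values():
        blocks_read += 1
        total += sum(wh for m, _t, wh in records if m == meter_id)
    return total, blocks_read


def indexed_scan(blocks, index, meter_id):
    """Index-assisted processing: read only blocks the index marks as relevant."""
    relevant = sorted(index.get(meter_id, ()))
    total = 0
    for block_id in relevant:
        total += sum(wh for m, _t, wh in blocks[block_id] if m == meter_id)
    return total, len(relevant)


if __name__ == "__main__":
    idx = build_index(BLOCKS)
    print(full_scan(BLOCKS, "meter-A"))
    print(indexed_scan(BLOCKS, idx, "meter-A"))
```

Both functions return the same total for a given meter; the difference is the number of blocks read. In this toy layout the saving is small, and the gains reported in the summary depend on the actual dataset and on the Spark and Drill engines, which the sketch does not model.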