Loading…

Enabling scalable scientific workflow management in the Cloud

Cloud computing is gaining tremendous momentum in both academia and industry. In this context, we define the term “Cloud Workflow” as the specification, execution and provenance tracking of large-scale scientific workflows, as well as the management of data and computing resources to support the exe...

Full description

Saved in:
Bibliographic Details
Published in:Future generation computer systems 2015-05, Vol.46, p.3-16
Main Authors: Zhao, Yong, Li, Youfu, Raicu, Ioan, Lu, Shiyong, Tian, Wenhong, Liu, Heng
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Cloud computing is gaining tremendous momentum in both academia and industry. In this context, we define the term “Cloud Workflow” as the specification, execution and provenance tracking of large-scale scientific workflows, as well as the management of data and computing resources to support the execution of large-scale scientific workflows in the Cloud. In this paper, we first analyze the gap between these two complementary technologies, and what it means to bring Clouds and workflows together. Then, we present the key challenges in supporting Cloud workflows, and present our reference framework for scientific workflow management in the Cloud. Last we present our experience in integrating a scientific workflow management system—Swift into the Cloud. We discuss the performance of cluster provisioning within the OpenNebula Cloud platform, the Eucalyptus Cloud platform and Amazon EC2, and we demonstrate the capability and efficiency of the integration using a NASA MODIS image processing workflow and the Montage image mosaic workflow. Note to practitioners. Scientific workflow management plays a very important role for scientific computing and application coordination, while Cloud computing offers scalability and resource on-demand. We devise autonomous methods to integrate scientific workflow management systems with Cloud platforms and also provision resources for large scale workflows, which can facilitate scientists to easily manage their workflows in the Cloud, and take advantage of large scale Cloud resources. There are a few integration options and many challenges in the process, and the experience we gain will help researchers in migrating their workflow management systems and workflow applications into the Cloud. •We analyze the major challenges of running scientific workflows on the Cloud.•We propose a reference framework to standardize the integration.•The implementation experience proves that the framework is feasible and extendible.•Cluster-recycling mechanism can improve the resource provisioning efficiency.
ISSN:0167-739X
1872-7115
DOI:10.1016/j.future.2014.10.023