Loading…
Enabling scalable scientific workflow management in the Cloud
Cloud computing is gaining tremendous momentum in both academia and industry. In this context, we define the term “Cloud Workflow” as the specification, execution and provenance tracking of large-scale scientific workflows, as well as the management of data and computing resources to support the exe...
Saved in:
Published in: | Future generation computer systems 2015-05, Vol.46, p.3-16 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Cloud computing is gaining tremendous momentum in both academia and industry. In this context, we define the term “Cloud Workflow” as the specification, execution and provenance tracking of large-scale scientific workflows, as well as the management of data and computing resources to support the execution of large-scale scientific workflows in the Cloud. In this paper, we first analyze the gap between these two complementary technologies, and what it means to bring Clouds and workflows together. Then, we present the key challenges in supporting Cloud workflows, and present our reference framework for scientific workflow management in the Cloud. Last we present our experience in integrating a scientific workflow management system—Swift into the Cloud. We discuss the performance of cluster provisioning within the OpenNebula Cloud platform, the Eucalyptus Cloud platform and Amazon EC2, and we demonstrate the capability and efficiency of the integration using a NASA MODIS image processing workflow and the Montage image mosaic workflow.
Note to practitioners. Scientific workflow management plays a very important role for scientific computing and application coordination, while Cloud computing offers scalability and resource on-demand. We devise autonomous methods to integrate scientific workflow management systems with Cloud platforms and also provision resources for large scale workflows, which can facilitate scientists to easily manage their workflows in the Cloud, and take advantage of large scale Cloud resources. There are a few integration options and many challenges in the process, and the experience we gain will help researchers in migrating their workflow management systems and workflow applications into the Cloud.
•We analyze the major challenges of running scientific workflows on the Cloud.•We propose a reference framework to standardize the integration.•The implementation experience proves that the framework is feasible and extendible.•Cluster-recycling mechanism can improve the resource provisioning efficiency. |
---|---|
ISSN: | 0167-739X 1872-7115 |
DOI: | 10.1016/j.future.2014.10.023 |