Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2

The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, a...

Bibliographic Details
Published in: IEEE transactions on parallel and distributed systems, 2016-09, Vol.27 (9), p.2574-2588
Main Authors: Marathe, Aniruddha; Harris, Rachel; Lowenthal, David K.; de Supinski, Bronis R.; Rountree, Barry; Schulz, Martin
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773
cites cdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773
container_end_page 2588
container_issue 9
container_start_page 2574
container_title IEEE transactions on parallel and distributed systems
container_volume 27
creator Marathe, Aniruddha
Harris, Rachel
Lowenthal, David K.
de Supinski, Bronis R.
Rountree, Barry
Schulz, Martin
description The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.
doi_str_mv 10.1109/TPDS.2015.2508457
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPDS_2015_2508457</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7355374</ieee_id><sourcerecordid>4143864951</sourcerecordid><originalsourceid>FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</originalsourceid><addsrcrecordid>eNpdkU9L5EAQxYOsoKt-APHSsJc9mLH_pNPJcYhxXRAUHc-h0qleWjLdMZ2I49FPvj07IrKnVxS_9yjqJckpowvGaHmxurt8WHDK5IJLWmRS7SWHTMoi5awQ3-JMM5mWnJUHyfcQnihlmaTZYfJevw69t5N1f8g9drPrwOkNAdeR5TD0VsNkvSMPGnpobW-nDTF-JJUPU1obg3qyL3hOVnaNaeVdmEawDjtSv6Ke_1m9Idd31de0QOJ6uYa3KHXFj5N9A33Akw89Sh6v6lV1nd7c_vpdLW9SLfJySous5GB0icARjCxNYXLFBVCBPOcq16KjXWE4tK0CRAEdFlJAqyHLRauUOEp-7nKH0T_PGKZmbYPGvgeHfg5NfJTMZck5j-iP_9AnP48uXhcpRqWSecEixXaUHn0II5pmGO0axk3DaLNtpdm20mxbaT5aiZ6zncci4ievhJRCZeIv-KGKCg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1810575681</pqid></control><display><type>article</type><title>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</title><source>IEEE Xplore (Online service)</source><creator>Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. ; Rountree, Barry ; Schulz, Martin</creator><creatorcontrib>Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. ; Rountree, Barry ; Schulz, Martin</creatorcontrib><description>The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. 
The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2015.2508457</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Adaptive algorithms ; Algorithms ; Checkpointing ; cloud computing ; Clouds ; Computational modeling ; Cost control ; Cost engineering ; cost optimization ; Fault tolerance ; Laboratories ; Marketing ; Markets ; Pricing ; Redundancy ; reliability ; Resource management ; resource provisioning ; Risk management ; Scalability</subject><ispartof>IEEE transactions on parallel and distributed systems, 2016-09, Vol.27 (9), p.2574-2588</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</citedby><cites>FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7355374$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Marathe, Aniruddha</creatorcontrib><creatorcontrib>Harris, Rachel</creatorcontrib><creatorcontrib>Lowenthal, David K.</creatorcontrib><creatorcontrib>de Supinski, Bronis R.</creatorcontrib><creatorcontrib>Rountree, Barry</creatorcontrib><creatorcontrib>Schulz, Martin</creatorcontrib><title>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. 
We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.</description><subject>Adaptive algorithms</subject><subject>Algorithms</subject><subject>Checkpointing</subject><subject>cloud computing</subject><subject>Clouds</subject><subject>Computational modeling</subject><subject>Cost control</subject><subject>Cost engineering</subject><subject>cost optimization</subject><subject>Fault tolerance</subject><subject>Laboratories</subject><subject>Marketing</subject><subject>Markets</subject><subject>Pricing</subject><subject>Redundancy</subject><subject>reliability</subject><subject>Resource management</subject><subject>resource provisioning</subject><subject>Risk 
management</subject><subject>Scalability</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNpdkU9L5EAQxYOsoKt-APHSsJc9mLH_pNPJcYhxXRAUHc-h0qleWjLdMZ2I49FPvj07IrKnVxS_9yjqJckpowvGaHmxurt8WHDK5IJLWmRS7SWHTMoi5awQ3-JMM5mWnJUHyfcQnihlmaTZYfJevw69t5N1f8g9drPrwOkNAdeR5TD0VsNkvSMPGnpobW-nDTF-JJUPU1obg3qyL3hOVnaNaeVdmEawDjtSv6Ke_1m9Idd31de0QOJ6uYa3KHXFj5N9A33Akw89Sh6v6lV1nd7c_vpdLW9SLfJySous5GB0icARjCxNYXLFBVCBPOcq16KjXWE4tK0CRAEdFlJAqyHLRauUOEp-7nKH0T_PGKZmbYPGvgeHfg5NfJTMZck5j-iP_9AnP48uXhcpRqWSecEixXaUHn0II5pmGO0axk3DaLNtpdm20mxbaT5aiZ6zncci4ievhJRCZeIv-KGKCg</recordid><startdate>20160901</startdate><enddate>20160901</enddate><creator>Marathe, Aniruddha</creator><creator>Harris, Rachel</creator><creator>Lowenthal, David K.</creator><creator>de Supinski, Bronis R.</creator><creator>Rountree, Barry</creator><creator>Schulz, Martin</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20160901</creationdate><title>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</title><author>Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. 
; Rountree, Barry ; Schulz, Martin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Adaptive algorithms</topic><topic>Algorithms</topic><topic>Checkpointing</topic><topic>cloud computing</topic><topic>Clouds</topic><topic>Computational modeling</topic><topic>Cost control</topic><topic>Cost engineering</topic><topic>cost optimization</topic><topic>Fault tolerance</topic><topic>Laboratories</topic><topic>Marketing</topic><topic>Markets</topic><topic>Pricing</topic><topic>Redundancy</topic><topic>reliability</topic><topic>Resource management</topic><topic>resource provisioning</topic><topic>Risk management</topic><topic>Scalability</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Marathe, Aniruddha</creatorcontrib><creatorcontrib>Harris, Rachel</creatorcontrib><creatorcontrib>Lowenthal, David K.</creatorcontrib><creatorcontrib>de Supinski, Bronis R.</creatorcontrib><creatorcontrib>Rountree, Barry</creatorcontrib><creatorcontrib>Schulz, Martin</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: 
Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Marathe, Aniruddha</au><au>Harris, Rachel</au><au>Lowenthal, David K.</au><au>de Supinski, Bronis R.</au><au>Rountree, Barry</au><au>Schulz, Martin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</atitle><jtitle>IEEE transactions on parallel and distributed systems</jtitle><stitle>TPDS</stitle><date>2016-09-01</date><risdate>2016</risdate><volume>27</volume><issue>9</issue><spage>2574</spage><epage>2588</epage><pages>2574-2588</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. 
We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2015.2508457</doi><tpages>15</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 1045-9219
ispartof IEEE transactions on parallel and distributed systems, 2016-09, Vol.27 (9), p.2574-2588
issn 1045-9219
1558-2183
language eng
recordid cdi_crossref_primary_10_1109_TPDS_2015_2508457
source IEEE Xplore (Online service)
subjects Adaptive algorithms
Algorithms
Checkpointing
cloud computing
Clouds
Computational modeling
Cost control
Cost engineering
cost optimization
Fault tolerance
Laboratories
Marketing
Markets
Pricing
Redundancy
reliability
Resource management
resource provisioning
Risk management
Scalability
title Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T17%3A35%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploiting%20Redundancy%20and%20Application%20Scalability%20for%20Cost-Effective,%20Time-Constrained%20Execution%20of%20HPC%20Applications%20on%20Amazon%20EC2&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Marathe,%20Aniruddha&rft.date=2016-09-01&rft.volume=27&rft.issue=9&rft.spage=2574&rft.epage=2588&rft.pages=2574-2588&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2015.2508457&rft_dat=%3Cproquest_cross%3E4143864951%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1810575681&rft_id=info:pmid/&rft_ieee_id=7355374&rfr_iscdi=true