Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2
The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, a...
Published in: | IEEE transactions on parallel and distributed systems 2016-09, Vol.27 (9), p.2574-2588 |
---|---|
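Main Authors: | Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. ; Rountree, Barry ; Schulz, Martin |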
Format: | Article |
Language: | English |
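Subjects: | Adaptive algorithms ; Algorithms ; Checkpointing ; cloud computing ; Clouds ; Computational modeling ; Cost control ; Cost engineering ; cost optimization ; Fault tolerance ; Laboratories ; Marketing ; Markets ; Pricing ; Redundancy ; reliability ; Resource management ; resource provisioning ; Risk management ; Scalability |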
cited_by | cdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773 |
---|---|
cites | cdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773 |
container_end_page | 2588 |
container_issue | 9 |
container_start_page | 2574 |
container_title | IEEE transactions on parallel and distributed systems |
container_volume | 27 |
creator | Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. ; Rountree, Barry ; Schulz, Martin |
description | The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale. |
doi_str_mv | 10.1109/TPDS.2015.2508457 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_TPDS_2015_2508457</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>7355374</ieee_id><sourcerecordid>4143864951</sourcerecordid><originalsourceid>FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</originalsourceid><addsrcrecordid>eNpdkU9L5EAQxYOsoKt-APHSsJc9mLH_pNPJcYhxXRAUHc-h0qleWjLdMZ2I49FPvj07IrKnVxS_9yjqJckpowvGaHmxurt8WHDK5IJLWmRS7SWHTMoi5awQ3-JMM5mWnJUHyfcQnihlmaTZYfJevw69t5N1f8g9drPrwOkNAdeR5TD0VsNkvSMPGnpobW-nDTF-JJUPU1obg3qyL3hOVnaNaeVdmEawDjtSv6Ke_1m9Idd31de0QOJ6uYa3KHXFj5N9A33Akw89Sh6v6lV1nd7c_vpdLW9SLfJySous5GB0icARjCxNYXLFBVCBPOcq16KjXWE4tK0CRAEdFlJAqyHLRauUOEp-7nKH0T_PGKZmbYPGvgeHfg5NfJTMZck5j-iP_9AnP48uXhcpRqWSecEixXaUHn0II5pmGO0axk3DaLNtpdm20mxbaT5aiZ6zncci4ievhJRCZeIv-KGKCg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1810575681</pqid></control><display><type>article</type><title>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</title><source>IEEE Xplore (Online service)</source><creator>Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. ; Rountree, Barry ; Schulz, Martin</creator><creatorcontrib>Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. ; Rountree, Barry ; Schulz, Martin</creatorcontrib><description>The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. 
The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.</description><identifier>ISSN: 1045-9219</identifier><identifier>EISSN: 1558-2183</identifier><identifier>DOI: 10.1109/TPDS.2015.2508457</identifier><identifier>CODEN: ITDSEO</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Adaptive algorithms ; Algorithms ; Checkpointing ; cloud computing ; Clouds ; Computational modeling ; Cost control ; Cost engineering ; cost optimization ; Fault tolerance ; Laboratories ; Marketing ; Markets ; Pricing ; Redundancy ; reliability ; Resource management ; resource provisioning ; Risk management ; Scalability</subject><ispartof>IEEE transactions on parallel and distributed systems, 2016-09, Vol.27 (9), p.2574-2588</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2016</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</citedby><cites>FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/7355374$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Marathe, Aniruddha</creatorcontrib><creatorcontrib>Harris, Rachel</creatorcontrib><creatorcontrib>Lowenthal, David K.</creatorcontrib><creatorcontrib>de Supinski, Bronis R.</creatorcontrib><creatorcontrib>Rountree, Barry</creatorcontrib><creatorcontrib>Schulz, Martin</creatorcontrib><title>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</title><title>IEEE transactions on parallel and distributed systems</title><addtitle>TPDS</addtitle><description>The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. 
We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.</description><subject>Adaptive algorithms</subject><subject>Algorithms</subject><subject>Checkpointing</subject><subject>cloud computing</subject><subject>Clouds</subject><subject>Computational modeling</subject><subject>Cost control</subject><subject>Cost engineering</subject><subject>cost optimization</subject><subject>Fault tolerance</subject><subject>Laboratories</subject><subject>Marketing</subject><subject>Markets</subject><subject>Pricing</subject><subject>Redundancy</subject><subject>reliability</subject><subject>Resource management</subject><subject>resource provisioning</subject><subject>Risk 
management</subject><subject>Scalability</subject><issn>1045-9219</issn><issn>1558-2183</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2016</creationdate><recordtype>article</recordtype><recordid>eNpdkU9L5EAQxYOsoKt-APHSsJc9mLH_pNPJcYhxXRAUHc-h0qleWjLdMZ2I49FPvj07IrKnVxS_9yjqJckpowvGaHmxurt8WHDK5IJLWmRS7SWHTMoi5awQ3-JMM5mWnJUHyfcQnihlmaTZYfJevw69t5N1f8g9drPrwOkNAdeR5TD0VsNkvSMPGnpobW-nDTF-JJUPU1obg3qyL3hOVnaNaeVdmEawDjtSv6Ke_1m9Idd31de0QOJ6uYa3KHXFj5N9A33Akw89Sh6v6lV1nd7c_vpdLW9SLfJySous5GB0icARjCxNYXLFBVCBPOcq16KjXWE4tK0CRAEdFlJAqyHLRauUOEp-7nKH0T_PGKZmbYPGvgeHfg5NfJTMZck5j-iP_9AnP48uXhcpRqWSecEixXaUHn0II5pmGO0axk3DaLNtpdm20mxbaT5aiZ6zncci4ievhJRCZeIv-KGKCg</recordid><startdate>20160901</startdate><enddate>20160901</enddate><creator>Marathe, Aniruddha</creator><creator>Harris, Rachel</creator><creator>Lowenthal, David K.</creator><creator>de Supinski, Bronis R.</creator><creator>Rountree, Barry</creator><creator>Schulz, Martin</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20160901</creationdate><title>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</title><author>Marathe, Aniruddha ; Harris, Rachel ; Lowenthal, David K. ; de Supinski, Bronis R. 
; Rountree, Barry ; Schulz, Martin</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2016</creationdate><topic>Adaptive algorithms</topic><topic>Algorithms</topic><topic>Checkpointing</topic><topic>cloud computing</topic><topic>Clouds</topic><topic>Computational modeling</topic><topic>Cost control</topic><topic>Cost engineering</topic><topic>cost optimization</topic><topic>Fault tolerance</topic><topic>Laboratories</topic><topic>Marketing</topic><topic>Markets</topic><topic>Pricing</topic><topic>Redundancy</topic><topic>reliability</topic><topic>Resource management</topic><topic>resource provisioning</topic><topic>Risk management</topic><topic>Scalability</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Marathe, Aniruddha</creatorcontrib><creatorcontrib>Harris, Rachel</creatorcontrib><creatorcontrib>Lowenthal, David K.</creatorcontrib><creatorcontrib>de Supinski, Bronis R.</creatorcontrib><creatorcontrib>Rountree, Barry</creatorcontrib><creatorcontrib>Schulz, Martin</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Electronic Library (IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ANTE: Abstracts 
in New Technology & Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on parallel and distributed systems</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Marathe, Aniruddha</au><au>Harris, Rachel</au><au>Lowenthal, David K.</au><au>de Supinski, Bronis R.</au><au>Rountree, Barry</au><au>Schulz, Martin</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2</atitle><jtitle>IEEE transactions on parallel and distributed systems</jtitle><stitle>TPDS</stitle><date>2016-09-01</date><risdate>2016</risdate><volume>27</volume><issue>9</issue><spage>2574</spage><epage>2588</epage><pages>2574-2588</pages><issn>1045-9219</issn><eissn>1558-2183</eissn><coden>ITDSEO</coden><abstract>The use of clouds to execute high-performance computing (HPC) applications has greatly increased recently. Clouds provide several potential advantages over traditional supercomputers and in-house clusters. The most popular cloud is currently Amazon EC2, which provides fixed-cost and variable-cost, auction-based options. The auction market trades lower cost for potential interruptions that necessitate checkpointing; if the market price exceeds the bid price, a node is taken away from the user without warning. We explore techniques to maximize performance per dollar given a time constraint within which an application must complete. Specifically, we design and implement multiple techniques to reduce expected cost by exploiting redundancy in the EC2 auction market. We then design an adaptive algorithm that selects a scheduling algorithm and determines the bid price. 
We show that our adaptive algorithm executes programs up to seven times cheaper than using the on-demand market and up to 44 percent cheaper than the best non-redundant, auction-market algorithm. We extend our adaptive algorithm to incorporate application scalability characteristics for further cost savings. We show that the adaptive algorithm informed with scalability characteristics of applications achieves up to 56 percent cost savings compared to the expected cost for the base adaptive algorithm run at a fixed, user-defined scale.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TPDS.2015.2508457</doi><tpages>15</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 1045-9219 |
ispartof | IEEE transactions on parallel and distributed systems, 2016-09, Vol.27 (9), p.2574-2588 |
issn | 1045-9219 1558-2183 |
language | eng |
recordid | cdi_crossref_primary_10_1109_TPDS_2015_2508457 |
source | IEEE Xplore (Online service) |
subjects | Adaptive algorithms ; Algorithms ; Checkpointing ; cloud computing ; Clouds ; Computational modeling ; Cost control ; Cost engineering ; cost optimization ; Fault tolerance ; Laboratories ; Marketing ; Markets ; Pricing ; Redundancy ; reliability ; Resource management ; resource provisioning ; Risk management ; Scalability |
title | Exploiting Redundancy and Application Scalability for Cost-Effective, Time-Constrained Execution of HPC Applications on Amazon EC2 |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-25T17%3A35%3A24IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Exploiting%20Redundancy%20and%20Application%20Scalability%20for%20Cost-Effective,%20Time-Constrained%20Execution%20of%20HPC%20Applications%20on%20Amazon%20EC2&rft.jtitle=IEEE%20transactions%20on%20parallel%20and%20distributed%20systems&rft.au=Marathe,%20Aniruddha&rft.date=2016-09-01&rft.volume=27&rft.issue=9&rft.spage=2574&rft.epage=2588&rft.pages=2574-2588&rft.issn=1045-9219&rft.eissn=1558-2183&rft.coden=ITDSEO&rft_id=info:doi/10.1109/TPDS.2015.2508457&rft_dat=%3Cproquest_cross%3E4143864951%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c369t-8492afc9ea2eaf59f8f6723a03e26276c3d0d8f2abb7aee3ade853abca463b773%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1810575681&rft_id=info:pmid/&rft_ieee_id=7355374&rfr_iscdi=true |