A variational calculus approach to optimal checkpoint placement
Checkpointing is an effective fault-tolerance technique for improving system availability and reliability. However, blind checkpoint placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links...
Published in: | IEEE transactions on computers 2001-07, Vol.50 (7), p.699-708 |
---|---|
Main Authors: | Ling, Yibei ; Mi, Jie ; Lin, Xiaola |
Format: | Article |
Language: | English |
Subjects: | |
cited_by | cdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503 |
---|---|
cites | cdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503 |
container_end_page | 708 |
container_issue | 7 |
container_start_page | 699 |
container_title | IEEE transactions on computers |
container_volume | 50 |
creator | Ling, Yibei ; Mi, Jie ; Lin, Xiaola |
description | Checkpointing is an effective fault-tolerance technique for improving system availability and reliability. However, blind checkpoint placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links the optimal checkpointing frequency with a general failure rate, with the objective of globally minimizing the total expected cost of checkpointing and recovery. The theoretical result shows that the optimal checkpointing frequency is proportional to the square root of the failure rate and can be uniquely determined by the failure rate (time-varying or constant) if the recovery function is strictly increasing and the failure rate satisfies λ(∞) > 0. J.L. Bruno and E.G. Coffman (1997) suggest that optimal checkpointing is by its nature a function of the system failure rate, i.e., a time-varying failure rate demands time-varying checkpointing in order to meet a given optimality criterion. The results obtained in this paper agree with their viewpoint. |
doi_str_mv | 10.1109/12.936236 |
format | article |
coden | ITCOB4 |
collection | IEEE All-Society Periodicals Package (ASPP) 1998–Present ; IEEE Xplore ; CrossRef ; Computer and Information Systems Abstracts ; Electronics & Communications Abstracts ; Technology Research Database ; ProQuest Computer Science Collection ; Advanced Technologies Database with Aerospace ; Computer and Information Systems Abstracts Academic ; Computer and Information Systems Abstracts Professional ; ANTE: Abstracts in New Technology & Engineering ; Engineering Research Database |
creationdate | 2001-07-01 |
ieee_id | 936236 |
oa | free_for_read |
pqid | 884828353 |
publisher | New York: IEEE |
rights | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2001 |
sourcerecordid | 907959200 |
toplevel | peer_reviewed ; online_resources |
tpages | 10 |
fulltext | fulltext |
identifier | ISSN: 0018-9340 |
ispartof | IEEE transactions on computers, 2001-07, Vol.50 (7), p.699-708 |
issn | 0018-9340 1557-9956 |
language | eng |
recordid | cdi_proquest_journals_884828353 |
source | IEEE Electronic Library (IEL) Journals |
subjects | Availability ; Calculus ; Calculus of variations ; Checkpointing ; Cost function ; Degradation ; Failure rates ; Fault tolerance ; Fault tolerant systems ; Frequency ; Mathematical analysis ; Mathematical model ; Mathematical models ; Optimization ; Recovery ; Studies ; Time varying systems |
title | A variational calculus approach to optimal checkpoint placement |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T21%3A53%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20variational%20calculus%20approach%20to%20optimal%20checkpoint%20placement&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Ling,%20Yibei&rft.date=2001-07-01&rft.volume=50&rft.issue=7&rft.spage=699&rft.epage=708&rft.pages=699-708&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/12.936236&rft_dat=%3Cproquest_cross%3E907959200%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=884828353&rft_id=info:pmid/&rft_ieee_id=936236&rfr_iscdi=true |
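
The abstract's statement that the optimal checkpointing frequency is proportional to the square root of the failure rate can be illustrated with a small sketch. This is not the paper's variational derivation, which this record does not reproduce. It assumes instead the familiar first-order model for a constant failure rate, in which the overhead per unit time is roughly `checkpoint_cost / interval + failure_rate * interval / 2` (on average, half an interval of work is redone after a failure). The function names, the cost model, and the sample numbers are all hypothetical. Applying the constant-rate formula pointwise to a time-varying rate, as `frequency_schedule` does, is only a heuristic reading of the time-varying claim.

```python
import math


def overhead_rate(interval, failure_rate, checkpoint_cost):
    """Approximate overhead per unit time (assumed first-order model).

    checkpoint_cost / interval is the time spent writing checkpoints;
    failure_rate * interval / 2 is the expected rework after failures.
    """
    return checkpoint_cost / interval + failure_rate * interval / 2.0


def optimal_frequency(failure_rate, checkpoint_cost):
    """Checkpoints per unit time that minimize overhead_rate.

    Setting the derivative of overhead_rate to zero gives
    interval = sqrt(2 * checkpoint_cost / failure_rate), so the
    frequency scales with the square root of the failure rate.
    """
    return math.sqrt(failure_rate / (2.0 * checkpoint_cost))


def frequency_schedule(failure_rate_fn, checkpoint_cost, times):
    """Heuristic: apply the constant-rate formula at each sample time."""
    return [optimal_frequency(failure_rate_fn(t), checkpoint_cost) for t in times]


if __name__ == "__main__":
    cost = 5.0  # hypothetical checkpoint cost, in the same time unit as 1 / rate
    for lam in (0.001, 0.004):  # hypothetical constant failure rates
        freq = optimal_frequency(lam, cost)
        print(f"lambda={lam}: frequency={freq:.4f}, interval={1.0 / freq:.1f}")
```

Under this assumed model, quadrupling the failure rate doubles the checkpointing frequency, which is the square-root scaling the abstract describes.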