
A variational calculus approach to optimal checkpoint placement

Checkpointing is an effective fault-tolerant technique for improving system availability and reliability. However, blind checkpoint placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links...

Bibliographic Details
Published in: IEEE transactions on computers 2001-07, Vol.50 (7), p.699-708
Main Authors: Ling, Yibei; Mi, Jie; Lin, Xiaola
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503
cites cdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503
container_end_page 708
container_issue 7
container_start_page 699
container_title IEEE transactions on computers
container_volume 50
creator Ling, Yibei
Mi, Jie
Lin, Xiaola
description Checkpointing is an effective fault-tolerant technique for improving system availability and reliability. However, blind checkpoint placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links the optimal checkpointing frequency with a general failure rate, with the objective of globally minimizing the total expected cost of checkpointing and recovery. The theoretical result shows that the optimal checkpointing frequency is proportional to the square root of the failure rate and can be uniquely determined by the failure rate (time-varying or constant) if the recovery function is strictly increasing and the failure rate satisfies λ(∞) > 0. J.L. Bruno and E.G. Coffman (1997) suggest that optimal checkpointing is by its nature a function of the system failure rate, i.e., a time-varying failure rate demands time-varying checkpointing in order to meet a given optimality criterion. The results obtained in this paper agree with their viewpoint.
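The square-root relationship named in the description can be illustrated with a small sketch. The record does not reproduce the paper's explicit formula, so the code below does NOT implement it. Instead it assumes Young's classical first-order approximation for a constant failure rate, interval ≈ sqrt(2C/λ), and applies it pointwise to a time-varying rate purely as a heuristic. The function names, the checkpoint cost C, and the example rate function are all hypothetical.

```python
import math


def young_interval(checkpoint_cost, failure_rate):
    """Checkpoint interval from Young's first-order approximation.

    Assumption: constant failure rate and a fixed cost per checkpoint.
    This is a stand-in, not the formula derived in the paper.
    """
    if checkpoint_cost <= 0 or failure_rate <= 0:
        raise ValueError("checkpoint_cost and failure_rate must be positive")
    return math.sqrt(2.0 * checkpoint_cost / failure_rate)


def checkpoint_frequency(checkpoint_cost, failure_rate):
    """Checkpoints per unit time; grows like sqrt(failure_rate)."""
    return 1.0 / young_interval(checkpoint_cost, failure_rate)


def frequency_schedule(checkpoint_cost, rate_fn, times):
    """Heuristic: evaluate the constant-rate rule at each time point.

    rate_fn is a hypothetical time-varying failure rate lambda(t).
    """
    return [checkpoint_frequency(checkpoint_cost, rate_fn(t)) for t in times]


if __name__ == "__main__":
    # Hypothetical failure rate that decays toward a positive limit,
    # so that lambda(infinity) > 0 as the description requires.
    def rate(t):
        return 0.01 + 0.03 * math.exp(-t / 100.0)

    for t, f in zip([0, 100, 1000], frequency_schedule(2.0, rate, [0, 100, 1000])):
        print(f"t={t:5d}  frequency={f:.4f}")
```

Under this assumed rule, quadrupling the failure rate doubles the checkpointing frequency, which is the square-root scaling the description reports; the proportionality constant in the paper may differ.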
doi_str_mv 10.1109/12.936236
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_884828353</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>936236</ieee_id><sourcerecordid>907959200</sourcerecordid><originalsourceid>FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503</originalsourceid><addsrcrecordid>eNqF0T1PwzAQBmALgUQpDKxMEQOIIeX8FdsTqiq-pEosMFuO66gpbhzsBIl_T6pUDAww3fA-Op3uRegcwwxjULeYzBQtCC0O0ARzLnKleHGIJgBY5ooyOEYnKW0AoCCgJuhunn2aWJuuDo3xmTXe9r5PmWnbGIxdZ13IQtvV2124dva9DXXTZa031m1d052io8r45M72c4reHu5fF0_58uXxeTFf5pYB6XLKoRBUlEoSZhwGIK7iFSjLS2DYUEyJVCWvCraSgilCpTRFaUvgbmUVBzpF1-Pe4ayP3qVOb-tknfemcaFPWoFQXBHYyas_JZGCAhXkfyiASs7wAC9_wU3o4_CupKVkkkjK6YBuRmRjSCm6Srdx-Fr80hj0rhqNiR6rGezFaGvn3I_bh98T84ZK</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>884828353</pqid></control><display><type>article</type><title>A variational calculus approach to optimal checkpoint placement</title><source>IEEE Electronic Library (IEL) Journals</source><creator>Ling, Yibei ; Mi, Jie ; Lin, Xiaola</creator><creatorcontrib>Ling, Yibei ; Mi, Jie ; Lin, Xiaola</creatorcontrib><description>Checkpointing is an effective fault-tolerant technique for improving system availability and reliability. However, a blind checkpointing placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links the optimal checkpointing frequency with a general failure rate, with the objective of globally minimizing the total expected cost of checkpointing and recovery. 
Theoretical result shows that the optimal checkpointing frequency is proportional to the square root of the failure rate and can be uniquely determined by the failure rate (time-varying or constant) if the recovery function is strictly increasing and the failure rate is /spl lambda/(/spl infin/)&gt;0. J.L. Bruno and E.G. Coffman (1997) suggest that optimal checkpointing by its nature is a function of system failure rate, i.e., the time-varying failure rate demands time-varying checkpointing in order to meet the criteria of certain optimality. The results obtained in this paper agree with their viewpoint.</description><identifier>ISSN: 0018-9340</identifier><identifier>EISSN: 1557-9956</identifier><identifier>DOI: 10.1109/12.936236</identifier><identifier>CODEN: ITCOB4</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Availability ; Calculus ; Calculus of variations ; Checkpointing ; Cost function ; Degradation ; Failure rates ; Fault tolerance ; Fault tolerant systems ; Frequency ; Mathematical analysis ; Mathematical model ; Mathematical models ; Optimization ; Recovery ; Studies ; Time varying systems</subject><ispartof>IEEE transactions on computers, 2001-07, Vol.50 (7), p.699-708</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2001</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503</citedby><cites>FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/936236$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Ling, Yibei</creatorcontrib><creatorcontrib>Mi, Jie</creatorcontrib><creatorcontrib>Lin, Xiaola</creatorcontrib><title>A variational calculus approach to optimal checkpoint placement</title><title>IEEE transactions on computers</title><addtitle>TC</addtitle><description>Checkpointing is an effective fault-tolerant technique for improving system availability and reliability. However, a blind checkpointing placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links the optimal checkpointing frequency with a general failure rate, with the objective of globally minimizing the total expected cost of checkpointing and recovery. Theoretical result shows that the optimal checkpointing frequency is proportional to the square root of the failure rate and can be uniquely determined by the failure rate (time-varying or constant) if the recovery function is strictly increasing and the failure rate is /spl lambda/(/spl infin/)&gt;0. J.L. Bruno and E.G. Coffman (1997) suggest that optimal checkpointing by its nature is a function of system failure rate, i.e., the time-varying failure rate demands time-varying checkpointing in order to meet the criteria of certain optimality. 
The results obtained in this paper agree with their viewpoint.</description><subject>Availability</subject><subject>Calculus</subject><subject>Calculus of variations</subject><subject>Checkpointing</subject><subject>Cost function</subject><subject>Degradation</subject><subject>Failure rates</subject><subject>Fault tolerance</subject><subject>Fault tolerant systems</subject><subject>Frequency</subject><subject>Mathematical analysis</subject><subject>Mathematical model</subject><subject>Mathematical models</subject><subject>Optimization</subject><subject>Recovery</subject><subject>Studies</subject><subject>Time varying systems</subject><issn>0018-9340</issn><issn>1557-9956</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2001</creationdate><recordtype>article</recordtype><recordid>eNqF0T1PwzAQBmALgUQpDKxMEQOIIeX8FdsTqiq-pEosMFuO66gpbhzsBIl_T6pUDAww3fA-Op3uRegcwwxjULeYzBQtCC0O0ARzLnKleHGIJgBY5ooyOEYnKW0AoCCgJuhunn2aWJuuDo3xmTXe9r5PmWnbGIxdZ13IQtvV2124dva9DXXTZa031m1d052io8r45M72c4reHu5fF0_58uXxeTFf5pYB6XLKoRBUlEoSZhwGIK7iFSjLS2DYUEyJVCWvCraSgilCpTRFaUvgbmUVBzpF1-Pe4ayP3qVOb-tknfemcaFPWoFQXBHYyas_JZGCAhXkfyiASs7wAC9_wU3o4_CupKVkkkjK6YBuRmRjSCm6Srdx-Fr80hj0rhqNiR6rGezFaGvn3I_bh98T84ZK</recordid><startdate>20010701</startdate><enddate>20010701</enddate><creator>Ling, Yibei</creator><creator>Mi, Jie</creator><creator>Lin, Xiaola</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE)</general><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>F28</scope><scope>FR3</scope></search><sort><creationdate>20010701</creationdate><title>A variational calculus approach to optimal checkpoint placement</title><author>Ling, Yibei ; Mi, Jie ; Lin, Xiaola</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2001</creationdate><topic>Availability</topic><topic>Calculus</topic><topic>Calculus of variations</topic><topic>Checkpointing</topic><topic>Cost function</topic><topic>Degradation</topic><topic>Failure rates</topic><topic>Fault tolerance</topic><topic>Fault tolerant systems</topic><topic>Frequency</topic><topic>Mathematical analysis</topic><topic>Mathematical model</topic><topic>Mathematical models</topic><topic>Optimization</topic><topic>Recovery</topic><topic>Studies</topic><topic>Time varying systems</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Ling, Yibei</creatorcontrib><creatorcontrib>Mi, Jie</creatorcontrib><creatorcontrib>Lin, Xiaola</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE Xplore</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics &amp; Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems 
Abstracts Professional</collection><collection>ANTE: Abstracts in New Technology &amp; Engineering</collection><collection>Engineering Research Database</collection><jtitle>IEEE transactions on computers</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Ling, Yibei</au><au>Mi, Jie</au><au>Lin, Xiaola</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A variational calculus approach to optimal checkpoint placement</atitle><jtitle>IEEE transactions on computers</jtitle><stitle>TC</stitle><date>2001-07-01</date><risdate>2001</risdate><volume>50</volume><issue>7</issue><spage>699</spage><epage>708</epage><pages>699-708</pages><issn>0018-9340</issn><eissn>1557-9956</eissn><coden>ITCOB4</coden><abstract>Checkpointing is an effective fault-tolerant technique for improving system availability and reliability. However, a blind checkpointing placement can result in either performance degradation or expensive recovery cost. By means of the calculus of variations, we derive an explicit formula that links the optimal checkpointing frequency with a general failure rate, with the objective of globally minimizing the total expected cost of checkpointing and recovery. Theoretical result shows that the optimal checkpointing frequency is proportional to the square root of the failure rate and can be uniquely determined by the failure rate (time-varying or constant) if the recovery function is strictly increasing and the failure rate is /spl lambda/(/spl infin/)&gt;0. J.L. Bruno and E.G. Coffman (1997) suggest that optimal checkpointing by its nature is a function of system failure rate, i.e., the time-varying failure rate demands time-varying checkpointing in order to meet the criteria of certain optimality. 
The results obtained in this paper agree with their viewpoint.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/12.936236</doi><tpages>10</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0018-9340
ispartof IEEE transactions on computers, 2001-07, Vol.50 (7), p.699-708
issn 0018-9340
1557-9956
language eng
recordid cdi_proquest_journals_884828353
source IEEE Electronic Library (IEL) Journals
subjects Availability
Calculus
Calculus of variations
Checkpointing
Cost function
Degradation
Failure rates
Fault tolerance
Fault tolerant systems
Frequency
Mathematical analysis
Mathematical model
Mathematical models
Optimization
Recovery
Studies
Time varying systems
title A variational calculus approach to optimal checkpoint placement
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T21%3A53%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20variational%20calculus%20approach%20to%20optimal%20checkpoint%20placement&rft.jtitle=IEEE%20transactions%20on%20computers&rft.au=Ling,%20Yibei&rft.date=2001-07-01&rft.volume=50&rft.issue=7&rft.spage=699&rft.epage=708&rft.pages=699-708&rft.issn=0018-9340&rft.eissn=1557-9956&rft.coden=ITCOB4&rft_id=info:doi/10.1109/12.936236&rft_dat=%3Cproquest_cross%3E907959200%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c402t-3506737b9824ae1002ef5f09c5b041a313289b5f64d87492388a6bcb05edc9503%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=884828353&rft_id=info:pmid/&rft_ieee_id=936236&rfr_iscdi=true