Loading…
On implementing omega in systems with weak reliability and synchrony assumptions
We study the feasibility and cost of implementing Ω —a fundamental failure detector at the core of many algorithms—in systems with weak reliability and synchrony assumptions. Intuitively, Ω allows processes to eventually elect a common leader. We first give an algorithm that implements Ω in a weak s...
Saved in:
Published in: | Distributed computing 2008-10, Vol.21 (4), p.285-314 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | We study the feasibility and cost of implementing
Ω
—a fundamental failure detector at the core of many algorithms—in systems with weak reliability and synchrony assumptions. Intuitively,
Ω
allows processes to eventually elect a common leader. We first give an algorithm that implements
Ω
in a weak system
S
where (a) except for some unknown timely process
s
, all processes may be arbitrarily slow or may crash, and (b) only the output links of
s
are eventually timely (all other links can be arbitrarily slow and lossy). Previously known algorithms for
Ω
worked only in systems that are strictly stronger than
S
in terms of reliability or synchrony assumptions.We next show that algorithms that implement
Ω
in system
S
are necessarily expensive in terms of communication complexity: all correct processes (except possibly one) must send messages forever; moreover, a quad-ratic number of links must carry messages forever. This result holds even for algorithms that tolerate at most one crash. Finally, we show that with a small additional assumption to system
S
—the existence of some unknown correct process whose links can be arbitrarily slow and lossy but fair—there is a communication-efficient algorithm for
Ω
such that eventually only
one
process (the elected leader) sends messages. Some recent experimental results indicate that two of the algorithms for
Ω
described in this paper can be used in dynamically-changing systems and work well in practice [Schiper, Toueg in Proceedings of the 38th International Conference on Dependable Systems and Networks, pp. 207–216 (2008)]. |
---|---|
ISSN: | 0178-2770 1432-0452 |
DOI: | 10.1007/s00446-008-0068-y |