Loading…

Decentralized Task Reallocation on Parallel Computing Architectures Targeting an Avionics Application

This work presents an online decentralized allocation algorithm of a safety-critical application on parallel computing architectures, where individual Computational Units can be affected by faults. The described method includes representing the architecture by an abstract graph where each node repre...

Full description

Saved in:
Bibliographic Details
Published in:Journal of optimization theory and applications 2021-12, Vol.191 (2-3), p.874-898
Main Authors: Khamvilai, Thanakorn, Sutter, Louis, Baufreton, Philippe, Neumann, François, Feron, Eric
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:This work presents an online decentralized allocation algorithm of a safety-critical application on parallel computing architectures, where individual Computational Units can be affected by faults. The described method includes representing the architecture by an abstract graph where each node represents a Computational Unit. Applications are also represented by the graph of Computational Units they require for execution. The problem is then to decide how to allocate Computational Units to applications to guarantee execution of a safety-critical application. The problem is formulated as an optimization problem with the form of an Integer Linear Program. A state-of-the-art solver is then used to solve the problem. Decentralizing the allocation process is achieved through redundancy of the allocator executed on the architecture. No centralized element decides on the allocation of the entire architecture, thus improving the reliability of the system. Inspired by multi-core architectures in avionics systems, an experimental illustration of the work is also presented. It is used to demonstrate the capabilities of the proposed allocation process to maintain the operation of a physical system in a decentralized way while individual components fail.
ISSN:0022-3239
1573-2878
DOI:10.1007/s10957-021-01862-7