Improving System-Level Lifetime Reliability of Multicore Soft Real-Time Systems
Published in: | IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2017-06, Vol. 25 (6), p. 1895-1905 |
---|---|
Format: | Article |
Language: | English |
Summary: | This paper studies the problem of maximizing multicore system lifetime reliability, an important design consideration for many real-time embedded systems. Existing work has investigated the problem, but has neglected important failure mechanisms. Furthermore, most existing algorithms are too slow for online use, and thus cannot address runtime workload and environment variations. This paper presents an online framework that maximizes system lifetime reliability through reliability-aware utilization control. It focuses on homogeneous multicore soft real-time systems. It selectively employs a comprehensive reliability estimation tool to deal with a variety of failure mechanisms at the system level. A model-predictive controller adjusts utilization by manipulating core frequencies, thereby reducing temperature, and an online heuristic adjusts the controller sampling window length to decrease the reliability effects of thermal cycling. Experiments with a real quad-core ARM processor and a simulator demonstrate that the proposed approach improves system mean time to failure by 50% on average and 141% in the best case, compared with existing techniques. |
ISSN: | 1063-8210, 1557-9999 |
DOI: | 10.1109/TVLSI.2017.2669144 |