Loading…

Noninvasive Fault Classification, Robustness and Recovery Time Measurement in Microprocessor-Type Architectures Subjected to Radiation-Induced Errors

In critical digital designs such as aerospace or safety equipment, radiation-induced upset events (single-event effects or SEEs) can produce adverse effects, and therefore, the ability to compare the sensitivity of various proposed solutions is desirable. As custom-hardened microprocessor solutions...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on instrumentation and measurement 2009-05, Vol.58 (5), p.1514-1524
Main Authors: Guzman-Miranda, H., Aguirre, M.A., Tombs, J.
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In critical digital designs such as aerospace or safety equipment, radiation-induced upset events (single-event effects or SEEs) can produce adverse effects, and therefore, the ability to compare the sensitivity of various proposed solutions is desirable. As custom-hardened microprocessor solutions can be very costly, the reliability of various commercial off-the-shelf (COTS) processors can be evaluated to see if there is a commercially available microprocessor or microprocessor-type intellectual property (IP) with adequate robustness for the specific application. Most existing approaches for the measurement of this robustness of the microprocessor involve diverting the program flow and timing to introduce the bit flips via interrupts and embedded handlers added to the application program. In this paper, a tool based on an emulation platform using Xilinx field programmable gate arrays (FPGAs) is described, which provides an environment and methodology for the evaluation of the sensitivity of microprocessor architectures, using dynamic runtime fault injection. A case study is presented, where the robustness of MicroBlaze and Leon3 microprocessors executing a simple signal processing task written in C language is evaluated and compared. A hardened version of the program, where the key variables are protected, has also been tested, and its contributions to system robustness have also been evaluated. In addition, this paper presents a further improvement in the developed tool that allows not only the measurement of microprocessor robustness but, in addition, the study and classification of single-event upset (SEU) effects and the exact measurement of the recovery time (the time that the microprocessor takes to self repair and recover the fault-free state). The measurement of this recovery time is important for real-time critical applications, where criticality depends on both data correctness and timing. To demonstrate the proposed improvements, a new software program that implements two different software hardening techniques (one for Data and another for Control Flow) has been made, and a study of the recovery times in some significant fault-injection cases has been performed over the Leon3 processor.
ISSN:0018-9456
1557-9662
DOI:10.1109/TIM.2009.2014603