A Chaos Engineering System for Live Analysis and Falsification of Exception-Handling in the JVM

Bibliographic Details
Published in: IEEE Transactions on Software Engineering, 2021-11, Vol. 47 (11), p. 2534-2548
Main Authors: Zhang, Long, Morin, Brice, Haller, Philipp, Baudry, Benoit, Monperrus, Martin
Format: Article
Language: English
Description
Summary: Software systems contain resilience code to handle those failures and unexpected events happening in production. It is essential for developers to understand and assess the resilience of their systems. Chaos engineering is a technology that aims at assessing resilience and uncovering weaknesses by actively injecting perturbations in production. In this paper, we propose a novel design and implementation of a chaos engineering system in Java called ChaosMachine. It provides a unique and actionable analysis on exception-handling capabilities in production, at the level of try-catch blocks. To evaluate our approach, we have deployed ChaosMachine on top of 3 large-scale and well-known Java applications totaling 630k lines of code. Our results show that ChaosMachine reveals both strengths and weaknesses of the resilience code of a software system at the level of exception handling.
ISSN: 0098-5589
1939-3520
DOI: 10.1109/TSE.2019.2954871