Loading…
CauseInfer: Automatic and distributed performance diagnosis with hierarchical causality graph in large distributed systems
Modern applications especially cloud-based or cloud-centric applications always have many components running in the large distributed environment with complex interactions. They are vulnerable to suffer from performance or availability problems due to the highly dynamic runtime environment such as r...
Saved in:
Main Authors: | , , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Modern applications especially cloud-based or cloud-centric applications always have many components running in the large distributed environment with complex interactions. They are vulnerable to suffer from performance or availability problems due to the highly dynamic runtime environment such as resource hogs, configuration changes and software bugs. In order to make efficient software maintenance and provide some hints to software bugs, we build a system named CauseInfer, a low cost and blackbox cause inference system without instrumenting the application source code. CauseInfer can automatically construct a two layered hierarchical causality graph and infer the causes of performance problems along the causal paths in the graph with a series of statistical methods. According to the experimental evaluation in the controlled environment, we find out CauseInfer can achieve an average 80% precision and 85% recall in a list of top two causes to identify the root causes, higher than several state-of-the-art methods and a good scalability to scale up in the distributed systems. |
---|---|
ISSN: | 0743-166X 2641-9874 |
DOI: | 10.1109/INFOCOM.2014.6848128 |