A Scalable, Non-Parametric Method for Detecting Performance Anomaly in Large Scale Computing

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems, 2016-07, Vol. 27 (7), p. 1902-1914
Main Authors: Yu, Li; Lan, Zhiling
Format: Article
Language: English
Description
Summary: As computer systems continue to grow in scale and complexity, performance problems become common and a major concern for large-scale computing. Performance anomalies caused by application bugs, hardware or software faults, or resource contention can have great impact on system-wide performance and could lead to significant economic losses for service providers. While many detection methods have been presented in the past, the newly emerging challenges are detection scalability and practical use. In this paper, we propose a scalable, non-parametric method for effectively detecting performance anomalies in large-scale systems. The design is generic for anomaly detection in a variety of parallel and distributed systems exhibiting peer-comparable property. It adopts a divide-and-conquer approach to address the scalability challenge and explores the use of non-parametric clustering and two-phase majority voting to improve detection flexibility and accuracy. We derive probabilistic models to quantitatively evaluate our decentralized design. Experiments with a suite of applications on production systems demonstrate that this method outperforms existing methods in terms of detection accuracy with a negligible runtime overhead.
ISSN: 1045-9219, 1558-2183
DOI: 10.1109/TPDS.2015.2475741
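
The summary names the ingredients of the method (divide-and-conquer grouping, non-parametric clustering, two-phase majority voting) but does not give the algorithm itself. The Python sketch below is only a rough illustration of how peer comparison with two rounds of voting might look; it is not the authors' method. Everything in it is an assumption made for illustration: the function names `flag_group` and `detect`, the parameters `k`, `rel_floor`, `group_size`, and `rounds`, the use of random regrouping, and the median-based outlier rule, which stands in for the paper's clustering step.

```python
import random
from statistics import median


def flag_group(metrics, k=3.0, rel_floor=0.01):
    """Phase 1 (assumed): within one small group of peers, the median
    represents the majority behaviour; members far from it are flagged.
    This median/MAD rule is a stand-in for non-parametric clustering."""
    if len(metrics) < 3:
        return set()  # too few peers to define a majority, so abstain
    center = median(metrics.values())
    mad = median(abs(v - center) for v in metrics.values())
    # Floor the spread so near-identical peers do not flag tiny jitter.
    scale = max(mad, rel_floor * abs(center))
    return {n for n, v in metrics.items() if abs(v - center) > k * scale}


def detect(metrics, group_size=4, rounds=5, seed=0):
    """Phase 2 (assumed): regroup the nodes several times and report a
    node only if it was flagged in a majority of the rounds."""
    rng = random.Random(seed)
    nodes = sorted(metrics)
    votes = {n: 0 for n in nodes}
    for _ in range(rounds):
        rng.shuffle(nodes)
        # Divide: each group is examined independently of the others.
        for i in range(0, len(nodes), group_size):
            group = {n: metrics[n] for n in nodes[i:i + group_size]}
            for n in flag_group(group):
                votes[n] += 1
    return {n for n, count in votes.items() if 2 * count > rounds}


# Hypothetical data: 16 peers with similar runtimes, one clearly slower.
metrics = {f"node-{i}": 100.0 + (i % 5) * 0.5 for i in range(16)}
metrics["node-7"] = 250.0
anomalies = detect(metrics)
```

Because each group is examined independently, the per-group work in this sketch could run on separate nodes, which is the intuition behind a decentralized design; the accuracy and overhead figures reported in the paper apply to the authors' implementation, not to this sketch.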