Loading…

The PDD Framework for Detecting Categories of Peculiar Data

Peculiar data are objects that are relatively few in number and significantly different from the other objects in a data set. In this paper, we propose the PDD framework for detecting multiple categories of peculiar data. This framework provides an extensible set of perspectives for viewing data, cu...

Full description

Saved in:
Bibliographic Details
Main Authors: Shrestha, M., Hamilton, H.J., Yiyu Yao, Konkel, K., Liqiang Geng
Format: Conference Proceeding
Language:English
Subjects:
Online Access:Request full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Peculiar data are objects that are relatively few in number and significantly different from the other objects in a data set. In this paper, we propose the PDD framework for detecting multiple categories of peculiar data. This framework provides an extensible set of perspectives for viewing data, currently including viewing data as a set of records, attributes, frequencies, intervals, sequences, or sequences of changes. By using these six views of the data, multiple categories of peculiar data can be detected to reveal different aspects of the data. For each view, the framework provides an extensible set of peculiarity measures to detect outliers and other kinds of peculiar data. The PDD framework has been implemented for Oracle and Access. Experiments are reported for data sets concerning Regina weather and NHL hockey.
ISSN:1550-4786
2374-8486
DOI:10.1109/ICDM.2006.159