
Turning CARTwheels: An Alternating Algorithm for Mining Redescriptions

Bibliographic Details
Published in: arXiv.org, 2003-11
Main Authors: Kumar, Deept; Ramakrishnan, Naren; Potts, Malcolm; Helm, Richard F
Format: Article
Language: English
Description
Summary: We present an unusual algorithm involving classification trees where two trees are grown in opposite directions so that they are matched at their leaves. This approach finds application in a new data mining task we formulate, called "redescription mining". A redescription is a shift-of-vocabulary, or a different way of communicating information about a given subset of data; the goal of redescription mining is to find subsets of data that afford multiple descriptions. We highlight the importance of this problem in domains such as bioinformatics, which exhibit an underlying richness and diversity of data descriptors (e.g., genes can be studied in a variety of ways). Our approach helps integrate multiple forms of characterizing datasets, situates the knowledge gained from one dataset in the context of others, and harnesses high-level abstractions for uncovering cryptic and subtle features of data. Algorithm design decisions, implementation details, and experimental results are presented.
ISSN: 2331-8422
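
Illustration: The summary defines a redescription as one subset of data that can be described in more than one vocabulary. The sketch below is only a toy rendering of that idea and is not the CARTwheels algorithm itself: it compares single descriptors pairwise by brute force instead of growing two classification trees against each other. The Jaccard overlap score, the 0.75 threshold, the function names, and the gene data are all assumptions made for illustration and do not come from this record.

```python
def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b|; taken as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_redescriptions(vocab_x, vocab_y, threshold=0.75):
    """Return (x_name, y_name, score) for descriptor pairs whose object sets
    overlap at or above the threshold, best match first.

    Each vocabulary maps a descriptor name to the set of objects it covers.
    """
    pairs = []
    for x_name, x_set in vocab_x.items():
        for y_name, y_set in vocab_y.items():
            score = jaccard(x_set, y_set)
            if score >= threshold:
                pairs.append((x_name, y_name, score))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical toy data: the same genes described in two vocabularies.
expression = {
    "upregulated_heat": {"g1", "g2", "g3", "g4"},
    "upregulated_cold": {"g5", "g6"},
}
function = {
    "chaperone": {"g1", "g2", "g3"},
    "transport": {"g5", "g7", "g8"},
}

matches = find_redescriptions(expression, function)
for x_name, y_name, score in matches:
    print(f"{x_name} <=> {y_name} (Jaccard {score:.2f})")
```

With this toy data, only the pair "upregulated_heat" and "chaperone" passes the threshold, since the two descriptors share three of the four genes either one covers.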