Supervised Discretization with GK-τ
Published in: | Procedia computer science 2013, Vol.17, p.114-120 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | When data are high-dimensional and of mixed type and the response variable is categorical, an effective executable profile consists of categorical or categorized variables with easily understandable statistics. Many data mining techniques require categorical variables; many others give better results when continuous variables are converted to categorical ones. A continuous variable can be discretized in either a supervised way or an unsupervised (conventional) way. We propose a supervised discretization method that uses maximization of the Goodman-Kruskal tau (GK-τ) as the optimization criterion. This optimization is oriented toward the probabilistic averaging effect. An experiment on financial loan application data is designed to show the improvement after discretization. Some technical concerns that arise during discretization are also discussed in this article. |
---|---|
ISSN: | 1877-0509 |
DOI: | 10.1016/j.procs.2013.05.016 |
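
The summary names GK-τ maximization as the discretization criterion but this record does not give the algorithm itself. As a rough illustration only, the sketch below computes the standard Goodman-Kruskal tau for predicting a categorical response from a categorical predictor, then picks the single cut point of a continuous variable that maximizes it. The function names `gk_tau` and `best_binary_cut`, the restriction to one binary cut, and the use of midpoints between distinct values as candidates are all assumptions made for this example, not details taken from the article.

```python
from collections import Counter


def gk_tau(x_labels, y_labels):
    """Goodman-Kruskal tau for predicting y from x (both categorical).

    tau = (sum_ij p_ij^2 / p_i. - sum_j p_.j^2) / (1 - sum_j p_.j^2)
    """
    n = len(y_labels)
    joint = Counter(zip(x_labels, y_labels))
    x_marg = Counter(x_labels)
    y_marg = Counter(y_labels)

    # Sum of squared marginal proportions of the response.
    y_sq = sum((c / n) ** 2 for c in y_marg.values())
    if 1.0 - y_sq <= 0.0:
        return 0.0  # response is constant; nothing left to explain

    # Sum of squared joint proportions, each divided by its row marginal.
    cond = sum((c / n) ** 2 / (x_marg[x] / n) for (x, _), c in joint.items())
    return (cond - y_sq) / (1.0 - y_sq)


def best_binary_cut(values, y_labels):
    """Hypothetical helper: try each midpoint between adjacent distinct
    values and keep the cut with the largest GK tau.

    Returns (cut, tau); cut is None if there are fewer than two
    distinct values.
    """
    distinct = sorted(set(values))
    best_cut, best_tau = None, -1.0
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        bins = [v <= cut for v in values]  # two-category version of the variable
        tau = gk_tau(bins, y_labels)
        if tau > best_tau:
            best_cut, best_tau = cut, tau
    return best_cut, best_tau


if __name__ == "__main__":
    amounts = [1, 2, 3, 10, 11, 12]           # made-up continuous predictor
    outcome = ["a", "a", "a", "b", "b", "b"]  # made-up categorical response
    print(best_binary_cut(amounts, outcome))
```

A multi-interval discretization could repeat this search within each resulting bin, though whether the article proceeds that way is not stated in this record.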