Loading…
MEMOD: a novel multivariate evolutionary multi-objective discretization
Discretization is an important preprocessing technique, especially in classification problems. It reduces and simplifies data, accelerates the learning process, and improves learner performance. The most challenging aspect of the discretization process is to maintain the accuracy of the classificati...
Saved in:
Published in: | Soft computing (Berlin, Germany) Germany), 2018, Vol.22 (1), p.301-323 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Discretization is an important preprocessing technique, especially in classification problems. It reduces and simplifies data, accelerates the learning process, and improves learner performance. The most challenging aspect of the discretization process is to maintain the accuracy of the classification algorithm and to prevent information loss while reducing the number of discretized values. In this paper, using evolutionary multi-objective optimization, classification error (the first objective function) and number of cut points (the second objective function) are simultaneously reduced. The third objective function involves selecting low-frequency cut points so that a smaller degree of information is lost during this conversion (from continuous to discrete). To the best of our knowledge, this is the first paper to consider the discretization process as a multi-objective optimization problem. Previous discretization methods result in only one solution. However, in real-world problems, decision makers often need several alternatives to make better decisions—a requirement which cannot be fulfilled using these techniques. The multi-objective nature of the proposed algorithm enables the generation of numerous solutions (i.e., the Pareto front) allowing the user to select the most appropriate solution according to the nuances of the problem. A total of 20 benchmark data sets were used to test the performance of the proposed algorithm. Our results show that the proposed algorithm offers superior performance compared to other methods in the literature. Thus, it presents better discretization in classification problems. |
---|---|
ISSN: | 1432-7643 1433-7479 |
DOI: | 10.1007/s00500-016-2475-5 |