
MEMOD: a novel multivariate evolutionary multi-objective discretization

Bibliographic Details
Published in:Soft computing (Berlin, Germany), 2018, Vol.22 (1), p.301-323
Main Authors: Tahan, Marzieh Hajizadeh, Asadi, Shahrokh
Format: Article
Language:English
Description
Summary:Discretization is an important preprocessing technique, especially in classification problems. It reduces and simplifies data, accelerates the learning process, and improves learner performance. The most challenging aspect of the discretization process is to maintain the accuracy of the classification algorithm and to prevent information loss while reducing the number of discretized values. In this paper, using evolutionary multi-objective optimization, classification error (the first objective function) and number of cut points (the second objective function) are simultaneously reduced. The third objective function involves selecting low-frequency cut points so that a smaller degree of information is lost during this conversion (from continuous to discrete). To the best of our knowledge, this is the first paper to consider the discretization process as a multi-objective optimization problem. Previous discretization methods result in only one solution. However, in real-world problems, decision makers often need several alternatives to make better decisions—a requirement which cannot be fulfilled using these techniques. The multi-objective nature of the proposed algorithm enables the generation of numerous solutions (i.e., the Pareto front) allowing the user to select the most appropriate solution according to the nuances of the problem. A total of 20 benchmark data sets were used to test the performance of the proposed algorithm. Our results show that the proposed algorithm offers superior performance compared to other methods in the literature. Thus, it presents better discretization in classification problems.
ISSN:1432-7643, 1433-7479
DOI:10.1007/s00500-016-2475-5
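
The summary describes two ideas that a small example may make more concrete: converting continuous values to discrete ones with cut points, and keeping a set of non-dominated solutions (the Pareto front) when three objectives are minimized together. The following is a minimal Python sketch under stated assumptions, not the authors' implementation. The function names, the tuple layout, and the numeric scores are hypothetical, the third value is only a placeholder for the cut-point frequency objective (whose exact definition is not given in this record), and the evolutionary search that would generate the candidates is not shown.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

# Hypothetical layout of one candidate's objective values, all to be minimized:
# (classification error, number of cut points, placeholder frequency score)
Objectives = Tuple[float, float, float]


def discretize(values: Sequence[float], cut_points: Sequence[float]) -> List[int]:
    """Map each continuous value to the index of the interval it falls in."""
    cuts = sorted(cut_points)
    return [bisect_right(cuts, v) for v in values]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and better on at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: Sequence[Objectives]) -> List[Objectives]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative values only; these are not results from the paper.
bins = discretize([0.5, 1.0, 2.5, 4.0], cut_points=[1.0, 3.0])
front = pareto_front([
    (0.10, 5, 0.3),
    (0.12, 3, 0.2),
    (0.15, 6, 0.4),
    (0.10, 5, 0.5),
])
```

In this sketch a decision maker would then pick one entry from `front`, for example trading a slightly higher error for fewer cut points, which mirrors the choice among alternatives that the summary describes.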