Efficient attribute reduction from the viewpoint of discernibility
Published in: | Information sciences 2016-01, Vol.326, p.297-314 |
---|---|
Main Authors: | , , , , , |
Format: | Article |
Language: | English |
Summary: | Attribute reduction is an important preprocessing step in pattern recognition, machine learning and data mining. As an effective method for attribute reduction, rough set theory offers a useful and formal methodology. It retains the discernibility power of the original datasets; thus, attribute reduction has been extensively studied in rough set theory. However, the inefficiency of the existing attribute reduction algorithms limits the application of rough sets. In this paper, we first analyse the limitations of existing attribute reduction algorithms. Then, a novel measure of attribute quality, called the relative discernibility degree, is proposed based on the discernibility. Theoretical analysis shows that this measure can find relative dispensable attributes and remain unchanged after removing the relative dispensable attributes and redundant objects in the process of selecting attributes. This property can be used to reduce the search space and accelerate the heuristic process of attribute reduction. Consequently, a new attribute reduction algorithm is proposed from the viewpoint of discernibility. Furthermore, the relationships among the reduction definitions of the algebra view, information view and discernibility view are derived. Some non-equivalent relationships among these views of rough set theory in inconsistent decision tables are discovered. A set of numerical experiments was conducted on UCI datasets. Experimental results show that the proposed algorithm is effective and efficient and is applicable to the case of large-scale datasets. |
---|---|
ISSN: | 0020-0255 1872-6291 |
DOI: | 10.1016/j.ins.2015.07.052 |
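
The summary describes a heuristic that selects attributes by how well they discern objects with different decisions. As a rough illustration of that general idea only, the sketch below implements a generic greedy, discernibility-based selection over a small decision table. It is not the paper's algorithm and does not compute the relative discernibility degree, whose definition is not given in this record. The names `discernible_pairs` and `greedy_reduct`, the row and decision layout, and the toy data are all assumptions made for this example.

```python
from itertools import combinations


def discernible_pairs(rows, decisions, attrs):
    """Return the pairs (i, j) of objects that have different decisions
    and differ on at least one attribute in attrs."""
    pairs = set()
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] != decisions[j] and any(
            rows[i][a] != rows[j][a] for a in attrs
        ):
            pairs.add((i, j))
    return pairs


def greedy_reduct(rows, decisions):
    """Greedily pick attributes until the selection discerns every pair
    that the full attribute set discerns.

    Generic heuristic (assumption, not the paper's method): the result
    preserves discernibility but is not guaranteed to be minimal.
    """
    all_attrs = range(len(rows[0]))
    # Pairs the full attribute set can tell apart; in an inconsistent
    # table some pairs with different decisions are never discernible.
    remaining = discernible_pairs(rows, decisions, all_attrs)
    selected = []
    while remaining:
        # Choose the unselected attribute that discerns the most
        # still-undiscerned pairs (ties go to the lowest index).
        best = max(
            (a for a in all_attrs if a not in selected),
            key=lambda a: sum(
                1 for i, j in remaining if rows[i][a] != rows[j][a]
            ),
        )
        selected.append(best)
        # Keep only the pairs the chosen attribute fails to discern.
        remaining = {
            (i, j) for i, j in remaining if rows[i][best] == rows[j][best]
        }
    return sorted(selected)


# Hypothetical toy decision table: three condition attributes per object.
rows = [
    (0, 0, 1),
    (0, 1, 1),
    (1, 0, 0),
    (1, 1, 0),
]
decisions = ["no", "yes", "no", "yes"]
print(greedy_reduct(rows, decisions))
```

In this toy table the attribute at index 1 alone separates every pair of objects with different decisions, so the greedy loop stops after a single pick. Recomputing pair counts on every iteration is quadratic in the number of objects, which is the kind of cost the summary says efficient reduction algorithms aim to avoid.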