Loading…

Feature selection from high dimensional data based on iterative qualitative mutual information

High Dimensional cancer microarray is devilishly challenging while finding the best features for classification. In this paper a new algorithm is proposed based on iterative qualitative mutual information to choose the features that can provide optimal feature set with reliability, stability, and be...

Full description

Saved in:
Bibliographic Details
Published in:Journal of intelligent & fuzzy systems 2019-01, Vol.36 (6), p.5845-5856
Main Authors: Nagpal, Arpita, Singh, Vijendra
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:High Dimensional cancer microarray is devilishly challenging while finding the best features for classification. In this paper a new algorithm is proposed based on iterative qualitative mutual information to choose the features that can provide optimal feature set with reliability, stability, and best classification results. It finds the qualitative (i.e. utility) score of each feature with the help of Random Forest algorithm and combines it with mutual information of each feature with its class variable. Adding a qualitative measure along with mutual information can improve the robustness and find redundant features in data. The proposed algorithm has been compared with other representative methods through the ten microarray based cancer datasets in terms of number of features and classification accuracy of three well-known classifiers: Naïve Bayes, IB1 and C4.5. Experimental results show that the proposed approach is effective in producing an optimal feature subset and improves the accuracy of these datasets.
ISSN:1064-1246
1875-8967
DOI:10.3233/JIFS-181665