Integrating Swarm Intelligence and Statistical Data for Feature Selection in Text Categorization

Bibliographic Details
Published in: International journal of computer applications 2010-02, Vol.1 (11), p.16-21
Main Authors: Meena, M Janaki; Chandran, K R; Brinda, J Mary
Format: Article
Language: English
Description
Summary: Feature selection is the principal step in classification problems with high-dimensional attributes. It can also be viewed as the problem of determining the subset of terms in the training corpus that maximizes the classifier's performance. Most machine learning algorithms show degraded performance in a high-dimensional feature space. In this paper, a novel feature selection method based on Ant Colony Optimization, a swarm intelligence algorithm, is proposed. Ant Colony Optimization is a metaheuristic algorithm used to find high-quality solutions to NP-hard problems. The heuristic information required for the optimization process is obtained through CHIR, a chi-square based statistical method, which results in fast convergence. The performance of the classifier with features selected by the proposed method is compared with its performance using features selected by the conventional chi-square and CHIR methods. The proposed algorithm is found to identify a better feature set than the conventional methods.
ISSN: 0975-8887
DOI: 10.5120/248-405
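
The summary describes ant colony optimization guided by a chi-square based heuristic. The following is a minimal sketch of that general idea, not the authors' algorithm: it uses the plain chi-square statistic as a stand-in for CHIR (whose details are not given in this record), and the function names, parameters (`alpha`, `beta`, `rho`, `n_ants`, `n_iters`), and the pheromone update rule are all assumptions chosen for illustration. The `evaluate` callback is hypothetical and is assumed to return a non-negative classifier score, such as accuracy, for a candidate term subset.

```python
import random


def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/category contingency table.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that lack the term
    d: documents outside the category that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term carries no information
    return n * (a * d - b * c) ** 2 / denom


def aco_select(heuristic, evaluate, k, n_ants=10, n_iters=20,
               alpha=1.0, beta=1.0, rho=0.1, seed=0):
    """Pick k terms with a simple ant-colony-style search (illustrative only).

    heuristic: dict mapping term -> non-negative score (e.g. chi-square)
    evaluate:  hypothetical callback, subset -> non-negative quality score
    """
    rng = random.Random(seed)
    terms = list(heuristic)
    pheromone = {t: 1.0 for t in terms}
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iters):
        for _ in range(n_ants):
            candidates = terms[:]
            subset = []
            # Each ant builds a subset term by term, sampling in proportion
            # to pheromone^alpha * heuristic^beta.
            for _ in range(min(k, len(terms))):
                weights = [pheromone[t] ** alpha * (heuristic[t] + 1e-9) ** beta
                           for t in candidates]
                pick = rng.choices(candidates, weights=weights, k=1)[0]
                subset.append(pick)
                candidates.remove(pick)
            score = evaluate(subset)
            if score > best_score:
                best_subset, best_score = subset, score

        # Evaporate all trails, then reinforce the best subset found so far.
        for t in terms:
            pheromone[t] *= (1.0 - rho)
        for t in best_subset:
            pheromone[t] += max(best_score, 0.0)

    return best_subset, best_score
```

In use, one would compute `chi_square` for every candidate term from document counts, pass the resulting dictionary as `heuristic`, and supply an `evaluate` function that trains and scores a classifier on the chosen terms. Seeding the search with a statistical score biases ants toward promising terms from the first iteration, which is one plausible reading of the fast convergence the summary mentions.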