Efficient hybrid optimization based feature selection and classification on high dimensional dataset
Published in: | Multimedia tools and applications 2023-12, Vol.83 (20), p.58689-58727 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Summary: | With the widespread use of intelligent information systems (ISs), the rapid growth in data volume creates numerous challenges, such as high dimensionality and noisy or irrelevant data. These issues lead to high computational costs and degrade the accuracy and efficiency of machine learning (ML) algorithms. Feature selection (FS), which aims to determine the best subset of features for predicting class labels by eliminating irrelevant data, is widely used to boost classification accuracy and reduce computational cost. However, finding an effective FS approach is challenging, and numerous swarm-based algorithms inspired by biological systems have been developed for it. This paper introduces a hybrid optimization approach to feature selection. The input data is drawn from several datasets and cleaned in a pre-processing stage. Eight optimization techniques are first run individually, and the two that score best on the performance metrics are combined into a hybrid: a Slime Mould Algorithm (SMA) hybridized with Binary Grey Wolf Optimization (BGWO). The features selected by the hybrid algorithm are fed to a K-nearest neighbor (KNN) classifier, which was found to be more effective than the other classifiers considered. The authors report that hybrid SMA + BGWO feature selection with KNN handles FS on high-dimensional data with high accuracy and fast convergence. The classifiers are evaluated using accuracy, precision, recall, F-measure, RMSE, MAE and computational time. With KNN classification, the proposed SMA + BGWO approach attains an accuracy of 99.83% on the CICDDoS2019 dataset and 99.30% on CICMalDroid2020. The experimental analysis indicates that the proposed hybrid technique outperforms the existing techniques it was compared against. |
---|---|
ISSN: | 1380-7501, 1573-7721 |
DOI: | 10.1007/s11042-023-17724-5 |
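
The summary describes wrapper-style feature selection: a search algorithm proposes feature subsets and a KNN classifier scores each one. The record does not give the paper's fitness function or update rules, so the following is only a minimal sketch under stated assumptions. The weighted fitness (error rate plus a small penalty on subset size, with `alpha = 0.99`) is a common formulation in wrapper feature selection rather than a detail confirmed here, the plain random search is a stand-in for the SMA + BGWO hybrid, and every function name and the toy data are hypothetical.

```python
import random


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


def apply_mask(X, mask):
    """Keep only the columns whose mask entry is True."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]


def fitness(mask, train_X, train_y, test_X, test_y, alpha=0.99, k=3):
    """Lower is better. Assumed form: alpha * error + (1 - alpha) * size ratio."""
    if not any(mask):
        return 1.0  # an empty subset cannot classify; treat it as the worst case
    tr, te = apply_mask(train_X, mask), apply_mask(test_X, mask)
    errors = sum(knn_predict(tr, train_y, x, k) != y for x, y in zip(te, test_y))
    return alpha * errors / len(test_y) + (1 - alpha) * sum(mask) / len(mask)


def random_search(n_features, evaluate, iterations=200, seed=0):
    """Stand-in optimizer (NOT SMA or BGWO): sample random binary masks."""
    rng = random.Random(seed)
    best_mask, best_score = None, float("inf")
    for _ in range(iterations):
        mask = [rng.random() < 0.5 for _ in range(n_features)]
        score = evaluate(mask)
        if score < best_score:
            best_mask, best_score = mask, score
    return best_mask, best_score


def make_toy_data(n, seed):
    """Hypothetical data: feature 0 tracks the label, features 1-2 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([label + rng.uniform(-0.2, 0.2),
                  rng.uniform(-5, 5),
                  rng.uniform(-5, 5)])
        y.append(label)
    return X, y


train_X, train_y = make_toy_data(60, seed=1)
test_X, test_y = make_toy_data(40, seed=2)


def evaluate(mask):
    return fitness(mask, train_X, train_y, test_X, test_y)


single_feature_score = evaluate([True, False, False])
best_mask, best_score = random_search(3, evaluate)
print("best mask:", best_mask, "fitness:", round(best_score, 5))
```

A swarm-based optimizer would replace `random_search` while reusing the same `fitness` function; the size penalty is what pushes the search toward smaller subsets when two subsets classify equally well.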