Intrusion detection using Highest Wins feature selection algorithm

Bibliographic Details
Published in: Neural computing & applications 2021-08, Vol.33 (16), p.9805-9816
Main Authors: Mohammad, Rami Mustafa A., Alsmadi, Mutasem K.
Format: Article
Language: English
Description
Summary: The rapid advancement of the Internet has stimulated the building of intelligent data mining systems for detecting intrusion attacks. The performance of such systems can suffer because of the large datasets employed in the learning phase. Determining the appropriate group of features within training datasets is an essential phase when building data mining classification models. Nevertheless, the resulting reduced set of features should maintain or even improve the performance of the classification models. This article proposes an innovative feature selection algorithm called “the Highest Wins” (HW). To evaluate the generalization ability of HW, it was used to create classification models with the naïve Bayes technique on 10 benchmark datasets. The results were compared against two well-known strategies, namely chi-square and information gain. The experimental results confirmed the competitiveness of the suggested strategy in terms of evaluation measures such as recall, precision, and error rate, while significantly decreasing the number of selected features in the datasets. Further, HW was used to build naïve Bayes and decision tree intrusion detection classifiers using the well-known Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset. The results were promising not just in terms of overall performance, but also in terms of the time needed to build the classification model.
ISSN: 0941-0643, 1433-3058
DOI: 10.1007/s00521-021-05745-w
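
The summary names chi-square and information gain as the two baseline feature-ranking strategies that HW was compared against. As a rough illustration of what those baselines compute, here is a minimal Python sketch that scores categorical features against a class label and ranks them. It does not implement HW, which this record does not describe. The function names, the feature names, and the six-row toy dataset are all hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def chi_square(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the class."""
    n = len(labels)
    feature_counts = Counter(feature)
    class_counts = Counter(labels)
    joint_counts = Counter(zip(feature, labels))
    stat = 0.0
    for f, nf in feature_counts.items():
        for c, nc in class_counts.items():
            expected = nf * nc / n  # count expected if feature and class were independent
            observed = joint_counts.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in class entropy obtained by splitting on the feature."""
    n = len(labels)
    conditional = 0.0
    for f, nf in Counter(feature).items():
        subset = [c for x, c in zip(feature, labels) if x == f]
        conditional += (nf / n) * entropy(subset)
    return entropy(labels) - conditional


def rank_features(columns, labels, score):
    """Return feature names ordered from highest to lowest score."""
    return sorted(columns, key=lambda name: score(columns[name], labels), reverse=True)


# Hypothetical toy data: "flag" separates the classes perfectly, "protocol" only weakly.
columns = {
    "protocol": ["tcp", "tcp", "udp", "udp", "tcp", "udp"],
    "flag": ["S0", "SF", "S0", "SF", "S0", "SF"],
}
labels = ["attack", "normal", "attack", "normal", "attack", "normal"]

print(rank_features(columns, labels, chi_square))
print(rank_features(columns, labels, information_gain))
```

In a workflow like the one the summary describes, the top-ranked features would be kept and a classifier such as naïve Bayes trained on the reduced set. How many features to keep is a separate choice that this sketch leaves open.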