Intrusion detection using Highest Wins feature selection algorithm
Published in: | Neural computing & applications 2021-08, Vol.33 (16), p.9805-9816 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | The rapid advancement of the Internet has stimulated the development of intelligent data mining systems for detecting intrusion attacks. The performance of such systems can be negatively affected by the large datasets employed in the learning phase. Determining the appropriate group of features within training datasets is an essential phase when building data mining classification models. Nevertheless, the resulting reduced set of features should maintain or even improve the performance of the classification models. In this article, an innovative feature selection algorithm called “the Highest Wins” (HW) is proposed. To evaluate the generalization ability of HW, it was applied to create classification models using the naïve Bayes technique on 10 benchmark datasets. The results were compared against two well-known strategies, namely chi-square and information gain. The experimental results confirmed the competitiveness of the suggested strategy in terms of various evaluation measures such as recall, precision, and error rate, while significantly decreasing the number of selected features in the datasets. Further, HW was used to build naïve Bayes and decision tree intrusion detection classifiers using the well-known Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset. The results were promising not just in terms of overall performance, but also in terms of the time needed to build the classification model. |
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-021-05745-w |