Loading…

Evaluation of the Performance of Unsupervised Learning Algorithms for Intrusion Detection in Unbalanced Data Environments

This study evaluated the performance of unsupervised machine learning algorithms for intrusion detection in unbalanced data environments using the BoT-IoT dataset. Algorithms such as K-means++, DBSCAN, Local Outlier Factor (LOF), and Isolation Forest (I-forest) were analyzed using metrics like purit...

Full description

Saved in:
Bibliographic Details
Published in:IEEE access 2024, Vol.12, p.190134-190157
Main Authors: Fernando, Gutierrez-Portela, Florina, Almenares Mendoza, Liliana, Calderon-Benavides
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:This study evaluated the performance of unsupervised machine learning algorithms for intrusion detection in unbalanced data environments using the BoT-IoT dataset. Algorithms such as K-means++, DBSCAN, Local Outlier Factor (LOF), and Isolation Forest (I-forest) were analyzed using metrics like purity, homogeneity, completeness, V-measure, and adjusted mutual information to assess their effectiveness in detecting attacks such as DDoS, DoS, and reconnaissance. Optimal cluster selection methods were also explored, and principal component analysis (PCA) was applied to explain data variability. Results showed that K-means++ achieved 95% purity with 95% and 99% prediction accuracies for normal and abnormal data, respectively, while I-forest delivered similar results and excelled in computational efficiency, consuming only 10% of CPU resources compared to 16% for other algorithms. These findings highlight I-forest's effectiveness and efficiency in intrusion detection, offering a viable solution for cybersecurity environments with limited resources and significant data imbalance.
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2024.3516615