Machine Learning-Based Intrusion Detection: Feature Selection versus Feature Extraction

The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leaks. To effectively preven...

Bibliographic Details
Published in: arXiv.org 2023-07
Main Authors: Vu-Duc Ngo; Tuan-Cuong Vuong; Thien Van Luong; Tran, Hung
Format: Article
Language: English
container_title arXiv.org
creator Vu-Duc Ngo
Tuan-Cuong Vuong
Thien Van Luong
Tran, Hung
description The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leaks. To prevent these attacks effectively, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed. These methods often rely on either feature extraction or feature selection to reduce the dimension of the input data before it is fed into machine learning models, with the aim of keeping detection complexity low enough for real-time operation, which is vital in any intrusion detection system. This paper provides a comprehensive comparison of these two feature reduction methods for intrusion detection in terms of several performance metrics, namely precision, recall, detection accuracy, and runtime complexity, on the modern UNSW-NB15 dataset for both binary and multiclass classification. In general, the feature selection method provides not only better detection performance but also lower training and inference time than its feature extraction counterpart, especially as the number of reduced features K increases. However, the feature extraction method is much more reliable than its selection counterpart, particularly when K is very small, such as K = 4. Additionally, feature extraction is less sensitive to changes in K than feature selection, and this holds for both binary and multiclass classification. Based on this comparison, we provide a guideline for selecting a suitable intrusion detection type for each specific scenario, as detailed in Tab. 14 at the end of Section IV.
format article
publisher Ithaca: Cornell University Library, arXiv.org
rights 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2023-07
issn 2331-8422
language eng
recordid cdi_proquest_journals_2833815440
source Publicly Available Content Database
subjects Classification
Complexity
Cybersecurity
Feature extraction
Feature selection
Internet of Things
Intrusion detection systems
Machine learning
Performance measurement
Real time operation
title Machine Learning-Based Intrusion Detection: Feature Selection versus Feature Extraction
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T18%3A19%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Machine%20Learning-Based%20Intrusion%20Detection:%20Feature%20Selection%20versus%20Feature%20Extraction&rft.jtitle=arXiv.org&rft.au=Vu-Duc%20Ngo&rft.date=2023-07-04&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2833815440%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_28338154403%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2833815440&rft_id=info:pmid/&rfr_iscdi=true