Machine Learning-Based Intrusion Detection: Feature Selection versus Feature Extraction
The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leaks. To effectively preven...
Published in: | arXiv.org 2023-07 |
---|---|
Main Authors: | Vu-Duc Ngo; Tuan-Cuong Vuong; Thien Van Luong; Tran, Hung |
Format: | Article |
Language: | English |
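Subjects: | Classification; Complexity; Cybersecurity; Feature extraction; Feature selection; Internet of Things; Intrusion detection systems; Machine learning; Performance measurement; Real time operation |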
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Vu-Duc Ngo; Tuan-Cuong Vuong; Thien Van Luong; Tran, Hung |
description | The Internet of Things (IoT) plays an important role in many sectors, such as smart cities, smart agriculture, smart healthcare, and smart manufacturing. However, IoT devices are highly vulnerable to cyber-attacks, which may result in security breaches and data leaks. To prevent these attacks effectively, a variety of machine learning-based network intrusion detection methods for IoT networks have been developed. These often rely on either feature extraction or feature selection to reduce the dimension of the input data before it is fed into machine learning models. The aim is to make detection complexity low enough for real-time operation, which is vital in any intrusion detection system. This paper provides a comprehensive comparison of these two feature reduction methods for intrusion detection in terms of several performance metrics, namely precision, recall, detection accuracy, and runtime complexity, using the modern UNSW-NB15 dataset for both binary and multiclass classification. In general, feature selection provides not only better detection performance but also lower training and inference time than feature extraction, especially as the number of reduced features K increases. However, feature extraction is much more reliable than feature selection when K is very small, such as K = 4. Additionally, feature extraction is less sensitive to changes in K than feature selection, and this holds for both binary and multiclass classification. Based on this comparison, we provide a guideline for selecting a suitable intrusion detection type for each specific scenario, as detailed in Tab. 14 at the end of Section IV. |
format | article |
creationdate | 2023-07-04 |
publisher | Ithaca: Cornell University Library, arXiv.org |
rights | 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
oa | free_for_read |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2023-07 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2833815440 |
source | Publicly Available Content Database |
subjects | Classification; Complexity; Cybersecurity; Feature extraction; Feature selection; Internet of Things; Intrusion detection systems; Machine learning; Performance measurement; Real time operation |
title | Machine Learning-Based Intrusion Detection: Feature Selection versus Feature Extraction |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-24T18%3A19%3A29IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=Machine%20Learning-Based%20Intrusion%20Detection:%20Feature%20Selection%20versus%20Feature%20Extraction&rft.jtitle=arXiv.org&rft.au=Vu-Duc%20Ngo&rft.date=2023-07-04&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E2833815440%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_28338154403%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2833815440&rft_id=info:pmid/&rfr_iscdi=true |
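
The description in this record contrasts two ways of reducing input data to K features: feature selection, which keeps K of the original columns, and feature extraction, which builds K new columns from all of them. The difference can be sketched in Python as below, under clearly stated assumptions. The record does not say which techniques the paper uses, so a correlation-based filter stands in for selection and a PCA-style projection (computed by power iteration) stands in for extraction. The toy data and the names `pearson`, `select_k`, `extract_k`, and `make_toy_data` are hypothetical and have no connection to UNSW-NB15.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def select_k(X, y, k):
    """Feature selection: keep the k original columns most correlated with the label."""
    d = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(d)]
    keep = sorted(sorted(range(d), key=lambda j: scores[j], reverse=True)[:k])
    return [[row[j] for j in keep] for row in X], keep


def extract_k(X, k, iters=100):
    """Feature extraction: project centered data onto k orthonormal directions
    found by power iteration on the covariance matrix (a simple PCA stand-in)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    cov = [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)] for i in range(d)]
    comps = []
    for _ in range(k):
        v = [1.0 / math.sqrt(d)] * d  # deterministic starting vector
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Remove the parts along components already found (Gram-Schmidt).
            for c in comps:
                proj = sum(wi * ci for wi, ci in zip(w, c))
                w = [wi - proj * ci for wi, ci in zip(w, c)]
            norm = math.sqrt(sum(wi * wi for wi in w))
            if norm == 0:
                break
            v = [wi / norm for wi in w]
        comps.append(v)
    Z = [[sum(r[j] * c[j] for j in range(d)) for c in comps] for r in Xc]
    return Z, comps


def make_toy_data(n=400, d=6, seed=0):
    """Hypothetical toy data: the label depends only on columns 0 and 3."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if row[0] + row[3] > 0 else 0 for row in X]
    return X, y


X, y = make_toy_data()
K = 2
X_sel, kept = select_k(X, y, K)  # K original columns survive unchanged
X_ext, comps = extract_k(X, K)   # K new columns, each a mix of all originals
print("selected columns:", kept)
print("reduced shapes:", (len(X_sel), len(X_sel[0])), (len(X_ext), len(X_ext[0])))
```

Selection leaves each kept column interpretable as an original feature, whereas every extracted column blends all inputs. Which of the two detects intrusions better at a given K is the question the paper studies, and this sketch makes no claim about it.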