Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system
Imbalanced data classification is a challenge in data mining and machine learning. To improve the classification performance for imbalanced data, this paper proposes an imbalanced data classification algorithm based on the optimized Mahalanobis-Taguchi system (OMTS). At the feature selection stage,...
Published in: | Applied intelligence (Dordrecht, Netherlands), 2022-07, Vol.52 (9), p.10674-10691 |
---|---|
Main Authors: | Mao, Ting; Zhou, Li; Zhang, Yueyi; Sun, Yefang |
Format: | Article |
Language: | English |
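Subjects: | Accuracy; Algorithms; Artificial Intelligence; Classification; Computer Science; Data mining; Machine learning; Machines; Manufacturing; Maximization; Mechanical Engineering; Particle swarm optimization; Principles; Processes; Taguchi methods |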
cited_by | cdi_FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403 |
---|---|
cites | cdi_FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403 |
container_end_page | 10691 |
container_issue | 9 |
container_start_page | 10674 |
container_title | Applied intelligence (Dordrecht, Netherlands) |
container_volume | 52 |
creator | Mao, Ting; Zhou, Li; Zhang, Yueyi; Sun, Yefang |
description | Imbalanced data classification is a challenge in data mining and machine learning. To improve the classification performance for imbalanced data, this paper proposes an imbalanced data classification algorithm based on the optimized Mahalanobis-Taguchi system (OMTS). At the feature selection stage, important feature variables are determined by four principles, namely maximizing mutual information between features and classes, minimizing mutual information between features, maximizing the initial classification accuracy, and selecting features that produce not only the local maximum or minimum of the difference between the mean Mahalanobis distances (MDs) of normal and abnormal samples but also the largest number of features. At the threshold determination stage, using the selected features, particle swarm optimization is used to determine the optimal threshold for classifying normal and abnormal samples according to the principle of maximizing classification accuracy. At the classification and discrimination stage, the samples are divided into two classes according to their MDs and optimal threshold. Experimental results show that OMTS obtains 0.92, 0.95, 0.81, 0.88, and 0.74 in accuracy on the Forest Type Mapping UCI, Fetal Health Classification, Connectionist Bench, Wine Quality, and Oil datasets, respectively, and has better classification performance than other algorithms. |
doi_str_mv | 10.1007/s10489-021-02929-8 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_2678581734</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>2678581734</sourcerecordid><originalsourceid>FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403</originalsourceid><addsrcrecordid>eNp9kE1PwzAMhiMEEuPjD3CqxLngJG2THNHElzTEZUjcgpslW6Z2GUl3GL-ejCJx42DZlp_XTl5CrijcUABxmyhUUpXAaA7FVCmPyITWgpeiUuKYTECxqmwa9X5KzlJaAwDnQCfkY9phSt55g4MPmwK7ZYh-WPWFC7Ewh2Hh-xY73Bi7KBY4YNFiymWGw3bwvf_KzQuuDkhofSrnuNyZlS_SPg22vyAnDrtkL3_zOXl7uJ9Pn8rZ6-Pz9G5WGi7roaRUKm4rx5xDZlsAKwVy5ViTfwbM1EY0zCKnrpX5IVQgoONKUWioWlTAz8n1uHcbw-fOpkGvwy5u8knNGiFrSQWvMsVGysSQUrROb6PvMe41BX1wUo9O6uyk_nFSyyzioyhleLO08W_1P6pvZKx3Yg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2678581734</pqid></control><display><type>article</type><title>Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system</title><source>ABI/INFORM Global</source><source>Springer Nature</source><creator>Mao, Ting ; Zhou, Li ; Zhang, Yueyi ; Sun, Yefang</creator><creatorcontrib>Mao, Ting ; Zhou, Li ; Zhang, Yueyi ; Sun, Yefang</creatorcontrib><description>Imbalanced data classification is a challenge in data mining and machine learning. To improve the classification performance for imbalanced data, this paper proposes an imbalanced data classification algorithm based on the optimized Mahalanobis-Taguchi system (OMTS). 
At the feature selection stage, important feature variables are determined by four principles, namely maximizing mutual information between features and classes, minimizing mutual information between features, maximizing the initial classification accuracy, and selecting features that produce not only the local maximum or minimum of the difference between the mean Mahalanobis distances (MDs) of normal and abnormal samples but also the largest number of features. At the threshold determination stage, using the selected features, particle swarm optimization is used to determine the optimal threshold for classifying normal and abnormal samples according to the principle of maximizing classification accuracy. At the classification and discrimination stage, the samples are divided into two classes according to their MDs and optimal threshold. Experimental results show that OMTS obtains 0.92, 0.95, 0.81, 0.88, and 0.74 in accuracy on the Forest Type Mapping UCI, Fetal Health Classification, Connectionist Bench, Wine Quality, and Oil datasets, respectively, and has better classification performance than other algorithms.</description><identifier>ISSN: 0924-669X</identifier><identifier>EISSN: 1573-7497</identifier><identifier>DOI: 10.1007/s10489-021-02929-8</identifier><language>eng</language><publisher>New York: Springer US</publisher><subject>Accuracy ; Algorithms ; Artificial Intelligence ; Classification ; Computer Science ; Data mining ; Machine learning ; Machines ; Manufacturing ; Maximization ; Mechanical Engineering ; Particle swarm optimization ; Principles ; Processes ; Taguchi methods</subject><ispartof>Applied intelligence (Dordrecht, Netherlands), 2022-07, Vol.52 (9), p.10674-10691</ispartof><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021</rights><rights>The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 
2021.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403</citedby><cites>FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403</cites><orcidid>0000-0002-8627-2124</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktopdf>$$Uhttps://www.proquest.com/docview/2678581734/fulltextPDF?pq-origsite=primo$$EPDF$$P50$$Gproquest$$H</linktopdf><linktohtml>$$Uhttps://www.proquest.com/docview/2678581734?pq-origsite=primo$$EHTML$$P50$$Gproquest$$H</linktohtml><link.rule.ids>314,780,784,11688,27924,27925,36060,44363,74895</link.rule.ids></links><search><creatorcontrib>Mao, Ting</creatorcontrib><creatorcontrib>Zhou, Li</creatorcontrib><creatorcontrib>Zhang, Yueyi</creatorcontrib><creatorcontrib>Sun, Yefang</creatorcontrib><title>Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system</title><title>Applied intelligence (Dordrecht, Netherlands)</title><addtitle>Appl Intell</addtitle><description>Imbalanced data classification is a challenge in data mining and machine learning. To improve the classification performance for imbalanced data, this paper proposes an imbalanced data classification algorithm based on the optimized Mahalanobis-Taguchi system (OMTS). At the feature selection stage, important feature variables are determined by four principles, namely maximizing mutual information between features and classes, minimizing mutual information between features, maximizing the initial classification accuracy, and selecting features that produce not only the local maximum or minimum of the difference between the mean Mahalanobis distances (MDs) of normal and abnormal samples but also the largest number of features. 
At the threshold determination stage, using the selected features, particle swarm optimization is used to determine the optimal threshold for classifying normal and abnormal samples according to the principle of maximizing classification accuracy. At the classification and discrimination stage, the samples are divided into two classes according to their MDs and optimal threshold. Experimental results show that OMTS obtains 0.92, 0.95, 0.81, 0.88, and 0.74 in accuracy on the Forest Type Mapping UCI, Fetal Health Classification, Connectionist Bench, Wine Quality, and Oil datasets, respectively, and has better classification performance than other algorithms.</description><subject>Accuracy</subject><subject>Algorithms</subject><subject>Artificial Intelligence</subject><subject>Classification</subject><subject>Computer Science</subject><subject>Data mining</subject><subject>Machine learning</subject><subject>Machines</subject><subject>Manufacturing</subject><subject>Maximization</subject><subject>Mechanical Engineering</subject><subject>Particle swarm optimization</subject><subject>Principles</subject><subject>Processes</subject><subject>Taguchi methods</subject><issn>0924-669X</issn><issn>1573-7497</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2022</creationdate><recordtype>article</recordtype><sourceid>M0C</sourceid><recordid>eNp9kE1PwzAMhiMEEuPjD3CqxLngJG2THNHElzTEZUjcgpslW6Z2GUl3GL-ejCJx42DZlp_XTl5CrijcUABxmyhUUpXAaA7FVCmPyITWgpeiUuKYTECxqmwa9X5KzlJaAwDnQCfkY9phSt55g4MPmwK7ZYh-WPWFC7Ewh2Hh-xY73Bi7KBY4YNFiymWGw3bwvf_KzQuuDkhofSrnuNyZlS_SPg22vyAnDrtkL3_zOXl7uJ9Pn8rZ6-Pz9G5WGi7roaRUKm4rx5xDZlsAKwVy5ViTfwbM1EY0zCKnrpX5IVQgoONKUWioWlTAz8n1uHcbw-fOpkGvwy5u8knNGiFrSQWvMsVGysSQUrROb6PvMe41BX1wUo9O6uyk_nFSyyzioyhleLO08W_1P6pvZKx3Yg</recordid><startdate>20220701</startdate><enddate>20220701</enddate><creator>Mao, Ting</creator><creator>Zhou, Li</creator><creator>Zhang, Yueyi</creator><creator>Sun, Yefang</creator><general>Springer 
US</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7SC</scope><scope>7WY</scope><scope>7WZ</scope><scope>7XB</scope><scope>87Z</scope><scope>8AL</scope><scope>8FD</scope><scope>8FE</scope><scope>8FG</scope><scope>8FK</scope><scope>8FL</scope><scope>ABJCF</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>ARAPS</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>BEZIV</scope><scope>BGLVJ</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FRNLG</scope><scope>F~G</scope><scope>GNUQQ</scope><scope>HCIFZ</scope><scope>JQ2</scope><scope>K60</scope><scope>K6~</scope><scope>K7-</scope><scope>L.-</scope><scope>L6V</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>M0C</scope><scope>M0N</scope><scope>M7S</scope><scope>P5Z</scope><scope>P62</scope><scope>PQBIZ</scope><scope>PQBZA</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>PSYQQ</scope><scope>PTHSS</scope><scope>Q9U</scope><orcidid>https://orcid.org/0000-0002-8627-2124</orcidid></search><sort><creationdate>20220701</creationdate><title>Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system</title><author>Mao, Ting ; Zhou, Li ; Zhang, Yueyi ; Sun, Yefang</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2022</creationdate><topic>Accuracy</topic><topic>Algorithms</topic><topic>Artificial Intelligence</topic><topic>Classification</topic><topic>Computer Science</topic><topic>Data mining</topic><topic>Machine learning</topic><topic>Machines</topic><topic>Manufacturing</topic><topic>Maximization</topic><topic>Mechanical Engineering</topic><topic>Particle swarm optimization</topic><topic>Principles</topic><topic>Processes</topic><topic>Taguchi 
methods</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Mao, Ting</creatorcontrib><creatorcontrib>Zhou, Li</creatorcontrib><creatorcontrib>Zhang, Yueyi</creatorcontrib><creatorcontrib>Sun, Yefang</creatorcontrib><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Computer and Information Systems Abstracts</collection><collection>ABI/INFORM Collection</collection><collection>ABI/INFORM Global (PDF only)</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection</collection><collection>Computing Database (Alumni Edition)</collection><collection>Technology Research Database</collection><collection>ProQuest SciTech Collection</collection><collection>ProQuest Technology Collection</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>ABI/INFORM Collection (Alumni Edition)</collection><collection>Materials Science & Engineering Collection</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>Advanced Technologies & Aerospace Collection</collection><collection>ProQuest Central Essentials</collection><collection>AUTh Library subscriptions: ProQuest Central</collection><collection>Business Premium Collection</collection><collection>Technology Collection</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central Korea</collection><collection>Business Premium Collection (Alumni)</collection><collection>ABI/INFORM Global (Corporate)</collection><collection>ProQuest Central Student</collection><collection>SciTech Premium Collection</collection><collection>ProQuest Computer Science Collection</collection><collection>ProQuest Business Collection (Alumni Edition)</collection><collection>ProQuest Business Collection</collection><collection>Computer Science 
Database</collection><collection>ABI/INFORM Professional Advanced</collection><collection>ProQuest Engineering Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>ABI/INFORM Global</collection><collection>Computing Database</collection><collection>Engineering Database</collection><collection>Advanced Technologies & Aerospace Database</collection><collection>ProQuest Advanced Technologies & Aerospace Collection</collection><collection>One Business</collection><collection>ProQuest One Business (Alumni)</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest One Psychology</collection><collection>Engineering Collection</collection><collection>ProQuest Central Basic</collection><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Mao, Ting</au><au>Zhou, Li</au><au>Zhang, Yueyi</au><au>Sun, Yefang</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system</atitle><jtitle>Applied intelligence (Dordrecht, Netherlands)</jtitle><stitle>Appl Intell</stitle><date>2022-07-01</date><risdate>2022</risdate><volume>52</volume><issue>9</issue><spage>10674</spage><epage>10691</epage><pages>10674-10691</pages><issn>0924-669X</issn><eissn>1573-7497</eissn><abstract>Imbalanced data classification is a challenge in data mining and machine learning. 
To improve the classification performance for imbalanced data, this paper proposes an imbalanced data classification algorithm based on the optimized Mahalanobis-Taguchi system (OMTS). At the feature selection stage, important feature variables are determined by four principles, namely maximizing mutual information between features and classes, minimizing mutual information between features, maximizing the initial classification accuracy, and selecting features that produce not only the local maximum or minimum of the difference between the mean Mahalanobis distances (MDs) of normal and abnormal samples but also the largest number of features. At the threshold determination stage, using the selected features, particle swarm optimization is used to determine the optimal threshold for classifying normal and abnormal samples according to the principle of maximizing classification accuracy. At the classification and discrimination stage, the samples are divided into two classes according to their MDs and optimal threshold. Experimental results show that OMTS obtains 0.92, 0.95, 0.81, 0.88, and 0.74 in accuracy on the Forest Type Mapping UCI, Fetal Health Classification, Connectionist Bench, Wine Quality, and Oil datasets, respectively, and has better classification performance than other algorithms.</abstract><cop>New York</cop><pub>Springer US</pub><doi>10.1007/s10489-021-02929-8</doi><tpages>18</tpages><orcidid>https://orcid.org/0000-0002-8627-2124</orcidid></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0924-669X |
ispartof | Applied intelligence (Dordrecht, Netherlands), 2022-07, Vol.52 (9), p.10674-10691 |
issn | 0924-669X 1573-7497 |
language | eng |
recordid | cdi_proquest_journals_2678581734 |
source | ABI/INFORM Global; Springer Nature |
subjects | Accuracy; Algorithms; Artificial Intelligence; Classification; Computer Science; Data mining; Machine learning; Machines; Manufacturing; Maximization; Mechanical Engineering; Particle swarm optimization; Principles; Processes; Taguchi methods |
title | Classification algorithm for class imbalanced data based on optimized Mahalanobis-Taguchi system |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-06T23%3A29%3A53IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Classification%20algorithm%20for%20class%20imbalanced%20data%20based%20on%20optimized%20Mahalanobis-Taguchi%20system&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Mao,%20Ting&rft.date=2022-07-01&rft.volume=52&rft.issue=9&rft.spage=10674&rft.epage=10691&rft.pages=10674-10691&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-021-02929-8&rft_dat=%3Cproquest_cross%3E2678581734%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c385t-11893e4f2ffa2eb00e87a39f2610402c5c762ea31fb8ced17a0af39910619d403%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2678581734&rft_id=info:pmid/&rfr_iscdi=true |
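
The threshold determination and classification stages summarized in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the feature selection stage is omitted, dividing the squared distance by the feature count is an assumption about the usual Mahalanobis-Taguchi convention, and a plain grid search over observed distances stands in for the particle swarm optimization named in the abstract.

```python
import numpy as np


def fit_reference(normal):
    """Estimate the mean and inverse covariance of the normal-class samples."""
    mean = normal.mean(axis=0)
    cov = np.atleast_2d(np.cov(normal, rowvar=False))
    # The pseudo-inverse keeps the sketch usable when the covariance is near-singular.
    return mean, np.linalg.pinv(cov)


def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance of each row of x, divided by the feature count
    (assumed MTS-style scaling)."""
    d = x - mean
    return np.einsum("ij,jk,ik->i", d, inv_cov, d) / x.shape[1]


def best_threshold(md, labels):
    """Pick the threshold that maximizes accuracy on labeled samples.

    Grid search over the observed distances; a stand-in for the particle
    swarm optimization described in the abstract.
    """
    candidates = np.unique(md)
    accuracy = [np.mean((md > t) == labels) for t in candidates]
    return candidates[int(np.argmax(accuracy))]


def classify(x, mean, inv_cov, threshold):
    """Label a sample abnormal (True) when its distance exceeds the threshold."""
    return mahalanobis_sq(x, mean, inv_cov) > threshold
```

In use, `fit_reference` would be called on the normal (majority) class only, `best_threshold` on a labeled mix of normal and abnormal samples, and `classify` on new samples. Fitting the reference space on one class is what makes a distance-based rule of this kind a candidate for imbalanced data, since the minority class only has to supply enough samples to place the threshold.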