
Feature selection for set-valued data based on D–S evidence theory

Feature selection is a basic and critical technology for data mining, especially in the current "big data" era. Rough set theory is sensitive to noise in feature selection because of the stringent condition imposed by an equivalence relation, whereas D–S evidence theory offers a flexible way to measure the uncertainty of information. In this paper, we introduce the robust feature-evaluation metrics "belief function" and "plausibility function" into feature selection algorithms to keep classification performance from being degraded by noise such as missing values and confusing data. First, a similarity between information values in a set-valued information system (SVIS) is introduced, together with a variable parameter θ that controls the similarity of samples. Second, θ-lower and θ-upper approximations in an SVIS are put forward. The concepts of θ-belief function, θ-plausibility function, θ-belief reduction and θ-plausibility reduction are then given, and several feature selection algorithms based on D–S evidence theory in an SVIS are proposed. Experimental results and statistical tests show that the proposed metric is insensitive to noise, because it comprehensively considers the evidence at all levels, and that the proposed algorithms are more robust than several state-of-the-art feature selection algorithms.
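The constructions named in the abstract can be written out compactly. The following is a hedged sketch, not the paper's definitions: the Jaccard-style similarity over set-valued entries is an assumption, while the belief/plausibility forms follow the standard rough-set reading of D–S theory (belief as the relative size of the lower approximation, plausibility of the upper).

```latex
% Hedged sketch; the paper's exact similarity measure may differ.
% Similarity of samples x, y over a feature subset A (Jaccard-style, assumed):
\[
  s(x,y) \;=\; \frac{1}{|A|} \sum_{a \in A}
  \frac{|f_a(x) \cap f_a(y)|}{|f_a(x) \cup f_a(y)|}
\]
% theta-tolerance class of x in the universe U:
\[
  T_\theta(x) \;=\; \{\, y \in U : s(x,y) \ge \theta \,\}
\]
% theta-lower and theta-upper approximations of a set X:
\[
  \underline{A}_\theta(X) \;=\; \{\, x \in U : T_\theta(x) \subseteq X \,\}, \qquad
  \overline{A}_\theta(X) \;=\; \{\, x \in U : T_\theta(x) \cap X \neq \emptyset \,\}
\]
% theta-belief and theta-plausibility, in the usual rough-set reading:
\[
  \mathrm{Bel}_\theta(X) \;=\; \frac{|\underline{A}_\theta(X)|}{|U|}, \qquad
  \mathrm{Pl}_\theta(X) \;=\; \frac{|\overline{A}_\theta(X)|}{|U|}
\]
```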

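As a concrete illustration of how these pieces fit together in a selection loop, here is a small Python sketch. It is an assumption-laden toy, not the authors' algorithm: the Jaccard-style similarity, the class-summed belief score, and the greedy forward search are all stand-ins for the paper's definitions.

```python
# Toy sketch of theta-belief-based feature selection for set-valued data.
# The similarity measure and the greedy search are illustrative assumptions;
# the paper's exact definitions may differ.

def similarity(x, y, features):
    """Mean Jaccard overlap of the set-valued entries of samples x and y,
    restricted to the given features (an assumed measure)."""
    total = 0.0
    for a in features:
        union = x[a] | y[a]
        total += len(x[a] & y[a]) / len(union) if union else 1.0
    return total / len(features)

def tolerance_class(i, data, features, theta):
    """Indices of the samples that are theta-similar to sample i."""
    return {j for j in range(len(data))
            if similarity(data[i], data[j], features) >= theta}

def theta_belief(data, labels, features, theta):
    """Bel_theta summed over decision classes: the fraction of samples whose
    tolerance class stays entirely inside their own decision class."""
    n = len(data)
    classes = {c: {i for i in range(n) if labels[i] == c} for c in set(labels)}
    bel = 0.0
    for members in classes.values():
        lower = {i for i in range(n)
                 if tolerance_class(i, data, features, theta) <= members}
        bel += len(lower) / n
    return bel

def select_features(data, labels, all_features, theta):
    """Greedy forward selection: repeatedly add the feature that most
    increases theta-belief, stopping once the full-feature belief is matched."""
    target = theta_belief(data, labels, all_features, theta)
    chosen, remaining = [], list(all_features)
    while remaining:
        best = max(remaining,
                   key=lambda a: theta_belief(data, labels, chosen + [a], theta))
        chosen.append(best)
        remaining.remove(best)
        if theta_belief(data, labels, chosen, theta) >= target:
            break
    return chosen

if __name__ == "__main__":
    # Tiny hypothetical SVIS: each sample maps feature name -> set of values.
    data = [
        {"color": {"red"}, "shape": {"round", "oval"}},
        {"color": {"red", "pink"}, "shape": {"round"}},
        {"color": {"blue"}, "shape": {"square"}},
        {"color": {"blue", "green"}, "shape": {"square", "round"}},
    ]
    labels = ["A", "A", "B", "B"]
    print(select_features(data, labels, ["color", "shape"], theta=0.5))
```

On this toy data the color attribute alone already separates the two decision classes at θ = 0.5, so the greedy loop returns ['color']: a one-feature θ-belief reduct in the sketch's sense.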
Bibliographic Details
Published in: The Artificial intelligence review, 2023-03, Vol. 56 (3), pp. 2667-2696
Main Authors: Wang, Yini; Wang, Sichun
Format: Article
Language:English
Subjects: Algorithms; Analysis; Artificial Intelligence; Big Data; Computer Science; Data mining; Feature selection; Noise sensitivity; Robustness; Set theory; Similarity; Statistical tests
ISSN: 0269-2821
EISSN: 1573-7462
DOI: 10.1007/s10462-022-10241-1
Publisher: Springer Netherlands, Dordrecht
Online Access: https://doi.org/10.1007/s10462-022-10241-1