A systematic review of multi-label feature selection and a new method based on label construction
Each example in a multi-label dataset is associated with multiple labels, which are often correlated. Learning from this data can be improved when dimensionality reduction tasks, such as feature selection, are applied. The standard approach for multi-label feature selection transforms the multi-labe...
Published in: | Neurocomputing (Amsterdam) 2016-03, Vol.180, p.3-15 |
---|---|
Main Authors: | Spolaôr, Newton ; Monard, Maria Carolina ; Tsoumakas, Grigorios ; Lee, Huei Diana |
Format: | Article |
Language: | English |
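Subjects: | Binary relevance ; Classifiers ; Construction ; Feature ranking ; Filter feature selection ; Gain ; Information gain ; Labels ; Learning ; Low cycle fatigue ; Systematic review ; Tasks |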
cited_by | cdi_FETCH-LOGICAL-c392t-3a44c8ca1b80ba2c368e372fd7f3d965d789fb8ec7b4372e664b5f7da1b3b7ed3 |
---|---|
cites | cdi_FETCH-LOGICAL-c392t-3a44c8ca1b80ba2c368e372fd7f3d965d789fb8ec7b4372e664b5f7da1b3b7ed3 |
container_end_page | 15 |
container_issue | |
container_start_page | 3 |
container_title | Neurocomputing (Amsterdam) |
container_volume | 180 |
creator | Spolaôr, Newton ; Monard, Maria Carolina ; Tsoumakas, Grigorios ; Lee, Huei Diana |
description | Each example in a multi-label dataset is associated with multiple labels, which are often correlated. Learning from this data can be improved when dimensionality reduction tasks, such as feature selection, are applied. The standard approach for multi-label feature selection transforms the multi-label dataset into single-label datasets before using traditional feature selection algorithms. However, this approach often ignores label dependence. In this work, we propose an alternative method, LCFS, that constructs new labels based on relations between the original labels. By doing so, the label set from the data is augmented with second-order information before applying the standard approach. To assess LCFS, an experimental evaluation using Information Gain as a measure to estimate the importance of features was carried out on 10 benchmark multi-label datasets. This evaluation compared four LCFS settings with the standard approach, using random feature selection as a reference. For each dataset, the performance of a feature selection method is estimated by the quality of the classifiers built from the data described by the features selected by the method. The results show that a simple LCFS setting gave rise to classifiers similar to, or better than, the ones built using the standard approach. Furthermore, this work also pioneers the use of the systematic review method to survey the related work on multi-label feature selection. The summary of the 99 papers found promotes the idea that exploring label dependence during feature selection can lead to good results.
• By constructing new labels, LCFS considers label relations from a multi-label dataset. • An LCFS setting achieved performance competitive with the standard approach. • LCFS contributed to outperforming classifiers based on experimental references. • We also pioneer the use of the systematic review method on the multi-label feature selection literature. • The summary of the 99 papers found evidence that agrees with the LCFS results. |
doi_str_mv | 10.1016/j.neucom.2015.07.118 |
format | article |
fullrecord | sourceid: proquest_cross ; recordid: TN_cdi_proquest_miscellaneous_1793240314 ; els_id: S0925231215016197 ; publisher: Elsevier B.V ; rights: 2015 Elsevier B.V. ; date: 2016-03-05 ; tpages: 13 ; toplevel: peer_reviewed, online_resources ; orcid: https://orcid.org/0000-0002-7879-669X, https://orcid.org/0000-0001-6004-7407, https://orcid.org/0000-0003-0748-3693 ; collection: CrossRef, Computer and Information Systems Abstracts, Technology Research Database, ProQuest Computer Science Collection, Advanced Technologies Database with Aerospace, Computer and Information Systems Abstracts Academic, Computer and Information Systems Abstracts Professional |
fulltext | fulltext |
identifier | ISSN: 0925-2312 |
ispartof | Neurocomputing (Amsterdam), 2016-03, Vol.180, p.3-15 |
issn | 0925-2312 1872-8286 |
language | eng |
recordid | cdi_proquest_miscellaneous_1793240314 |
source | ScienceDirect Freedom Collection |
subjects | Binary relevance ; Classifiers ; Construction ; Feature ranking ; Filter feature selection ; Gain ; Information gain ; Labels ; Learning ; Low cycle fatigue ; Systematic review ; Tasks |
title | A systematic review of multi-label feature selection and a new method based on label construction |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-14T02%3A11%3A11IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20systematic%20review%20of%20multi-label%20feature%20selection%20and%20a%20new%20method%20based%20on%20label%20construction&rft.jtitle=Neurocomputing%20(Amsterdam)&rft.au=Spola%C3%B4r,%20Newton&rft.date=2016-03-05&rft.volume=180&rft.spage=3&rft.epage=15&rft.pages=3-15&rft.issn=0925-2312&rft.eissn=1872-8286&rft_id=info:doi/10.1016/j.neucom.2015.07.118&rft_dat=%3Cproquest_cross%3E1793240314%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c392t-3a44c8ca1b80ba2c368e372fd7f3d965d789fb8ec7b4372e664b5f7da1b3b7ed3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1793240314&rft_id=info:pmid/&rfr_iscdi=true |
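
The abstract describes LCFS only at a high level: new labels are built from relations between the original labels, and the standard approach (single-label transformation plus Information Gain) is then applied to the augmented label set. The following is a minimal sketch of that idea, not the authors' implementation. Several details are assumptions made for illustration: the pairwise XOR used to build new labels, the averaging of Information Gain over labels, and all function names (`entropy`, `information_gain`, `construct_pair_labels`, `rank_features`).

```python
from itertools import combinations
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())


def information_gain(feature, label):
    """IG(label; feature) = H(label) - H(label | feature), discrete data."""
    n = len(feature)
    groups = {}
    for f, y in zip(feature, label):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


def construct_pair_labels(Y, combine=lambda a, b: a ^ b):
    """Append one new label per pair of original labels.

    ASSUMPTION: XOR is used as the pairwise relation; the abstract does
    not say which relations LCFS actually uses.
    """
    pairs = list(combinations(range(len(Y[0])), 2))
    return [list(row) + [combine(row[i], row[j]) for i, j in pairs]
            for row in Y]


def rank_features(X, Y):
    """Score each feature by its mean IG over all label columns.

    ASSUMPTION: per-label scores are averaged. Returns the feature
    indices ordered best first, together with the raw scores.
    """
    label_columns = list(zip(*Y))
    scores = []
    for feature in zip(*X):
        gains = [information_gain(feature, lab) for lab in label_columns]
        scores.append(sum(gains) / len(gains))
    order = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
    return order, scores


# Toy data: feature 1 equals y0 XOR y1, so it carries no information
# about either original label taken alone.
X = [[0, 0, 0], [0, 1, 1], [1, 1, 0], [1, 0, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

_, standard_scores = rank_features(X, Y)
_, augmented_scores = rank_features(X, construct_pair_labels(Y))
```

In this toy setting, feature 1 scores zero when only the original labels are considered and a positive value once the constructed label is added, which illustrates how second-order label information can change a feature ranking. The real method may differ in which relations it constructs and in how scores are aggregated.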