Nearest neighbor classification of categorical data by attributes weighting

• An effective solution for nearest neighbor classification on categorical data.
• Two global attribute-weighting approaches applied for categorical data classification.
• Two local attribute-weighting approaches applied for categorical data classification.
• Strong results of the new classifiers compared...

Bibliographic Details
Published in: Expert systems with applications 2015-04, Vol.42 (6), p.3142-3149
Main Authors: Chen, Lifei; Guo, Gongde
Format: Article
Language: English
container_end_page 3149
container_issue 6
container_start_page 3142
container_title Expert systems with applications
container_volume 42
creator Chen, Lifei
Guo, Gongde
description • An effective solution for nearest neighbor classification on categorical data.
• Two global attribute-weighting approaches applied for categorical data classification.
• Two local attribute-weighting approaches applied for categorical data classification.
• Strong results of the new classifiers compared with the traditional kNN and the decision tree.
• Detailed analysis on the different behaviors of the various attribute-weighting methods.
Subspace classification of categorical data is an essential process for many real-world applications such as computer-aided medical diagnosis and collaborative recommendation. Nearest neighbor classifiers have attracted wide interest in these applications because of their simplicity and flexibility. However, they become ineffective when applied to categorical data, because there is no well-defined distance measure for computing dissimilarities between categorical samples in the projected subspaces. In this paper, we tackle the problem by defining a series of weighted distance functions for categorical attributes and applying them to derive new nearest neighbor classifiers. Four attribute-weighting measures are proposed: two based on global feature-ranking approaches and two on local approaches. Experimental results on real categorical data sets demonstrate that all four classifiers consistently outperform the traditional methods, and show the suitability of the proposal for real applications in terms of automated feature selection.
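The general idea in the abstract, a nearest neighbor classifier built on an attribute-weighted distance for categorical data, can be sketched roughly as follows. This record does not give the paper's four weighting measures, so the sketch substitutes a plainly different, generic choice: a global weight per attribute proportional to its mutual information with the class label, combined with a weighted simple-matching (mismatch) distance. All function names here are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log


def mutual_information(column, labels):
    """Empirical mutual information (in nats) between one attribute and the class."""
    n = len(labels)
    joint = Counter(zip(column, labels))
    px = Counter(column)
    py = Counter(labels)
    return sum((c / n) * log(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


def global_weights(X, y):
    """ASSUMPTION: one global weight per attribute, normalized mutual information.
    This stands in for the paper's weighting measures, which this record omits."""
    d = len(X[0])
    scores = [mutual_information([row[j] for row in X], y) for j in range(d)]
    total = sum(scores)
    if total == 0:
        return [1.0 / d] * d  # fall back to uniform weights
    return [s / total for s in scores]


def weighted_mismatch(a, b, w):
    """Weighted simple-matching distance: sum the weights of attributes that differ."""
    return sum(wj for aj, bj, wj in zip(a, b, w) if aj != bj)


def knn_predict(X, y, query, w, k=3):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(range(len(X)), key=lambda i: weighted_mismatch(X[i], query, w))
    votes = Counter(y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: the first attribute determines the class, the second is noise.
    X = [("red", "small"), ("red", "large"), ("blue", "small"), ("blue", "large")]
    y = ["a", "a", "b", "b"]
    w = global_weights(X, y)
    print(w, knn_predict(X, y, ("blue", "small"), w, k=3))
```

In the toy data the uninformative attribute receives zero weight, so it no longer contributes to the distance; this is the sense in which attribute weighting can act as a soft form of feature selection. A local scheme would instead compute weights per class or per neighborhood rather than once for the whole training set.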
doi_str_mv 10.1016/j.eswa.2014.12.002
format article
fulltext fulltext
identifier ISSN: 0957-4174
ispartof Expert systems with applications, 2015-04, Vol.42 (6), p.3142-3149
issn 0957-4174
1873-6793
language eng
recordid cdi_proquest_miscellaneous_1685772274
source ScienceDirect Journals
subjects Attribute weighting
Categorical data
Classification
Classifiers
Distance measure
Expert systems
Feature selection
Flexibility
Mathematical analysis
Mathematical models
Nearest neighbor classification
Projected subspace
Proposals
Subspaces
title Nearest neighbor classification of categorical data by attributes weighting
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-08T09%3A14%3A44IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Nearest%20neighbor%20classification%20of%20categorical%20data%20by%20attributes%20weighting&rft.jtitle=Expert%20systems%20with%20applications&rft.au=Chen,%20Lifei&rft.date=2015-04-15&rft.volume=42&rft.issue=6&rft.spage=3142&rft.epage=3149&rft.pages=3142-3149&rft.issn=0957-4174&rft.eissn=1873-6793&rft_id=info:doi/10.1016/j.eswa.2014.12.002&rft_dat=%3Cproquest_cross%3E1685772274%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c333t-e2fd61a78fee35ec9f0853e358d66e5376518a5ced8ed94f03568919b19a12013%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1685772274&rft_id=info:pmid/&rfr_iscdi=true