
A novel feature selection method for text classification using association rules and clustering

Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification,...

Bibliographic Details
Published in: Journal of information science 2015-02, Vol.41 (1), p.3-15
Main Authors: Sheydaei, Navid; Saraee, Mohamad; Shahgholian, Azar
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3
cites cdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3
container_end_page 15
container_issue 1
container_start_page 3
container_title Journal of information science
container_volume 41
creator Sheydaei, Navid
Saraee, Mohamad
Shahgholian, Azar
description Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.
doi_str_mv 10.1177/0165551514550143
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1667945517</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_0165551514550143</sage_id><sourcerecordid>3561163021</sourcerecordid><originalsourceid>FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</originalsourceid><addsrcrecordid>eNqNkc1LxDAQxYMouK7ePQa8eKlmmq_2uCx-wYKXvZeYTtcu3WZNUtH_3tR6kAXB08B7v3kwbwi5BHYDoPUtAyWlBAlCSgaCH5EZaAGZEoU8JrPRzkb_lJyFsGWMyZKLGakWtHfv2NEGTRw80oAd2ti6nu4wvrqaNs7TiB-R2s6E0DatNd_2ENp-Q5PkbDspfugwUNPXCR1CRJ-Ac3LSmC7gxc-ck_X93Xr5mK2eH56Wi1VmBeiYlaVAXTJR58I2zKhccsnVi2Z5UQuTPJC8kEbZ2sgc8ho51ozlpZWmboTlc3I9xe69exswxGrXBotdZ3p0Q6hAKV2mZkD_B2VMpb6KhF4doFs3-D7dkSihOIdcjYFsoqx3IXhsqr1vd8Z_VsCq8TfV4W_SSjatBLPBX6F_8V8TDI2D</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1646331267</pqid></control><display><type>article</type><title>A novel feature selection method for text classification using association rules and clustering</title><source>Library &amp; Information Science Abstracts (LISA)</source><source>SAGE</source><creator>Sheydaei, Navid ; Saraee, Mohamad ; Shahgholian, Azar</creator><creatorcontrib>Sheydaei, Navid ; Saraee, Mohamad ; Shahgholian, Azar</creatorcontrib><description>Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. 
In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.</description><identifier>ISSN: 0165-5515</identifier><identifier>EISSN: 1741-6485</identifier><identifier>DOI: 10.1177/0165551514550143</identifier><identifier>CODEN: JISCDI</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Acceptability ; Accuracy ; Algorithms ; Classification ; Classifiers ; Clustering ; Feature extraction ; Feature selection ; Studies ; Tasks ; Text categorization ; Texts ; Training ; Vocabularies &amp; taxonomies</subject><ispartof>Journal of information science, 2015-02, Vol.41 (1), p.3-15</ispartof><rights>The Author(s) 2014</rights><rights>Copyright Bowker-Saur Ltd. 
Feb 2015</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</citedby><cites>FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925,34135,34136,79364</link.rule.ids></links><search><creatorcontrib>Sheydaei, Navid</creatorcontrib><creatorcontrib>Saraee, Mohamad</creatorcontrib><creatorcontrib>Shahgholian, Azar</creatorcontrib><title>A novel feature selection method for text classification using association rules and clustering</title><title>Journal of information science</title><description>Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. 
The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.</description><subject>Acceptability</subject><subject>Accuracy</subject><subject>Algorithms</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Clustering</subject><subject>Feature extraction</subject><subject>Feature selection</subject><subject>Studies</subject><subject>Tasks</subject><subject>Text categorization</subject><subject>Texts</subject><subject>Training</subject><subject>Vocabularies &amp; taxonomies</subject><issn>0165-5515</issn><issn>1741-6485</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>F2A</sourceid><recordid>eNqNkc1LxDAQxYMouK7ePQa8eKlmmq_2uCx-wYKXvZeYTtcu3WZNUtH_3tR6kAXB08B7v3kwbwi5BHYDoPUtAyWlBAlCSgaCH5EZaAGZEoU8JrPRzkb_lJyFsGWMyZKLGakWtHfv2NEGTRw80oAd2ti6nu4wvrqaNs7TiB-R2s6E0DatNd_2ENp-Q5PkbDspfugwUNPXCR1CRJ-Ac3LSmC7gxc-ck_X93Xr5mK2eH56Wi1VmBeiYlaVAXTJR58I2zKhccsnVi2Z5UQuTPJC8kEbZ2sgc8ho51ozlpZWmboTlc3I9xe69exswxGrXBotdZ3p0Q6hAKV2mZkD_B2VMpb6KhF4doFs3-D7dkSihOIdcjYFsoqx3IXhsqr1vd8Z_VsCq8TfV4W_SSjatBLPBX6F_8V8TDI2D</recordid><startdate>20150201</startdate><enddate>20150201</enddate><creator>Sheydaei, Navid</creator><creator>Saraee, Mohamad</creator><creator>Shahgholian, Azar</creator><general>SAGE Publications</general><general>Bowker-Saur Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>8BP</scope></search><sort><creationdate>20150201</creationdate><title>A novel feature selection method for text classification using association rules and clustering</title><author>Sheydaei, Navid ; Saraee, Mohamad ; Shahgholian, 
Azar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Acceptability</topic><topic>Accuracy</topic><topic>Algorithms</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Clustering</topic><topic>Feature extraction</topic><topic>Feature selection</topic><topic>Studies</topic><topic>Tasks</topic><topic>Text categorization</topic><topic>Texts</topic><topic>Training</topic><topic>Vocabularies &amp; taxonomies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sheydaei, Navid</creatorcontrib><creatorcontrib>Saraee, Mohamad</creatorcontrib><creatorcontrib>Shahgholian, Azar</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library &amp; Information Sciences Abstracts (LISA)</collection><collection>Library &amp; Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Library &amp; Information Sciences Abstracts (LISA) - CILIP Edition</collection><jtitle>Journal of information science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sheydaei, Navid</au><au>Saraee, Mohamad</au><au>Shahgholian, Azar</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel feature selection method for text classification using association rules and 
clustering</atitle><jtitle>Journal of information science</jtitle><date>2015-02-01</date><risdate>2015</risdate><volume>41</volume><issue>1</issue><spage>3</spage><epage>15</epage><pages>3-15</pages><issn>0165-5515</issn><eissn>1741-6485</eissn><coden>JISCDI</coden><abstract>Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/0165551514550143</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0165-5515
ispartof Journal of information science, 2015-02, Vol.41 (1), p.3-15
issn 0165-5515
1741-6485
language eng
recordid cdi_proquest_miscellaneous_1667945517
source Library & Information Science Abstracts (LISA); SAGE
subjects Acceptability
Accuracy
Algorithms
Classification
Classifiers
Clustering
Feature extraction
Feature selection
Studies
Tasks
Text categorization
Texts
Training
Vocabularies & taxonomies
title A novel feature selection method for text classification using association rules and clustering
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T22%3A44%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20feature%20selection%20method%20for%20text%20classification%20using%20association%20rules%20and%20clustering&rft.jtitle=Journal%20of%20information%20science&rft.au=Sheydaei,%20Navid&rft.date=2015-02-01&rft.volume=41&rft.issue=1&rft.spage=3&rft.epage=15&rft.pages=3-15&rft.issn=0165-5515&rft.eissn=1741-6485&rft.coden=JISCDI&rft_id=info:doi/10.1177/0165551514550143&rft_dat=%3Cproquest_cross%3E3561163021%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1646331267&rft_id=info:pmid/&rft_sage_id=10.1177_0165551514550143&rfr_iscdi=true