A novel feature selection method for text classification using association rules and clustering
Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification,...
Published in: | Journal of information science 2015-02, Vol.41 (1), p.3-15 |
---|---|
Main Authors: | Sheydaei, Navid; Saraee, Mohamad; Shahgholian, Azar |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3 |
---|---|
cites | cdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3 |
container_end_page | 15 |
container_issue | 1 |
container_start_page | 3 |
container_title | Journal of information science |
container_volume | 41 |
creator | Sheydaei, Navid; Saraee, Mohamad; Shahgholian, Azar |
description | Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance. |
doi_str_mv | 10.1177/0165551514550143 |
format | article |
fullrecord | <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_1667945517</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sage_id>10.1177_0165551514550143</sage_id><sourcerecordid>3561163021</sourcerecordid><originalsourceid>FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</originalsourceid><addsrcrecordid>eNqNkc1LxDAQxYMouK7ePQa8eKlmmq_2uCx-wYKXvZeYTtcu3WZNUtH_3tR6kAXB08B7v3kwbwi5BHYDoPUtAyWlBAlCSgaCH5EZaAGZEoU8JrPRzkb_lJyFsGWMyZKLGakWtHfv2NEGTRw80oAd2ti6nu4wvrqaNs7TiB-R2s6E0DatNd_2ENp-Q5PkbDspfugwUNPXCR1CRJ-Ac3LSmC7gxc-ck_X93Xr5mK2eH56Wi1VmBeiYlaVAXTJR58I2zKhccsnVi2Z5UQuTPJC8kEbZ2sgc8ho51ozlpZWmboTlc3I9xe69exswxGrXBotdZ3p0Q6hAKV2mZkD_B2VMpb6KhF4doFs3-D7dkSihOIdcjYFsoqx3IXhsqr1vd8Z_VsCq8TfV4W_SSjatBLPBX6F_8V8TDI2D</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>1646331267</pqid></control><display><type>article</type><title>A novel feature selection method for text classification using association rules and clustering</title><source>Library & Information Science Abstracts (LISA)</source><source>SAGE</source><creator>Sheydaei, Navid ; Saraee, Mohamad ; Shahgholian, Azar</creator><creatorcontrib>Sheydaei, Navid ; Saraee, Mohamad ; Shahgholian, Azar</creatorcontrib><description>Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. 
In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.</description><identifier>ISSN: 0165-5515</identifier><identifier>EISSN: 1741-6485</identifier><identifier>DOI: 10.1177/0165551514550143</identifier><identifier>CODEN: JISCDI</identifier><language>eng</language><publisher>London, England: SAGE Publications</publisher><subject>Acceptability ; Accuracy ; Algorithms ; Classification ; Classifiers ; Clustering ; Feature extraction ; Feature selection ; Studies ; Tasks ; Text categorization ; Texts ; Training ; Vocabularies & taxonomies</subject><ispartof>Journal of information science, 2015-02, Vol.41 (1), p.3-15</ispartof><rights>The Author(s) 2014</rights><rights>Copyright Bowker-Saur Ltd. 
Feb 2015</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</citedby><cites>FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925,34135,34136,79364</link.rule.ids></links><search><creatorcontrib>Sheydaei, Navid</creatorcontrib><creatorcontrib>Saraee, Mohamad</creatorcontrib><creatorcontrib>Shahgholian, Azar</creatorcontrib><title>A novel feature selection method for text classification using association rules and clustering</title><title>Journal of information science</title><description>Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. 
The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.</description><subject>Acceptability</subject><subject>Accuracy</subject><subject>Algorithms</subject><subject>Classification</subject><subject>Classifiers</subject><subject>Clustering</subject><subject>Feature extraction</subject><subject>Feature selection</subject><subject>Studies</subject><subject>Tasks</subject><subject>Text categorization</subject><subject>Texts</subject><subject>Training</subject><subject>Vocabularies & taxonomies</subject><issn>0165-5515</issn><issn>1741-6485</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2015</creationdate><recordtype>article</recordtype><sourceid>F2A</sourceid><recordid>eNqNkc1LxDAQxYMouK7ePQa8eKlmmq_2uCx-wYKXvZeYTtcu3WZNUtH_3tR6kAXB08B7v3kwbwi5BHYDoPUtAyWlBAlCSgaCH5EZaAGZEoU8JrPRzkb_lJyFsGWMyZKLGakWtHfv2NEGTRw80oAd2ti6nu4wvrqaNs7TiB-R2s6E0DatNd_2ENp-Q5PkbDspfugwUNPXCR1CRJ-Ac3LSmC7gxc-ck_X93Xr5mK2eH56Wi1VmBeiYlaVAXTJR58I2zKhccsnVi2Z5UQuTPJC8kEbZ2sgc8ho51ozlpZWmboTlc3I9xe69exswxGrXBotdZ3p0Q6hAKV2mZkD_B2VMpb6KhF4doFs3-D7dkSihOIdcjYFsoqx3IXhsqr1vd8Z_VsCq8TfV4W_SSjatBLPBX6F_8V8TDI2D</recordid><startdate>20150201</startdate><enddate>20150201</enddate><creator>Sheydaei, Navid</creator><creator>Saraee, Mohamad</creator><creator>Shahgholian, Azar</creator><general>SAGE Publications</general><general>Bowker-Saur Ltd</general><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>8FD</scope><scope>E3H</scope><scope>F2A</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><scope>8BP</scope></search><sort><creationdate>20150201</creationdate><title>A novel feature selection method for text classification using association rules and clustering</title><author>Sheydaei, Navid ; Saraee, Mohamad ; Shahgholian, 
Azar</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2015</creationdate><topic>Acceptability</topic><topic>Accuracy</topic><topic>Algorithms</topic><topic>Classification</topic><topic>Classifiers</topic><topic>Clustering</topic><topic>Feature extraction</topic><topic>Feature selection</topic><topic>Studies</topic><topic>Tasks</topic><topic>Text categorization</topic><topic>Texts</topic><topic>Training</topic><topic>Vocabularies & taxonomies</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sheydaei, Navid</creatorcontrib><creatorcontrib>Saraee, Mohamad</creatorcontrib><creatorcontrib>Shahgholian, Azar</creatorcontrib><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Technology Research Database</collection><collection>Library & Information Sciences Abstracts (LISA)</collection><collection>Library & Information Science Abstracts (LISA)</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts – Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><collection>Library & Information Sciences Abstracts (LISA) - CILIP Edition</collection><jtitle>Journal of information science</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sheydaei, Navid</au><au>Saraee, Mohamad</au><au>Shahgholian, Azar</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>A novel feature selection method for text classification using association rules and clustering</atitle><jtitle>Journal of 
information science</jtitle><date>2015-02-01</date><risdate>2015</risdate><volume>41</volume><issue>1</issue><spage>3</spage><epage>15</epage><pages>3-15</pages><issn>0165-5515</issn><eissn>1741-6485</eissn><coden>JISCDI</coden><abstract>Readability and accuracy are two important features of any good classifier. For reasons such as acceptable accuracy, rapid training and high interpretability, associative classifiers have recently been used in many categorization tasks. Although features could be very useful in text classification, both training time and the number of produced rules will increase significantly owing to the high dimensionality of text documents. In this paper an association classification algorithm for text classification is proposed that includes a feature selection phase to select important features and a clustering phase based on class labels to tackle this shortcoming. The experimental results from applying the proposed algorithm in comparison with the results of selected well-known classification algorithms show that our approach outperforms others both in efficiency and in performance.</abstract><cop>London, England</cop><pub>SAGE Publications</pub><doi>10.1177/0165551514550143</doi><tpages>13</tpages><oa>free_for_read</oa></addata></record> |
fulltext | fulltext |
identifier | ISSN: 0165-5515 |
ispartof | Journal of information science, 2015-02, Vol.41 (1), p.3-15 |
issn | 0165-5515; 1741-6485 |
language | eng |
recordid | cdi_proquest_miscellaneous_1667945517 |
source | Library & Information Science Abstracts (LISA); SAGE |
subjects | Acceptability; Accuracy; Algorithms; Classification; Classifiers; Clustering; Feature extraction; Feature selection; Studies; Tasks; Text categorization; Texts; Training; Vocabularies & taxonomies |
title | A novel feature selection method for text classification using association rules and clustering |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T22%3A44%3A08IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20novel%20feature%20selection%20method%20for%20text%20classification%20using%20association%20rules%20and%20clustering&rft.jtitle=Journal%20of%20information%20science&rft.au=Sheydaei,%20Navid&rft.date=2015-02-01&rft.volume=41&rft.issue=1&rft.spage=3&rft.epage=15&rft.pages=3-15&rft.issn=0165-5515&rft.eissn=1741-6485&rft.coden=JISCDI&rft_id=info:doi/10.1177/0165551514550143&rft_dat=%3Cproquest_cross%3E3561163021%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c417t-994e7904d24cf0a6253536b7028d4a4e715385a6cda5212de3ed0029c5adf4c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1646331267&rft_id=info:pmid/&rft_sage_id=10.1177_0165551514550143&rfr_iscdi=true |