Feature clustering based support vector machine recursive feature elimination for gene selection
In a DNA microarray dataset, gene expression data often has a huge number of features (referred to as genes) but only a small number of samples. As DNA microarray technology develops, the number of dimensions grows ever faster, which can lead to the problem of the...
Published in: | Applied intelligence (Dordrecht, Netherlands), 2018-03, Vol.48 (3), p.594-607 |
---|---|
Main Authors: | Huang, Xiaojuan; Zhang, Li; Wang, Bangjun; Li, Fanzhang; Zhang, Zhao |
Format: | Article |
Language: | English |
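Subjects: | Artificial Intelligence; Classification; Clustering; Complexity; Computation; Computer Science; Deoxyribonucleic acid; DNA; Gene expression; Genes; Machines; Manufacturing; Mechanical Engineering; Processes; Recursive methods; Redundancy; Support vector machines |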
cited_by | cdi_FETCH-LOGICAL-c382t-ac3678ab7fc6d51c60d8ed3ee305a142cd0de88f9344892f9782d79d73caea903 |
---|---|
cites | cdi_FETCH-LOGICAL-c382t-ac3678ab7fc6d51c60d8ed3ee305a142cd0de88f9344892f9782d79d73caea903 |
container_end_page | 607 |
container_issue | 3 |
container_start_page | 594 |
container_title | Applied intelligence (Dordrecht, Netherlands) |
container_volume | 48 |
creator | Huang, Xiaojuan; Zhang, Li; Wang, Bangjun; Li, Fanzhang; Zhang, Zhao |
description | In a DNA microarray dataset, gene expression data often has a huge number of features (referred to as genes) but only a small number of samples. As DNA microarray technology develops, the number of dimensions grows ever faster, which can lead to the curse of dimensionality. To achieve good classification performance, it is necessary to preprocess the gene expression data. Support vector machine recursive feature elimination (SVM-RFE) is a classical method for gene selection. However, SVM-RFE suffers from high computational complexity. To remedy this, the paper enhances SVM-RFE for gene selection by incorporating feature clustering; the resulting method is called feature clustering SVM-RFE (FCSVM-RFE). The proposed method first performs a rough gene selection and then ranks the selected genes. First, a clustering algorithm groups genes into gene groups, in each of which the genes have similar expression profiles. Then, a representative gene is chosen for each gene group, which yields a representative gene set. Finally, SVM-RFE is applied to rank these representative genes. FCSVM-RFE can reduce both the computational complexity and the redundancy among genes. Experiments on seven public gene expression datasets show that FCSVM-RFE can achieve better classification performance and lower computational complexity than state-of-the-art methods such as SVM-RFE. |
doi_str_mv | 10.1007/s10489-017-0992-2 |
format | article |
publisher | New York: Springer US |
rights | Springer Science+Business Media New York 2017 |
peer_reviewed | true |
short_title | Appl Intell |
date | 2018-03-01 |
total_pages | 14 |
fulltext | fulltext |
identifier | ISSN: 0924-669X |
ispartof | Applied intelligence (Dordrecht, Netherlands), 2018-03, Vol.48 (3), p.594-607 |
issn | 0924-669X; 1573-7497 |
language | eng |
recordid | cdi_proquest_journals_1999510920 |
source | ABI/INFORM Global; Springer Link |
subjects | Artificial Intelligence; Classification; Clustering; Complexity; Computation; Computer Science; Deoxyribonucleic acid; DNA; Gene expression; Genes; Machines; Manufacturing; Mechanical Engineering; Processes; Recursive methods; Redundancy; Support vector machines |
title | Feature clustering based support vector machine recursive feature elimination for gene selection |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-30T01%3A22%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Feature%20clustering%20based%20support%20vector%20machine%20recursive%20feature%20elimination%20for%20gene%20selection&rft.jtitle=Applied%20intelligence%20(Dordrecht,%20Netherlands)&rft.au=Huang,%20Xiaojuan&rft.date=2018-03-01&rft.volume=48&rft.issue=3&rft.spage=594&rft.epage=607&rft.pages=594-607&rft.issn=0924-669X&rft.eissn=1573-7497&rft_id=info:doi/10.1007/s10489-017-0992-2&rft_dat=%3Cproquest_cross%3E1999510920%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c382t-ac3678ab7fc6d51c60d8ed3ee305a142cd0de88f9344892f9782d79d73caea903%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1999510920&rft_id=info:pmid/&rfr_iscdi=true |
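
The abstract describes FCSVM-RFE as a three-step pipeline: cluster genes with similar expression profiles, keep one representative gene per cluster, then rank the representatives with SVM-RFE. The following is a minimal sketch of that pipeline, not the authors' implementation. Several choices are assumptions made for illustration because the record does not specify them: genes are grouped by a greedy correlation-threshold ("leader") clustering, the representative is the member most correlated with the rest of its cluster, the linear SVM is trained with a simple Pegasos-style subgradient loop without a bias term, and genes are eliminated one at a time by smallest squared weight. All function names and parameters (`cluster_genes`, `representative`, `fcsvm_rfe`, `threshold`, `lam`, `epochs`) are hypothetical.

```python
import math


def correlation(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b) if var_a > 0 and var_b > 0 else 0.0


def cluster_genes(columns, threshold):
    """Assumed clustering step: a gene joins the first cluster whose leader
    (first member) it correlates with at |r| >= threshold, else starts a new one."""
    clusters = []
    for j, profile in enumerate(columns):
        for members in clusters:
            if abs(correlation(profile, columns[members[0]])) >= threshold:
                members.append(j)
                break
        else:
            clusters.append([j])
    return clusters


def representative(columns, members):
    """Assumed choice: the member with the highest mean |r| to the other members."""
    if len(members) == 1:
        return members[0]

    def mean_abs_r(g):
        others = [h for h in members if h != g]
        return sum(abs(correlation(columns[g], columns[h])) for h in others) / len(others)

    return max(members, key=mean_abs_r)


def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Linear SVM weights from a deterministic Pegasos-style subgradient loop
    (no bias term, so features are assumed to be roughly centered)."""
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            w = [wj * (1.0 - eta * lam) for wj in w]  # regularization shrink
            if margin < 1:  # hinge loss is active for this sample
                w = [wj + eta * yi * xj for wj, xj in zip(w, xi)]
    return w


def svm_rfe(X, y, genes):
    """Rank genes by repeatedly retraining and dropping the smallest squared weight."""
    remaining, eliminated = list(genes), []
    while remaining:
        X_sub = [[row[g] for g in remaining] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]  # most important gene first


def fcsvm_rfe(X, y, threshold=0.9):
    """Cluster genes, keep one representative per cluster, rank them with SVM-RFE."""
    columns = [[row[j] for row in X] for j in range(len(X[0]))]
    clusters = cluster_genes(columns, threshold)
    reps = [representative(columns, members) for members in clusters]
    return svm_rfe(X, y, reps), clusters


if __name__ == "__main__":
    # Toy data: rows are samples, columns are genes; labels are +1 / -1.
    X = [[2.0, 4.1, 0.1, 0.3], [2.2, 4.3, -0.2, -0.3], [1.8, 3.7, 0.15, 0.0],
         [-2.0, -4.0, -0.1, 0.3], [-2.1, -4.2, 0.2, -0.3], [-1.9, -3.8, -0.15, 0.0]]
    y = [1, 1, 1, -1, -1, -1]
    ranking, clusters = fcsvm_rfe(X, y)
    print("clusters:", clusters)
    print("ranking of representative genes:", ranking)
```

Because SVM-RFE retrains once per eliminated gene, running it only on the representatives means the number of retraining rounds scales with the number of clusters rather than the number of genes, which is the source of the complexity reduction the abstract claims. In practice a library SVM solver would replace the hand-written training loop.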