
Improved swarm-optimization-based filter-wrapper gene selection from microarray data for gene expression tumor classification


Bibliographic Details
Published in:Pattern analysis and applications : PAA 2023-05, Vol.26 (2), p.455-472
Main Authors: Ke, Lin; Li, Min; Wang, Lei; Deng, Shaobo; Ye, Jun; Yu, Xiang
Format: Article
Language:English
Description
Summary:A typical microarray dataset contains thousands of genes but only a small number of samples, and most genes in a DNA microarray dataset are not relevant for classification. Identifying highly discriminating genes, known as biomarkers, is a challenging task for machine learning-based tumor classification. This study focuses on swarm-optimization-based filter-wrapper gene selection. In general, this type of hybrid gene selection consists of two steps: the filter step ranks the genes and keeps only a small top-n percentage of them, yielding reduced data; the wrapper step then uses a swarm-optimization-based algorithm to search the remaining genes for the optimal gene subset. However, in existing swarm-optimization-based filter-wrapper methods, the second step searches the remaining genes without using their ranking information. This study attempts to fill that gap. It proposes population initialization based on ranking criteria (PIRC) to modify the population initialization of the genetic algorithm (GA) and ant colony optimization (ACO); the resulting methods are called PIRCGA and PIRCACO, respectively. Experiments were carried out on 17 microarray expression datasets, comparing IG-GA with IG-PIRCGA and IG-ACO with IG-PIRCACO. The experimental results demonstrate the efficiency of the proposed methods.
ISSN:1433-7541
1433-755X
DOI:10.1007/s10044-022-01117-9
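
The two-step scheme in the summary can be illustrated with a minimal Python sketch. The summary does not say how PIRC uses the ranking, so this is only one plausible reading and not the authors' method: each gene's chance of being switched on in an initial GA individual is assumed to fall linearly with its filter rank. The function names, the p_top and p_bottom parameters, and the example scores are all hypothetical, and the filter scores are taken as given rather than computed.

```python
import random


def filter_top_fraction(scores, fraction):
    """Filter step: return the indices of the top `fraction` of genes, best first."""
    n_keep = max(1, int(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return ranked[:n_keep]


def rank_biased_population(n_genes, pop_size, p_top=0.9, p_bottom=0.1, rng=None):
    """Build an initial GA population of bit masks over the kept genes.

    Position 0 is the best-ranked gene. Assumption (not from the paper):
    the inclusion probability falls linearly from p_top to p_bottom with rank.
    """
    rng = rng or random.Random()
    if n_genes == 1:
        probs = [p_top]
    else:
        step = (p_top - p_bottom) / (n_genes - 1)
        probs = [p_top - step * r for r in range(n_genes)]
    return [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]


# Hypothetical filter scores for 10 genes (higher means more relevant).
scores = [0.05, 0.80, 0.10, 0.60, 0.02, 0.40, 0.30, 0.01, 0.70, 0.20]
kept = filter_top_fraction(scores, 0.5)
population = rank_biased_population(len(kept), pop_size=200, rng=random.Random(0))
```

A wrapper search would then evolve this population, scoring each mask by the accuracy of a classifier trained on the selected genes. A uniform initialization, where every kept gene has the same inclusion probability, corresponds to the baseline the summary says ignores the ranking information.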