
RIFS2D: A two-dimensional version of a randomly restarted incremental feature selection algorithm with an application for detecting low-ranked biomarkers

Bibliographic Details
Published in: Computers in biology and medicine 2021-06, Vol.133, p.104405-104405, Article 104405
Main Authors: Gao, Sida, Wang, Puli, Feng, Yuming, Xie, Xuchen, Duan, Meiyu, Fan, Yusi, Liu, Shuai, Huang, Lan, Zhou, Fengfeng
Format: Article
Language: English
Description
Summary: The era of big data introduces both opportunities and challenges for biomedical researchers. An inherent difficulty in biomedical research is that large cohorts of samples are hard to recruit, while high-throughput biotechnologies may produce thousands or even millions of features for each sample. Researchers tend to evaluate the individual correlation of each feature with the class label and use the incremental feature selection (IFS) strategy to select the top-ranked features with the best prediction performance. Recent experimental data showed that a subset of consecutively ranked features randomly restarted from a low-ranked feature (an RIFS block) may outperform the subset of top-ranked features. This study proposed a feature selection algorithm, RIFS2D, that integrates multiple RIFS blocks. A comprehensive comparative experiment against IFS, RIFS, and existing feature selection algorithms demonstrated that a subset of low-ranked features may also achieve promising prediction performance. The results suggest that a well-performing prediction model may be trained on low-ranked features even when the top-ranked features do not achieve satisfactory prediction performance. Further comparative experiments were conducted between RIFS2D and t-tests for the detection of early-stage breast cancer. The data showed that the RIFS2D-recommended features achieved better prediction accuracy and were targeted by more drugs than the t-test top-ranked features.
• Many studies tend to select biomarkers from the top features ranked by filters.
• This study hypothesizes that a group of low-ranked features may perform better.
• The proposed algorithm RIFS2D supported the hypothesis.
• RIFS2D outperformed the filter, wrapper, and hybrid feature selection algorithms.
• The RIFS2D features were targeted by more drugs than the t-test top-ranked ones.
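The difference between IFS and a single RIFS block, as described in the summary, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the Welch t-score ranking, the nearest-centroid classifier, and the leave-one-out evaluation are all assumptions chosen to keep the example self-contained. The summary does not say how RIFS2D integrates multiple blocks, so that step is not shown.

```python
import random
from statistics import mean, variance


def t_score(values, labels):
    """Absolute Welch t-statistic of one feature between classes 0 and 1."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(X, y):
    """Feature indices sorted from the highest to the lowest t-score."""
    n_features = len(X[0])
    scores = [t_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier (stand-in model)."""
    correct = 0
    for i in range(len(X)):
        dist = {}
        for label in set(y):
            # Class centroid computed without the held-out sample i.
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == label]
            centroid = [mean(r[j] for r in rows) for j in features]
            dist[label] = sum((X[i][j] - c) ** 2 for j, c in zip(features, centroid))
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)


def best_block(X, y, ranking, start, max_len):
    """Grow a run of consecutively ranked features from `start`; keep the best one."""
    best_feats, best_acc = [], -1.0
    for end in range(start + 1, min(start + max_len, len(ranking)) + 1):
        feats = ranking[start:end]
        acc = loo_accuracy(X, y, feats)
        if acc > best_acc:
            best_feats, best_acc = feats, acc
    return best_feats, best_acc


def ifs(X, y, ranking, max_len):
    """IFS: the block always starts at the top-ranked feature."""
    return best_block(X, y, ranking, 0, max_len)


def rifs_block(X, y, ranking, max_len, rng):
    """One RIFS block: the same growth procedure, restarted at a random rank."""
    start = rng.randrange(len(ranking))
    return best_block(X, y, ranking, start, max_len)
```

Here `X` is a list of samples (each a list of feature values) and `y` holds 0/1 class labels, with at least three samples per class so that the leave-one-out folds and variances are defined. Calling `rifs_block` repeatedly with `rng = random.Random(seed)` yields candidate blocks that can be compared against the `ifs` result.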
ISSN: 0010-4825, 1879-0534
DOI: 10.1016/j.compbiomed.2021.104405