An integrated inverse space sparse representation framework for tumor classification

Bibliographic Details
Published in: Pattern Recognition 2019-09, Vol.93, p.293-311
Main Authors: Yang, Xiaohui, Wu, Wenming, Chen, Yunmei, Li, Xianqi, Zhang, Juan, Long, Dan, Yang, Lijun
Format: Article
Language: English
Description
Summary:
• An integrated inverse space sparse representation model is proposed for gene-based tumor classification.
• A gene selection method is proposed to improve the model's representation ability on small-sample problems.
• A feature representation learning method is proposed to enhance the model's representation ability and stability.
• The model is optimized and its convergence is analyzed.
• Extensive experiments are conducted on six microarray gene expression datasets covering early diagnosis, tumor type recognition, and postoperative metastasis.

Tumor classification based on microarray gene expression data is an active and challenging problem. This paper presents an integrated tumor classification framework that aims to exploit the information in the available samples and targets the small-sample and unbalanced classification problems. First, an inverse space sparse representation based classification (ISSRC) model is proposed that takes into account characteristics of gene-based tumor data, such as sparsity and a small number of training samples. A decision information factors (DIF)-based gene selection method is then constructed to enhance the representation ability of the ISSRC. Notably, the DIF is designed to reduce both the clinical misdiagnosis rate and the dimension of small-sample data. To further improve the representation ability and classification stability of the ISSRC, feature learning is performed on the selected gene subset; the feature learning method combines the complementary advantages of non-negative matrix factorization (NMF) and deep learning. The ISSRC combined with gene selection and feature learning is called the integrated ISSRC, and its stability, optimization, and convergence are analyzed. Extensive experiments on six public microarray gene expression datasets show that the integrated ISSRC-based framework outperforms classical and state-of-the-art methods, with significant improvements in classification accuracy, specificity, and sensitivity across three tasks: detecting whether a tumor is present at early diagnosis, identifying the tumor type, and determining whether metastasis occurs after tumor surgery.
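The summary reports results in terms of accuracy, specificity, and sensitivity. As a reminder of how these three measures are conventionally defined for a binary task (for example, tumor versus normal), here is a minimal sketch. The function name `binary_metrics`, the label encoding (1 = tumor, 0 = normal), and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels.

    Hypothetical helper: `positive` marks the class treated as "tumor".
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    nan = float("nan")
    total = tp + tn + fp + fn
    return {
        # Fraction of all samples classified correctly.
        "accuracy": (tp + tn) / total if total else nan,
        # True positive rate: fraction of tumor samples detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # True negative rate: fraction of normal samples cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Toy, made-up labels for an unbalanced set: 3 tumor, 5 normal.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

Reporting sensitivity and specificity alongside accuracy matters for unbalanced data, because accuracy alone can stay high even when most samples of the minority class are misclassified.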
ISSN: 0031-3203, 1873-5142
DOI: 10.1016/j.patcog.2019.04.013