Loading…
MSPL: Multimodal Self-Paced Learning for Multi-Omics Feature Selection and Data Integration
Rapid advances in high-throughput sequencing technology have led to the generation of a large number of multi-omics biological datasets. Integrating data from different omics provides an unprecedented opportunity to gain insight into disease mechanisms from different perspectives. However, integrati...
Saved in:
Published in: | IEEE access 2019, Vol.7, p.170513-170524 |
---|---|
Main Authors: | , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Rapid advances in high-throughput sequencing technology have led to the generation of a large number of multi-omics biological datasets. Integrating data from different omics provides an unprecedented opportunity to gain insight into disease mechanisms from different perspectives. However, integrative analysis and predictive modeling from multi-omics data are facing three major challenges: i) heavy noises; ii) the high dimensions compared to the small samples; iii) data heterogeneity. Current multi-omics data integration approaches have some limitations and are susceptible to heavy noise. In this paper, we present MSPL, a robust supervised multi-omics data integration method that simultaneously identifies significant multi-omics signatures during the integration process and predicts the cancer subtypes. The proposed method not only inherits the generalization performance of self-paced learning but also leverages the properties of multi-omics data containing correlated information to interactively recommend high-confidence samples for model training. We demonstrate the capabilities of MSPL using simulated data and five multi-omics biological datasets, integrating up three omics to identify potential biological signatures, and evaluating the performance compared to state-of-the-art methods in binary and multi-class classification problems. Our proposed model makes multi-omics data integration more systematic and expands its range of applications. |
---|---|
ISSN: | 2169-3536 2169-3536 |
DOI: | 10.1109/ACCESS.2019.2955958 |