Bagging survival tree procedure for variable selection and prediction in the presence of nonsusceptible patients

Bibliographic Details
Published in: BMC Bioinformatics 2016-06, Vol.17 (1), p.230, Article 230
Main Authors: Mbogning, Cyprien, Broët, Philippe
Format: Article
Language: English
Description
Summary: For clinical genomic studies with high-dimensional datasets, tree-based ensemble methods offer a powerful solution for variable selection and prediction that takes into account the complex interrelationships between explanatory variables. One of the key components of the tree-building process is the splitting criterion. For survival data, the classical splitting criterion is the Logrank statistic. However, the presence of a fraction of nonsusceptible patients in the studied population calls for a criterion tailored to this particular situation. We propose a bagging survival tree procedure for variable selection and prediction in which the survival tree-building process relies on a splitting criterion that explicitly focuses on the time-to-event survival distribution among susceptible patients. A simulation study shows that our method achieves good performance for variable selection and prediction. Different criteria for evaluating the importance of the explanatory variables and the prediction performance are reported. Our procedure is illustrated on a genomic dataset with gene expression measurements from early breast cancer patients. In the presence of nonsusceptible patients among the studied population, our procedure represents an efficient way to select event-related explanatory covariates with potential higher-order interactions and to identify homogeneous groups of susceptible patients.
ISSN:1471-2105
DOI:10.1186/s12859-016-1090-x