Assessment of evaluation criteria for survival prediction from genomic data

Bibliographic Details
Published in: Biometrical Journal 2011-03, Vol. 53 (2), p. 202-216
Main Authors: Bøvelstad, Hege M; Borgan, Ornulf
Format: Article
Language: English
Description
Summary: Survival prediction from high‐dimensional genomic data is dependent on a proper regularization method. With an increasing number of such methods proposed in the literature, comparative studies are called for and some have been performed. However, there is currently no consensus on which prediction assessment criterion should be used for time‐to‐event data. Without a firm knowledge about whether the choice of evaluation criterion may affect the conclusions made as to which regularization method performs best, these comparative studies may be of limited value. In this paper, four evaluation criteria are investigated: the log‐rank test for two groups, the area under the time‐dependent ROC curve (AUC), an R²‐measure based on the Cox partial likelihood, and an R²‐measure based on the Brier score. The criteria are compared according to how they rank six widely used regularization methods that are based on the Cox regression model, namely univariate selection, principal components regression (PCR), supervised PCR, partial least squares regression, ridge regression, and the lasso. Based on our application to three microarray gene expression data sets, we find that the results obtained from the widely used log‐rank test deviate from the other three criteria studied. For future studies, where one also might want to include non‐likelihood or non‐model‐based regularization methods, we argue in favor of AUC and the R²‐measure based on the Brier score, as these do not suffer from the arbitrary splitting into two groups nor depend on the Cox partial likelihood.
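As an illustration of the first criterion named in the summary, the following is a minimal sketch of a two-group log-rank test applied after dichotomizing a prognostic index. It is not the authors' code. The function names `logrank_two_groups` and `split_at_median`, the toy data, and the choice of the median as cut point are all assumptions made for this example; the summary only states that the grouping is arbitrary, which is the weakness the sketch is meant to make visible.

```python
import math
from statistics import median


def split_at_median(prognostic_index):
    """Assign each patient to the high-risk group if the index exceeds the median.

    The median cut point is an assumption of this sketch; any other
    threshold would define a different (equally arbitrary) grouping.
    """
    cut = median(prognostic_index)
    return [pi > cut for pi in prognostic_index]


def logrank_two_groups(times, events, in_high_group):
    """Return (chi-square statistic, p-value) of the two-group log-rank test.

    times         : observed follow-up times
    events        : 1 if the event was observed, 0 if censored
    in_high_group : True for patients in the high-risk group
    """
    observed = expected = variance = 0.0
    # Loop over the distinct times at which at least one event occurred.
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = [i for i, u in enumerate(times) if u >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_high_group[i])
        dying = [i for i in at_risk if times[i] == t and events[i]]
        d = len(dying)
        d1 = sum(1 for i in dying if in_high_group[i])
        observed += d1
        expected += d * n1 / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("both groups must contain patients at risk")
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Toy data (hypothetical): a higher index goes with earlier events.
index = [3.0, 2.5, 2.0, 0.5, 0.2, 0.1]
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 1, 1, 1]
chi2, p = logrank_two_groups(times, events, split_at_median(index))
```

Moving the cut point in `split_at_median` changes the group membership and therefore the test statistic, even though the underlying prognostic index is unchanged.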
ISSN: 0323-3847, 1521-4036
DOI: 10.1002/bimj.201000048