The Estimation of Prediction Error: Covariance Penalties and Cross-Validation

Bibliographic Details
Published in: Journal of the American Statistical Association, 2004-09, Vol. 99 (467), pp. 619-632
Main Author: Efron, Bradley
Format: Article
Language:English
Description
Summary: Having constructed a data-based estimation rule, perhaps a logistic regression or a classification tree, the statistician would like to know its performance as a predictor of future cases. There are two main theories concerning prediction error: (1) penalty methods such as C_p, Akaike's information criterion, and Stein's unbiased risk estimate that depend on the covariance between data points and their corresponding predictions; and (2) cross-validation and related nonparametric bootstrap techniques. This article concerns the connection between the two theories. A Rao-Blackwell type of relation is derived in which nonparametric methods such as cross-validation are seen to be randomized versions of their covariance penalty counterparts. The model-based penalty methods offer substantially better accuracy, assuming that the model is believable.
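The abstract contrasts two estimates of prediction error. As a minimal illustrative sketch (not taken from the article), the snippet below computes both for ordinary least squares with a known noise level sigma: the C_p-style covariance penalty, which for a linear rule mu_hat = H y reduces to training error plus 2*sigma^2*trace(H)/n, and leave-one-out cross-validation via the standard hat-matrix shortcut. The toy data and all variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: n observations, p predictors, known noise level sigma (assumed).
n, p, sigma = 100, 5, 1.0
X = rng.normal(size=(n, p))
beta = rng.normal(size=p)
y = X @ beta + sigma * rng.normal(size=n)

# OLS is a linear smoother: mu_hat = H y with H = X (X'X)^{-1} X'.
H = X @ np.linalg.solve(X.T @ X, X.T)
mu_hat = H @ y
resid = y - mu_hat

# (1) Covariance penalty (C_p / SURE form): for a linear rule,
# sum_i cov(y_i, mu_hat_i) = sigma^2 * trace(H), so the estimate is
# training error + (2 * sigma^2 / n) * trace(H).
cp_estimate = np.mean(resid**2) + 2 * sigma**2 * np.trace(H) / n

# (2) Leave-one-out cross-validation, using the hat-matrix shortcut
# e_i / (1 - h_ii), which is exact for least squares fits.
h = np.diag(H)
loo_estimate = np.mean((resid / (1 - h)) ** 2)

print(f"covariance penalty (C_p): {cp_estimate:.3f}")
print(f"leave-one-out CV:         {loo_estimate:.3f}")
```

The two numbers estimate the same quantity; the article's Rao-Blackwell argument explains why the covariance-penalty version tends to be less variable when the model is believable.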
ISSN: 0162-1459; 1537-274X
DOI: 10.1198/016214504000000692