Random forest classification of etiologies for an orphan disease

Bibliographic Details
Published in: Statistics in Medicine, 2015-02, Vol. 34 (5), p. 887-899
Main Authors: Speiser, Jaime Lynn; Durkalski, Valerie L.; Lee, William M.
Format: Article
Language: English
Description
Summary: Classification of objects into pre‐defined groups based on known information is a fundamental problem in the field of statistics. Although approaches for solving this problem exist, finding an accurate classification method can be challenging in an orphan disease setting, where data are minimal and often not normally distributed. The purpose of this paper is to illustrate the application of the random forest (RF) classification procedure in a real clinical setting and discuss typical questions that arise in the general classification framework as well as offer interpretations of RF results. This paper includes methods for assessing predictive performance, importance of predictor variables, and observation‐specific information. Copyright © 2014 John Wiley & Sons, Ltd.
ISSN: 0277-6715, 1097-0258
DOI: 10.1002/sim.6351