Random forest classification of etiologies for an orphan disease
Classification of objects into pre‐defined groups based on known information is a fundamental problem in the field of statistics. Although approaches for solving this problem exist, finding an accurate classification method can be challenging in an orphan disease setting, where data are minimal and...
Published in: | Statistics in medicine 2015-02, Vol.34 (5), p.887-899 |
---|---|
Main Authors: | Speiser, Jaime Lynn ; Durkalski, Valerie L. ; Lee, William M. |
Format: | Article |
Language: | English |
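Subjects: | acute liver failure ; Algorithms ; Biostatistics ; Classification ; Decision Trees ; etiology ; Humans ; Liver Failure, Acute - classification ; Liver Failure, Acute - etiology ; Machine Learning ; Medical statistics ; Models, Statistical ; Normal distribution ; random forest ; Rare Diseases - classification ; Rare Diseases - etiology ; Registries - statistics & numerical data ; statistical classification |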
cited_by | cdi_FETCH-LOGICAL-c5421-9681a73a51fad56ce45a7ad5d9fa51ca05235b5ae41365887e31e43c232806c3 |
---|---|
cites | cdi_FETCH-LOGICAL-c5421-9681a73a51fad56ce45a7ad5d9fa51ca05235b5ae41365887e31e43c232806c3 |
container_end_page | 899 |
container_issue | 5 |
container_start_page | 887 |
container_title | Statistics in medicine |
container_volume | 34 |
creator | Speiser, Jaime Lynn ; Durkalski, Valerie L. ; Lee, William M. |
description | Classification of objects into pre‐defined groups based on known information is a fundamental problem in the field of statistics. Although approaches for solving this problem exist, finding an accurate classification method can be challenging in an orphan disease setting, where data are minimal and often not normally distributed. The purpose of this paper is to illustrate the application of the random forest (RF) classification procedure in a real clinical setting and discuss typical questions that arise in the general classification framework as well as offer interpretations of RF results. This paper includes methods for assessing predictive performance, importance of predictor variables, and observation‐specific information. Copyright © 2014 John Wiley & Sons, Ltd. |
doi_str_mv | 10.1002/sim.6351 |
format | article |
publisher | England: Blackwell Publishing Ltd |
pmid | 25366667 |
eissn | 1097-0258 |
coden | SMEDDA |
tpages | 13 |
oa | free_for_read |
fulltext | fulltext |
identifier | ISSN: 0277-6715 |
ispartof | Statistics in medicine, 2015-02, Vol.34 (5), p.887-899 |
issn | 0277-6715 1097-0258 |
language | eng |
recordid | cdi_pubmedcentral_primary_oai_pubmedcentral_nih_gov_4310784 |
source | Wiley-Blackwell Read & Publish Collection |
subjects | acute liver failure ; Algorithms ; Biostatistics ; Classification ; Decision Trees ; etiology ; Humans ; Liver Failure, Acute - classification ; Liver Failure, Acute - etiology ; Machine Learning ; Medical statistics ; Models, Statistical ; Normal distribution ; random forest ; Rare Diseases - classification ; Rare Diseases - etiology ; Registries - statistics & numerical data ; statistical classification |
title | Random forest classification of etiologies for an orphan disease |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-06T15%3A38%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Random%20forest%20classification%20of%20etiologies%20for%20an%20orphan%20disease&rft.jtitle=Statistics%20in%20medicine&rft.au=Speiser,%20Jaime%20Lynn&rft.date=2015-02-28&rft.volume=34&rft.issue=5&rft.spage=887&rft.epage=899&rft.pages=887-899&rft.issn=0277-6715&rft.eissn=1097-0258&rft.coden=SMEDDA&rft_id=info:doi/10.1002/sim.6351&rft_dat=%3Cproquest_pubme%3E3615255631%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c5421-9681a73a51fad56ce45a7ad5d9fa51ca05235b5ae41365887e31e43c232806c3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1660768366&rft_id=info:pmid/25366667&rfr_iscdi=true |