
Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient

Developing clinical prediction models (CPMs) on data of sufficient sample size is critical to help minimize overfitting. Using prostate cancer as a clinical exemplar, we aimed to investigate to what extent existing CPMs adhere to recent formal sample size criteria, or historic rules of thumb of even...

Bibliographic Details
Published in: Journal of clinical epidemiology 2021-05, Vol.133, p.53-60
Main Authors: Collins, Shane D., Peek, Niels, Riley, Richard D., Martin, Glen P.
Format: Article
Language: English
cited_by cdi_FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3
cites cdi_FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3
container_end_page 60
container_issue
container_start_page 53
container_title Journal of clinical epidemiology
container_volume 133
creator Collins, Shane D.
Peek, Niels
Riley, Richard D.
Martin, Glen P.
description Developing clinical prediction models (CPMs) on data of sufficient sample size is critical to help minimize overfitting. Using prostate cancer as a clinical exemplar, we aimed to investigate to what extent existing CPMs adhere to recent formal sample size criteria, or to the historic rule of thumb of events per predictor parameter (EPP) ≥ 10. We conducted a systematic review to identify CPMs related to prostate cancer that provided enough information to calculate the minimum sample size. We compared the reported sample size of each CPM against the traditional 10 EPP rule of thumb and against formal sample size criteria. In total, 211 CPMs were included. Three of the studies justified the sample size used, mostly using EPP rules of thumb. Overall, 69% of the CPMs were derived on sample sizes that surpassed the traditional EPP ≥ 10 rule of thumb, but only 48% surpassed recent formal sample size criteria. For most CPMs, the sample size required by the formal criteria was higher than the sample size needed to surpass 10 EPP. Few of the CPMs included in this study justified their sample size, and most of the justifications given were based on EPP. This study shows that, in real-world data sets, adhering to the classic EPP rules of thumb is insufficient to adhere to recent formal sample size criteria.
• Sample size justification of 211 CPMs related to prostate cancer was lacking.
• An EPP ≥ 10 is not necessarily sufficient to guide sample size for CPM development.
• Sample sizes for CPM development should follow formal sample size criteria.
• We recommend sample size justification in all future prediction model studies.
doi_str_mv 10.1016/j.jclinepi.2020.12.011
format article
fullrecord <record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_miscellaneous_2474501432</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><els_id>S0895435620312154</els_id><sourcerecordid>2474501432</sourcerecordid><originalsourceid>FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3</originalsourceid><addsrcrecordid>eNqFkU9rHSEUxaW0NC9pv0IQuulmXvw3o-5SQtoEAlm0XYtPr9RhxnlRpyX99PHxki66CQiC93fuvZ6D0DklW0rocDFuRzfFBPu4ZYS1R7YllL5BG6qk6nrN6Fu0IUr3neD9cIJOSxkJoZLI_j064ZwrTpnaoF_f7byfAJf4FwpeAt5n8NHVuCQ8Lx4mXOrqY6vF1GpLqbYCdjY5yPgPZMDZZpge8biWGkMEj23yrVGF1CRlDSG6CKl-QO-CnQp8fL7P0M-v1z-ubrq7-2-3V1_uOieEqJ1lYSA8tBMUEVZLRaUmVhNJvVVaSkall6C4sk4ztXNCMkl23DFq5SACP0Ofj33bsg8rlGrmWBxMk02wrMUwIUVPqOCsoZ_-Q8dlzaltZ1jPtGK9FLpRw5Fy7fclQzD7HGebHw0l5pCFGc1LFuaQhaHMtCya8Py5_bqbwf-TvZjfgMsjAM2P3xGyKQerXAsgg6vGL_G1GU_PYZ39</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2529825749</pqid></control><display><type>article</type><title>Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient</title><source>Elsevier</source><creator>Collins, Shane D. ; Peek, Niels ; Riley, Richard D. ; Martin, Glen P.</creator><creatorcontrib>Collins, Shane D. ; Peek, Niels ; Riley, Richard D. ; Martin, Glen P.</creatorcontrib><description>Developing clinical prediction models (CPMs) on data of sufficient sample size is critical to help minimize overfitting. Using prostate cancer as a clinical exemplar, we aimed to investigate to what extent existing CPMs adhere to recent formal sample size criteria, or historic rules of thumb of events per predictor parameter (EPP)≥10. A systematic review to identify CPMs related to prostate cancer, which provided enough information to calculate minimum sample size. 
We compared the reported sample size of each CPM against the traditional 10 EPP rule of thumb and formal sample size criteria. About 211 CPMs were included. Three of the studies justified the sample size used, mostly using EPP rules of thumb. Overall, 69% of the CPMs were derived on sample sizes that surpassed the traditional EPP≥10 rule of thumb, but only 48% surpassed recent formal sample size criteria. For most CPMs, the required sample size based on formal criteria was higher than the sample sizes to surpass 10 EPP. Few of the CPMs included in this study justified their sample size, with most justifications being based on EPP. This study shows that, in real-world data sets, adhering to the classic EPP rules of thumb is insufficient to adhere to recent formal sample size criteria. •Sample Size justification of 211 CPMs related to prostate cancer was lacking.•An EPP≥10 is not necessarily sufficient to guide sample size for CPM development.•Sample sizes for CPM development should follow formal sample size criteria.•We recommend sample size justification in all future prediction model studies.</description><identifier>ISSN: 0895-4356</identifier><identifier>EISSN: 1878-5921</identifier><identifier>DOI: 10.1016/j.jclinepi.2020.12.011</identifier><identifier>PMID: 33383128</identifier><language>eng</language><publisher>United States: Elsevier Inc</publisher><subject>Criteria ; Design parameters ; Development ; Epidemiology ; Medical prognosis ; Parameter identification ; Prediction models ; Prostate cancer ; Sample size ; Systematic review ; Validation</subject><ispartof>Journal of clinical epidemiology, 2021-05, Vol.133, p.53-60</ispartof><rights>2020 The Author(s)</rights><rights>Copyright © 2020 The Author(s). Published by Elsevier Inc. All rights reserved.</rights><rights>2020. 
The Author(s)</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3</citedby><cites>FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3</cites><orcidid>0000-0002-6393-9969 ; 0000-0002-3410-9472 ; 0000-0002-0950-741X</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids><backlink>$$Uhttps://www.ncbi.nlm.nih.gov/pubmed/33383128$$D View this record in MEDLINE/PubMed$$Hfree_for_read</backlink></links><search><creatorcontrib>Collins, Shane D.</creatorcontrib><creatorcontrib>Peek, Niels</creatorcontrib><creatorcontrib>Riley, Richard D.</creatorcontrib><creatorcontrib>Martin, Glen P.</creatorcontrib><title>Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient</title><title>Journal of clinical epidemiology</title><addtitle>J Clin Epidemiol</addtitle><description>Developing clinical prediction models (CPMs) on data of sufficient sample size is critical to help minimize overfitting. Using prostate cancer as a clinical exemplar, we aimed to investigate to what extent existing CPMs adhere to recent formal sample size criteria, or historic rules of thumb of events per predictor parameter (EPP)≥10. A systematic review to identify CPMs related to prostate cancer, which provided enough information to calculate minimum sample size. We compared the reported sample size of each CPM against the traditional 10 EPP rule of thumb and formal sample size criteria. About 211 CPMs were included. Three of the studies justified the sample size used, mostly using EPP rules of thumb. 
Overall, 69% of the CPMs were derived on sample sizes that surpassed the traditional EPP≥10 rule of thumb, but only 48% surpassed recent formal sample size criteria. For most CPMs, the required sample size based on formal criteria was higher than the sample sizes to surpass 10 EPP. Few of the CPMs included in this study justified their sample size, with most justifications being based on EPP. This study shows that, in real-world data sets, adhering to the classic EPP rules of thumb is insufficient to adhere to recent formal sample size criteria. •Sample Size justification of 211 CPMs related to prostate cancer was lacking.•An EPP≥10 is not necessarily sufficient to guide sample size for CPM development.•Sample sizes for CPM development should follow formal sample size criteria.•We recommend sample size justification in all future prediction model studies.</description><subject>Criteria</subject><subject>Design parameters</subject><subject>Development</subject><subject>Epidemiology</subject><subject>Medical prognosis</subject><subject>Parameter identification</subject><subject>Prediction models</subject><subject>Prostate cancer</subject><subject>Sample size</subject><subject>Systematic review</subject><subject>Validation</subject><issn>0895-4356</issn><issn>1878-5921</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2021</creationdate><recordtype>article</recordtype><recordid>eNqFkU9rHSEUxaW0NC9pv0IQuulmXvw3o-5SQtoEAlm0XYtPr9RhxnlRpyX99PHxki66CQiC93fuvZ6D0DklW0rocDFuRzfFBPu4ZYS1R7YllL5BG6qk6nrN6Fu0IUr3neD9cIJOSxkJoZLI_j064ZwrTpnaoF_f7byfAJf4FwpeAt5n8NHVuCQ8Lx4mXOrqY6vF1GpLqbYCdjY5yPgPZMDZZpge8biWGkMEj23yrVGF1CRlDSG6CKl-QO-CnQp8fL7P0M-v1z-ubrq7-2-3V1_uOieEqJ1lYSA8tBMUEVZLRaUmVhNJvVVaSkall6C4sk4ztXNCMkl23DFq5SACP0Ofj33bsg8rlGrmWBxMk02wrMUwIUVPqOCsoZ_-Q8dlzaltZ1jPtGK9FLpRw5Fy7fclQzD7HGebHw0l5pCFGc1LFuaQhaHMtCya8Py5_bqbwf-TvZjfgMsjAM2P3xGyKQerXAsgg6vGL_G1GU_PYZ39</recordid><startdate>20210501</startdate><enddate>20210501</enddate><creator>Collins, 
Shane D.</creator><creator>Peek, Niels</creator><creator>Riley, Richard D.</creator><creator>Martin, Glen P.</creator><general>Elsevier Inc</general><general>Elsevier Limited</general><scope>6I.</scope><scope>AAFTH</scope><scope>NPM</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>3V.</scope><scope>7QL</scope><scope>7QP</scope><scope>7RV</scope><scope>7T2</scope><scope>7T7</scope><scope>7TK</scope><scope>7U7</scope><scope>7U9</scope><scope>7X7</scope><scope>7XB</scope><scope>88C</scope><scope>88E</scope><scope>8AO</scope><scope>8C1</scope><scope>8FD</scope><scope>8FI</scope><scope>8FJ</scope><scope>8FK</scope><scope>8G5</scope><scope>ABUWG</scope><scope>AFKRA</scope><scope>AZQEC</scope><scope>BENPR</scope><scope>C1K</scope><scope>CCPQU</scope><scope>DWQXO</scope><scope>FR3</scope><scope>FYUFA</scope><scope>GHDGH</scope><scope>GNUQQ</scope><scope>GUQSH</scope><scope>H94</scope><scope>K9.</scope><scope>KB0</scope><scope>M0S</scope><scope>M0T</scope><scope>M1P</scope><scope>M2O</scope><scope>M7N</scope><scope>MBDVC</scope><scope>NAPCQ</scope><scope>P64</scope><scope>PQEST</scope><scope>PQQKQ</scope><scope>PQUKI</scope><scope>Q9U</scope><scope>7X8</scope><orcidid>https://orcid.org/0000-0002-6393-9969</orcidid><orcidid>https://orcid.org/0000-0002-3410-9472</orcidid><orcidid>https://orcid.org/0000-0002-0950-741X</orcidid></search><sort><creationdate>20210501</creationdate><title>Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient</title><author>Collins, Shane D. ; Peek, Niels ; Riley, Richard D. 
; Martin, Glen P.</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2021</creationdate><topic>Criteria</topic><topic>Design parameters</topic><topic>Development</topic><topic>Epidemiology</topic><topic>Medical prognosis</topic><topic>Parameter identification</topic><topic>Prediction models</topic><topic>Prostate cancer</topic><topic>Sample size</topic><topic>Systematic review</topic><topic>Validation</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Collins, Shane D.</creatorcontrib><creatorcontrib>Peek, Niels</creatorcontrib><creatorcontrib>Riley, Richard D.</creatorcontrib><creatorcontrib>Martin, Glen P.</creatorcontrib><collection>ScienceDirect Open Access Titles</collection><collection>Elsevier:ScienceDirect:Open Access</collection><collection>PubMed</collection><collection>CrossRef</collection><collection>ProQuest Central (Corporate)</collection><collection>Bacteriology Abstracts (Microbiology B)</collection><collection>Calcium &amp; Calcified Tissue Abstracts</collection><collection>Nursing &amp; Allied Health Database</collection><collection>Health and Safety Science Abstracts (Full archive)</collection><collection>Industrial and Applied Microbiology Abstracts (Microbiology A)</collection><collection>Neurosciences Abstracts</collection><collection>Toxicology Abstracts</collection><collection>Virology and AIDS Abstracts</collection><collection>ProQuest - Health &amp; Medical Complete保健、医学与药学数据库</collection><collection>ProQuest Central (purchase pre-March 2016)</collection><collection>Healthcare Administration Database (Alumni)</collection><collection>Medical Database (Alumni Edition)</collection><collection>ProQuest Pharma Collection</collection><collection>Public Health 
Database</collection><collection>Technology Research Database</collection><collection>Hospital Premium Collection</collection><collection>Hospital Premium Collection (Alumni Edition)</collection><collection>ProQuest Central (Alumni) (purchase pre-March 2016)</collection><collection>Research Library (Alumni Edition)</collection><collection>ProQuest Central (Alumni)</collection><collection>ProQuest Central</collection><collection>ProQuest Central Essentials</collection><collection>ProQuest Central</collection><collection>Environmental Sciences and Pollution Management</collection><collection>ProQuest One Community College</collection><collection>ProQuest Central</collection><collection>Engineering Research Database</collection><collection>Health Research Premium Collection</collection><collection>Health Research Premium Collection (Alumni)</collection><collection>ProQuest Central Student</collection><collection>Research Library Prep</collection><collection>AIDS and Cancer Research Abstracts</collection><collection>ProQuest Health &amp; Medical Complete (Alumni)</collection><collection>Nursing &amp; Allied Health Database (Alumni Edition)</collection><collection>Health &amp; Medical Collection (Alumni Edition)</collection><collection>ProQuest Healthcare Administration Database</collection><collection>PML(ProQuest Medical Library)</collection><collection>ProQuest research library</collection><collection>Algology Mycology and Protozoology Abstracts (Microbiology C)</collection><collection>Research Library (Corporate)</collection><collection>Nursing &amp; Allied Health Premium</collection><collection>Biotechnology and BioEngineering Abstracts</collection><collection>ProQuest One Academic Eastern Edition (DO NOT USE)</collection><collection>ProQuest One Academic</collection><collection>ProQuest One Academic UKI Edition</collection><collection>ProQuest Central Basic</collection><collection>MEDLINE - Academic</collection><jtitle>Journal of clinical 
epidemiology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Collins, Shane D.</au><au>Peek, Niels</au><au>Riley, Richard D.</au><au>Martin, Glen P.</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient</atitle><jtitle>Journal of clinical epidemiology</jtitle><addtitle>J Clin Epidemiol</addtitle><date>2021-05-01</date><risdate>2021</risdate><volume>133</volume><spage>53</spage><epage>60</epage><pages>53-60</pages><issn>0895-4356</issn><eissn>1878-5921</eissn><abstract>Developing clinical prediction models (CPMs) on data of sufficient sample size is critical to help minimize overfitting. Using prostate cancer as a clinical exemplar, we aimed to investigate to what extent existing CPMs adhere to recent formal sample size criteria, or historic rules of thumb of events per predictor parameter (EPP)≥10. A systematic review to identify CPMs related to prostate cancer, which provided enough information to calculate minimum sample size. We compared the reported sample size of each CPM against the traditional 10 EPP rule of thumb and formal sample size criteria. About 211 CPMs were included. Three of the studies justified the sample size used, mostly using EPP rules of thumb. Overall, 69% of the CPMs were derived on sample sizes that surpassed the traditional EPP≥10 rule of thumb, but only 48% surpassed recent formal sample size criteria. For most CPMs, the required sample size based on formal criteria was higher than the sample sizes to surpass 10 EPP. Few of the CPMs included in this study justified their sample size, with most justifications being based on EPP. This study shows that, in real-world data sets, adhering to the classic EPP rules of thumb is insufficient to adhere to recent formal sample size criteria. 
•Sample Size justification of 211 CPMs related to prostate cancer was lacking.•An EPP≥10 is not necessarily sufficient to guide sample size for CPM development.•Sample sizes for CPM development should follow formal sample size criteria.•We recommend sample size justification in all future prediction model studies.</abstract><cop>United States</cop><pub>Elsevier Inc</pub><pmid>33383128</pmid><doi>10.1016/j.jclinepi.2020.12.011</doi><tpages>8</tpages><orcidid>https://orcid.org/0000-0002-6393-9969</orcidid><orcidid>https://orcid.org/0000-0002-3410-9472</orcidid><orcidid>https://orcid.org/0000-0002-0950-741X</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 0895-4356
ispartof Journal of clinical epidemiology, 2021-05, Vol.133, p.53-60
issn 0895-4356
1878-5921
language eng
recordid cdi_proquest_miscellaneous_2474501432
source Elsevier
subjects Criteria
Design parameters
Development
Epidemiology
Medical prognosis
Parameter identification
Prediction models
Prostate cancer
Sample size
Systematic review
Validation
title Sample sizes of prediction model studies in prostate cancer were rarely justified and often insufficient
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-03T09%3A02%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Sample%20sizes%20of%20prediction%20model%20studies%20in%20prostate%20cancer%20were%20rarely%20justified%20and%20often%20insufficient&rft.jtitle=Journal%20of%20clinical%20epidemiology&rft.au=Collins,%20Shane%20D.&rft.date=2021-05-01&rft.volume=133&rft.spage=53&rft.epage=60&rft.pages=53-60&rft.issn=0895-4356&rft.eissn=1878-5921&rft_id=info:doi/10.1016/j.jclinepi.2020.12.011&rft_dat=%3Cproquest_cross%3E2474501432%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c444t-a2f603f03ff804a9781790a9071da8977217d7e838ac928bc47270b3c21a764f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2529825749&rft_id=info:pmid/33383128&rfr_iscdi=true