Variable selection methods were poorly reported but rarely misused in major medical journals: Literature review

Bibliographic Details
Published in: Journal of Clinical Epidemiology, 2021-11, Vol. 139, p. 12-19
Main Authors: Pressat-Laffouilhère, T., Jouffroy, R., Leguillou, A., Kerdelhue, G., Benichou, J., Gillibert, A.
Format: Article
Language: English
Description
Summary:
•In the “big five”.
•Reporting about variable selection methods is insufficient.
•Data-driven methods are not commonly used in causal explanatory models.
•The addition of an adjustment variable is common in sensitivity analyses.

Objective: This work reviews the literature on the reporting, practice and misuse of knowledge-based and data-driven variable selection methods in five highly cited medical journals. Unlike previous reviews, it also considers variable recoding and interactions.

Study Design and Setting: Original observational studies with a predictive or explanatory research question and multivariable analyses, published in N. Engl. J. Med., Lancet, JAMA, Br. Med. J. and Ann. Intern. Med. between 2017 and 2019, were searched. Article screening was performed by a single reader. Data extraction was performed by two readers, with a third reader resolving disagreements. The use of data-driven variable selection methods for causal explanatory questions was considered misuse.

Results: 488 articles were included. The variable selection method was unclear in 234 (48%) articles, data-driven in 78 (16%) articles and knowledge-based in 176 (36%) articles. The most common data-driven methods were univariate selection (n = 22, 4.5%) and model comparisons or testing for interaction (n = 17, 3.5%). Data-driven methods were misused in 51 (10.5%) articles.

Conclusion: Overall, reporting of variable selection methods is insufficient. Data-driven methods seem to be used in only a minority of articles in the big five medical journals.
ISSN: 0895-4356, 1878-5921
DOI: 10.1016/j.jclinepi.2021.07.006