Decisions About Equivalence: A Comparison of TOST, HDI-ROPE, and the Bayes Factor

Bibliographic Details
Published in: Psychological Methods, 2023-06, Vol. 28 (3), p. 740-755
Main Authors: Linde, Maximilian; Tendeiro, Jorge N.; Selker, Ravi; Wagenmakers, Eric-Jan; van Ravenzwaaij, Don
Format: Article
Language: English
Description
Summary: Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size.
Translational Abstract: In many areas of research, it is important to be able to quantify evidence that two groups are practically equivalent on some measure. We explain and illustrate three special statistical procedures that allow finding evidence that two groups are practically equivalent: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. In a simulation study, we compare the three procedures in terms of their classification performance: How often does each method conclude equivalence when the groups are equivalent, and how often does each method conclude equivalence when the groups are not equivalent? The results indicate that the Bayes factor interval null procedure is better at discriminating between equivalence and nonequivalence than the other two procedures. This advantage is particularly noticeable for relatively small sample sizes and relatively narrow
ISSN: 1082-989X, 1939-1463
DOI: 10.1037/met0000402
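
Illustration: The two one-sided tests procedure named in the summary can be sketched roughly as follows. This is a minimal sketch and not the authors' implementation. The function name `tost_equivalence`, its parameters, and the default `alpha` are hypothetical. The margin is assumed to be expressed in standardized (Cohen's d) units, and a large-sample normal approximation is swapped in for the t distribution so that the example needs only the Python standard library.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(x, y, margin, alpha=0.05):
    """Two one-sided tests for equivalence of two independent group means.

    `margin` is the equivalence margin in standardized units (assumption).
    The test statistics are referred to a standard normal distribution,
    which is an approximation; a t distribution is usual for small samples.
    Returns (p_value, equivalent).
    """
    n1, n2 = len(x), len(y)
    # Pooled standard deviation of the two groups
    sp = sqrt(((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2))
    se = sp * sqrt(1 / n1 + 1 / n2)  # standard error of the mean difference
    diff = mean(x) - mean(y)
    bound = margin * sp  # margin converted to raw units
    z = NormalDist()
    # First test: null hypothesis that diff <= -bound
    p_lower = 1 - z.cdf((diff + bound) / se)
    # Second test: null hypothesis that diff >= +bound
    p_upper = z.cdf((diff - bound) / se)
    # Equivalence is concluded only if both one-sided tests reject
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Deterministic toy data (not from the article)
large = [i % 10 for i in range(1000)]
small = list(range(10))

p_large, eq_large = tost_equivalence(large, large, margin=0.2)
p_small, eq_small = tost_equivalence(small, small, margin=0.2)
p_shift, eq_shift = tost_equivalence(large, [v + 5 for v in large], margin=0.2)
```

In this toy setup, two identical samples of 1,000 cases are declared equivalent, whereas two identical samples of 10 cases are not, because the standard error is too large relative to the margin. That pattern is in line with the summary's remark that the procedure has limited discrimination capability at small sample sizes, although the toy data say nothing about the specific sample sizes reported there.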