Decisions About Equivalence: A Comparison of TOST, HDI-ROPE, and the Bayes Factor
Published in: | Psychological methods 2023-06, Vol.28 (3), p.740-755 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Some important research questions require the ability to find evidence for two conditions being practically equivalent. This is impossible to accomplish within the traditional frequentist null hypothesis significance testing framework; hence, other methodologies must be utilized. We explain and illustrate three approaches for finding evidence for equivalence: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. We compare the classification performances of these three approaches for various plausible scenarios. The results indicate that the Bayes factor interval null approach compares favorably to the other two approaches in terms of statistical power. Critically, compared with the Bayes factor interval null procedure, the two one-sided tests and the highest density interval region of practical equivalence procedures have limited discrimination capabilities when the sample size is relatively small: Specifically, in order to be practically useful, these two methods generally require over 250 cases within each condition when rather large equivalence margins of approximately .2 or .3 are used; for smaller equivalence margins even more cases are required. Because of these results, we recommend that researchers rely more on the Bayes factor interval null approach for quantifying evidence for equivalence, especially for studies that are constrained on sample size.
Translational Abstract: In many areas of research, it is important to be able to quantify evidence that two groups are practically equivalent on some measure. We explain and illustrate three special statistical procedures that allow finding evidence that two groups are practically equivalent: The frequentist two one-sided tests procedure, the Bayesian highest density interval region of practical equivalence procedure, and the Bayes factor interval null procedure. In a simulation study, we compare the three procedures in terms of their classification performance: How often does each method conclude equivalence when the groups are equivalent, and how often does each method conclude equivalence when the groups are not equivalent? The results indicate that the Bayes factor interval null procedure is better at discriminating between equivalence and nonequivalence than the other two procedures. This advantage is particularly noticeable for relatively small sample sizes and relatively narrow |
---|---|
ISSN: | 1082-989X 1939-1463 |
DOI: | 10.1037/met0000402 |
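
The summary names the two one-sided tests (TOST) procedure without showing its mechanics. The following is a minimal sketch of the idea, not the authors' code: it assumes two independent groups, uses a normal (z) approximation in place of the t distribution, and treats the equivalence margin as a raw mean difference. The function name `tost_z` and all of its parameters are hypothetical and chosen for illustration only.

```python
import math
from statistics import NormalDist


def tost_z(mean1, sd1, n1, mean2, sd2, n2, margin, alpha=0.05):
    """Two one-sided tests for equivalence of two independent means.

    Assumption: a large-sample normal approximation replaces the
    t distribution. `margin` is the equivalence margin for the raw
    difference mean1 - mean2; the equivalence region is (-margin, +margin).
    Returns (p_value, equivalent).
    """
    diff = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    phi = NormalDist().cdf

    # Test 1: H0 is diff <= -margin; reject when diff is well above -margin.
    p_lower = 1.0 - phi((diff + margin) / se)
    # Test 2: H0 is diff >= +margin; reject when diff is well below +margin.
    p_upper = phi((diff - margin) / se)

    # Equivalence is declared only if BOTH one-sided tests reject,
    # so the overall p-value is the larger of the two.
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Identical sample means, unit SDs, margin 0.3: only the sample size differs.
p_small, eq_small = tost_z(0.0, 1.0, 20, 0.0, 1.0, 20, margin=0.3)
p_large, eq_large = tost_z(0.0, 1.0, 250, 0.0, 1.0, 250, margin=0.3)
print(f"n=20 per group:  p={p_small:.4f}, equivalent={eq_small}")
print(f"n=250 per group: p={p_large:.4f}, equivalent={eq_large}")
```

In this sketch, an observed difference of exactly zero is not declared equivalent with 20 cases per group, but is with 250 per group. This is consistent with the summary's point that TOST needs large samples to be practically useful. The specific numbers here are illustrative inputs, not results from the article.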