
YEAST: Yet Another Sequential Test


Bibliographic Details
Published in: arXiv.org 2024-06
Main Authors: Kurennoy, Alexey; Dodin, Majed; Gurbanov, Tural; Ramallo, Ana Peleteiro
Format: Article
Language: English
Description
Summary: Large-scale randomised experiments have become a standard tool for developing products and improving user experience. To reduce losses from shipping harmful changes, experimental results are, in practice, often checked repeatedly, which inflates false alarm rates. To alleviate this problem, one can use sequential testing techniques, as they control false discovery rates despite repeated checks. While multiple sequential testing methods exist in the literature, they either restrict the number of interim checks the experimenter can perform or have tuning parameters that require calibration. In this paper, we propose a novel sequential testing method that does not limit the number of interim checks and, at the same time, has no tuning parameters. The proposed method is new and does not stem from existing experiment monitoring procedures. It controls false discovery rates by "inverting" a bound on the threshold crossing probability derived from a classical maximal inequality. We demonstrate, both in simulations and using real-world data, that the proposed method outperforms current state-of-the-art sequential tests for continuous test monitoring. In addition, we illustrate the method's effectiveness with a real-world application on a major online fashion platform.
ISSN: 2331-8422
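
Illustration: the summary's remark that repeated checks inflate false alarm rates can be seen in a small simulation. The sketch below is not the YEAST procedure and is not taken from the paper. It assumes an A/A experiment (no true effect) with unit-variance observations and a fixed two-sided 5% z-test applied at every interim check. The function name `false_alarm_rate` and all of its parameters are hypothetical and chosen only for this example.

```python
import math
import random


def false_alarm_rate(n_checks, n_per_check=100, n_sims=2000, z_crit=1.96, seed=0):
    """Estimate how often a fixed-threshold z-test raises an alarm at least once.

    Assumption: the null hypothesis holds, so each batch of `n_per_check`
    unit-variance observations has a sum distributed as N(0, n_per_check).
    """
    rng = random.Random(seed)
    step_sd = math.sqrt(n_per_check)
    alarms = 0
    for _ in range(n_sims):
        total, n = 0.0, 0
        for _ in range(n_checks):
            total += rng.gauss(0.0, step_sd)  # sum of the new batch
            n += n_per_check
            z = total / math.sqrt(n)          # z-statistic at this interim check
            if abs(z) > z_crit:               # naive threshold, no correction
                alarms += 1
                break
    return alarms / n_sims


single = false_alarm_rate(n_checks=1)     # one look at the end
repeated = false_alarm_rate(n_checks=20)  # twenty interim looks
print(f"1 check: {single:.3f}   20 checks: {repeated:.3f}")
```

With a single check the estimate should sit near the nominal 5% level, while with twenty checks it is several times larger. This gap is what sequential tests are designed to close.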