Loading…

PAC Mode Estimation using PPR Martingale Confidence Sequences

We consider the problem of correctly identifying the \textit{mode} of a discrete distribution \(\mathcal{P}\) with sufficiently high probability by observing a sequence of i.i.d. samples drawn from \(\mathcal{P}\). This problem reduces to the estimation of a single parameter when \(\mathcal{P}\) has...

Full description

Saved in:
Bibliographic Details
Published in:arXiv.org 2022-04
Main Authors: Shubham Anand Jain, Shah, Rohan, Gupta, Sanit, Mehta, Denil, Inderjeet Jayakumar Nair, Vora, Jian, Khyalia, Sushil, Das, Sourav, Ribeiro, Vinay J, Kalyanakrishnan, Shivaram
Format: Article
Language:English
Subjects:
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We consider the problem of correctly identifying the \textit{mode} of a discrete distribution \(\mathcal{P}\) with sufficiently high probability by observing a sequence of i.i.d. samples drawn from \(\mathcal{P}\). This problem reduces to the estimation of a single parameter when \(\mathcal{P}\) has a support set of size \(K = 2\). After noting that this special case is tackled very well by prior-posterior-ratio (PPR) martingale confidence sequences \citep{waudby-ramdas-ppr}, we propose a generalisation to mode estimation, in which \(\mathcal{P}\) may take \(K \geq 2\) values. To begin, we show that the "one-versus-one" principle to generalise from \(K = 2\) to \(K \geq 2\) classes is more efficient than the "one-versus-rest" alternative. We then prove that our resulting stopping rule, denoted PPR-1v1, is asymptotically optimal (as the mistake probability is taken to \(0\)). PPR-1v1 is parameter-free and computationally light, and incurs significantly fewer samples than competitors even in the non-asymptotic regime. We demonstrate its gains in two practical applications of sampling: election forecasting and verification of smart contracts in blockchains.
ISSN:2331-8422