A KL-LUCB Bandit Algorithm for Large-Scale Crowdsourcing

Bibliographic Details
Published in:	arXiv.org 2017-09
Main Authors:	Mankoff, Bob; Nowak, Robert; Tanczos, Ervin
Format:	Article
Language:	English
Subjects:	Algorithms; Crowdsourcing

Description
Summary:	This paper focuses on best-arm identification in multi-armed bandits with bounded rewards. We develop an algorithm that is a fusion of lil-UCB and KL-LUCB, offering the best qualities of the two algorithms in one method. This is achieved by proving a novel anytime confidence bound for the mean of bounded distributions, which is the analogue of the LIL-type bounds recently developed for sub-Gaussian distributions. We corroborate our theoretical results with numerical experiments based on the New Yorker Cartoon Caption Contest.
ISSN:	2331-8422