Interior-Point Methods for Full-Information and Bandit Online Learning

Bibliographic Details
Published in: IEEE Transactions on Information Theory, 2012-07, Vol. 58 (7), p. 4164-4175
Main Authors: Abernethy, Jacob D.; Hazan, Elad; Rakhlin, Alexander
Format: Article
Language: English
Description
Summary: We study the problem of predicting individual sequences with linear loss with full and partial (or bandit) feedback. Our main contribution is the first efficient algorithm for the problem of online linear optimization in the bandit setting which achieves the optimal Õ(√T) regret. In addition, for the full-information setting, we give a novel regret minimization algorithm. These results are made possible by the introduction of interior-point methods for convex optimization to online learning.
ISSN: 0018-9448, 1557-9654
DOI: 10.1109/TIT.2012.2192096