On the asymptotic optimality of greedy index heuristics for multi-action restless bandits

Bibliographic Details
Published in:	Advances in applied probability 2015-09, Vol.47 (3), p.652-667
Main Authors:	Hodge, D. J., Glazebrook, K. D.
Format:	Article
Language:	English
Subjects:	49L20; 49M20; 90C40; 93E20; Activation; Asymptotic methods; asymptotic optimality; Asymptotic properties; Constraints; Differential equations; General Applied Probability; Heuristic; Index heuristic; Lagrange multiplier; multi-action restless bandit; Optimization; Performance indices; Probability; Resource allocation; stochastic resource allocation; Studies

Description
Summary:	The class of restless bandits proposed by Whittle (1988) has long been known to be intractable. This paper presents an optimality result which extends that of Weber and Weiss (1990) for restless bandits to a more general setting in which individual bandits have multiple levels of activation but are subject to an overall resource constraint. The contribution is motivated by the recent works of Glazebrook et al. (2011a), (2011b), who discussed the performance of index heuristics for resource allocation in such systems. Hitherto, index heuristics have been shown, under a condition of full indexability, to be optimal for a natural Lagrangian relaxation of such problems in which a resource is purchased rather than constrained. We find that, under key assumptions about the nature of solutions to a deterministic differential equation, the index heuristics above are asymptotically optimal in a sense described by Whittle. We then demonstrate that these assumptions always hold for three-state bandits.
ISSN:	0001-8678 1475-6064
DOI:	10.1239/aap/1444308876