Distributed Online Stochastic-Constrained Convex Optimization With Bandit Feedback
Published in: IEEE Transactions on Cybernetics, 2024-01, Vol. 54 (1), pp. 63-75
Main Authors:
Format: Article
Language: English
Summary: This article studies the distributed online stochastic convex optimization problem with time-varying constraints over a multiagent system. The sequences of cost functions and constraint functions, both of which have dynamic parameters drawn from time-varying distributions, are unknown to the agents in advance. Agents in the network interact with their neighbors over a sequence of strongly connected, time-varying graphs. We develop the adaptive distributed bandit primal-dual algorithm, whose step-size and regularization sequences are adaptive and require no prior knowledge of the total iteration span T. The algorithm uses bandit feedback with a one-point or two-point gradient estimator to evaluate gradient values. We show that if the drift of the benchmark sequence is sublinear, then the adaptive distributed bandit primal-dual algorithm achieves sublinear expected dynamic regret and constraint violation with either kind of gradient estimator. A numerical experiment illustrates the performance of the proposed method.
ISSN: 2168-2267, 2168-2275
DOI: 10.1109/TCYB.2022.3177644
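As a rough illustration of the bandit feedback described in the summary, the sketch below shows standard one-point and two-point spherical gradient estimators, which query only function values rather than gradients. This is a minimal sketch under generic assumptions; the paper's exact estimator, scaling, smoothing parameter, and primal-dual updates are not reproduced here, and the function names and `delta` parameter are illustrative.

```python
import numpy as np

def one_point_gradient_estimate(f, x, delta, rng=None):
    """One-point bandit estimator: a single function query per round.

    Returns (d / delta) * f(x + delta * u) * u, where u is drawn
    uniformly from the unit sphere (a standard construction; the
    paper's exact estimator may differ).
    """
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)  # uniform direction on the unit sphere
    return (d / delta) * f(x + delta * u) * u

def two_point_gradient_estimate(f, x, delta, rng=None):
    """Two-point bandit estimator: two function queries per round,
    typically giving lower-variance estimates than the one-point rule."""
    rng = rng or np.random.default_rng()
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

# Minimal usage: estimate the gradient of f(z) = ||z||^2 (true gradient 2z).
if __name__ == "__main__":
    f = lambda z: float(np.dot(z, z))
    x = np.array([1.0, -2.0, 0.5])
    print(two_point_gradient_estimate(f, x, delta=1e-3))
```

In a distributed primal-dual scheme of the kind the abstract describes, each agent would plug such an estimate in place of the true gradient in its local update and then average with its neighbors over the time-varying graph; that surrounding loop is omitted here.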