Byzantine-Resilient Decentralized Policy Evaluation With Linear Function Approximation
Published in: IEEE Transactions on Signal Processing, 2021, Vol. 69, pp. 3839-3853
Main Authors:
Format: Article
Language: English
Summary: In this paper, we consider the policy evaluation problem in reinforcement learning with agents on a decentralized and directed network. In order to evaluate the quality of a fixed policy in this decentralized setting, one option is for agents to run decentralized temporal-difference (TD) learning collaboratively. To account for practical scenarios where the state and action spaces are large and malicious attacks emerge, we focus on decentralized TD learning with linear function approximation in the presence of malicious agents (often termed Byzantine agents). We propose a trimmed mean-based Byzantine-resilient decentralized TD algorithm to perform policy evaluation in this setting. We establish the finite-time convergence rate, as well as the asymptotic learning error that depends on the number of Byzantine agents. Numerical experiments corroborate the robustness of the proposed algorithm.
ISSN: 1053-587X, 1941-0476
DOI: 10.1109/TSP.2021.3090952
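
The summary describes combining local TD(0) updates under linear value-function approximation with a trimmed-mean aggregation of parameters received from neighbors. The sketch below is a minimal illustration of that general idea, not the paper's algorithm: the function names, step size, discount factor, and random features are all assumptions made for the example.

```python
import numpy as np

def trimmed_mean(vectors, b):
    """Coordinate-wise trimmed mean: in each coordinate, drop the b largest
    and b smallest values, then average the remaining ones (requires 2b < len(vectors))."""
    stacked = np.sort(np.stack(vectors), axis=0)      # sort each coordinate independently
    return stacked[b:len(vectors) - b].mean(axis=0)   # average the middle values

def local_td_update(theta, phi_s, phi_s_next, reward, gamma, step_size):
    """One TD(0) step for the linear value estimate V(s) = phi(s)^T theta."""
    td_error = reward + gamma * phi_s_next @ theta - phi_s @ theta
    return theta + step_size * td_error * phi_s

def resilient_td_step(theta_local, neighbor_thetas, sample, b,
                      gamma=0.95, step_size=0.01):
    """One round for an honest agent: robustly aggregate neighbors' parameters
    with a trimmed mean, then take a local TD(0) step on the agent's own sample."""
    phi_s, reward, phi_s_next = sample
    aggregated = trimmed_mean([theta_local] + list(neighbor_thetas), b)
    return local_td_update(aggregated, phi_s, phi_s_next, reward, gamma, step_size)

# Illustrative usage with random features; some neighbors may be Byzantine.
d = 8
rng = np.random.default_rng(0)
theta = np.zeros(d)
neighbor_thetas = [rng.normal(size=d) for _ in range(6)]
sample = (rng.normal(size=d), 1.0, rng.normal(size=d))   # (phi(s), r, phi(s'))
theta = resilient_td_step(theta, neighbor_thetas, sample, b=1)
```

The trimmed mean is what provides the Byzantine resilience in this sketch: as long as the number of corrupted neighbor parameters per round is at most the trimming level b, extreme values injected by malicious agents are discarded coordinate-wise before the local TD update is applied.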