Reward Attack on Stochastic Bandits with Non-Stationary Rewards


Bibliographic Details
Main Authors: Yang, Chenye, Liu, Guanlin, Lai, Lifeng
Format: Conference Proceeding
Language: English
Description
Summary: In this paper, we investigate reward attacks on stochastic multi-armed bandit algorithms in non-stationary environments. The attacker's goal is to force the victim algorithm to choose a suboptimal arm most of the time while incurring a small attack cost. Three main attack scenarios are considered: an easy attack scenario, a general attack scenario, and a general attack scenario with limited information about the victim algorithm. These scenarios make different assumptions about the environment and the information available to the attacker. We propose three attack strategies, one for each scenario, and prove that they are successful in terms of expected target arm selection and attack cost. The simulation results validate our theoretical analysis.
ISSN: 2576-2303
DOI: 10.1109/IEEECONF59524.2023.10476992