Reward Attack on Stochastic Bandits with Non-Stationary Rewards

Bibliographic Details
Main Authors:	Yang, Chenye; Liu, Guanlin; Lai, Lifeng
Format:	Conference Proceeding
Language:	English
Subjects:	attack cost; bandit; Computers; Costs; Force; Multi-armed bandit problem; non-stationary reward; Simulation; Stochastic processes; Uncertainty

cited_by
cites
container_end_page	1393
container_issue
container_start_page	1387
container_title
container_volume
creator	Yang, Chenye; Liu, Guanlin; Lai, Lifeng
description	In this paper, we investigate reward attacks on stochastic multi-armed bandit algorithms in non-stationary environments. The attacker's goal is to force the victim algorithm to choose a suboptimal arm most of the time while incurring a small attack cost. Three main attack scenarios are considered: an easy attack scenario, a general attack scenario, and a general attack scenario with limited information about the victim algorithm. These scenarios make different assumptions about the environment and the information accessible to the attacker. We propose three attack strategies, one for each scenario, and prove that they are successful in terms of expected target arm selection and attack cost. The simulation results validate our theoretical analysis.
doi_str_mv	10.1109/IEEECONF59524.2023.10476992
format	conference_proceeding
fullrecord	<record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10476992</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10476992</ieee_id><sourcerecordid>10476992</sourcerecordid><originalsourceid>FETCH-LOGICAL-i204t-46072a71cb1a6ca7722bcc8f3e53345a44fe928c9fba733d3c4722e322792db53</originalsourceid><addsrcrecordid>eNo1j11LwzAYhaMgOOf-gRcBrzuT902a5kpm6XQwNnB6Pd6mKYsfrTSB4b9fYXr13JzncA5j91LMpRT2YVVVVbndLLXVoOYgAOdSKJNbCxdsZo0tUAsEbZS6ZJOReQYo8JrdxPghxCgUMGGPr_5IQ8MXKZH75H3Hd6l3B4opOP5EXRNS5MeQDnzTd9kuUQp9R8MvP3vxll219BX97I9T9r6s3sqXbL19XpWLdRZAqJSpXBggI10tKXdkDEDtXNGi14hKk1Ktt1A429ZkEBt0aox4BDAWmlrjlN2de4P3fv8zhO9xxP7_MJ4AEq9LLw</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Reward Attack on Stochastic Bandits with Non-Stationary Rewards</title><source>IEEE Xplore All Conference Series</source><creator>Yang, Chenye ; Liu, Guanlin ; Lai, Lifeng</creator><creatorcontrib>Yang, Chenye ; Liu, Guanlin ; Lai, Lifeng</creatorcontrib><description>In this paper, we investigate rewards attacks on stochastic multi-armed bandit algorithms with non-stationary environment. The attacker's goal is to force the victim algorithm to choose a suboptimal arm most of the time while incurring a small attack cost. Three main attack scenarios are considered: easy attack scenario, general attack scenario, and general attack scenario with limited information of victim algorithm. These scenarios have different assumptions about the environment and accessible information. We propose three attack strategies, one for each considered scenario, and prove that they are successful in terms of expected target arm selection and attack cost. 
The simulation results validate our theoretical analysis.</description><identifier>EISSN: 2576-2303</identifier><identifier>EISBN: 9798350325744</identifier><identifier>DOI: 10.1109/IEEECONF59524.2023.10476992</identifier><language>eng</language><publisher>IEEE</publisher><subject>attack cost ; bandit ; Computers ; Costs ; Force ; Multi-armed bandit problem ; non-stationary reward ; Simulation ; Stochastic processes ; Uncertainty</subject><ispartof>2023 57th Asilomar Conference on Signals, Systems, and Computers, 2023, p.1387-1393</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10476992$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,780,784,789,790,27925,54555,54932</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10476992$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Yang, Chenye</creatorcontrib><creatorcontrib>Liu, Guanlin</creatorcontrib><creatorcontrib>Lai, Lifeng</creatorcontrib><title>Reward Attack on Stochastic Bandits with Non-Stationary Rewards</title><title>2023 57th Asilomar Conference on Signals, Systems, and Computers</title><addtitle>IEEECONF</addtitle><description>In this paper, we investigate rewards attacks on stochastic multi-armed bandit algorithms with non-stationary environment. The attacker's goal is to force the victim algorithm to choose a suboptimal arm most of the time while incurring a small attack cost. Three main attack scenarios are considered: easy attack scenario, general attack scenario, and general attack scenario with limited information of victim algorithm. These scenarios have different assumptions about the environment and accessible information. 
We propose three attack strategies, one for each considered scenario, and prove that they are successful in terms of expected target arm selection and attack cost. The simulation results validate our theoretical analysis.</description><subject>attack cost</subject><subject>bandit</subject><subject>Computers</subject><subject>Costs</subject><subject>Force</subject><subject>Multi-armed bandit problem</subject><subject>non-stationary reward</subject><subject>Simulation</subject><subject>Stochastic processes</subject><subject>Uncertainty</subject><issn>2576-2303</issn><isbn>9798350325744</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2023</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNo1j11LwzAYhaMgOOf-gRcBrzuT902a5kpm6XQwNnB6Pd6mKYsfrTSB4b9fYXr13JzncA5j91LMpRT2YVVVVbndLLXVoOYgAOdSKJNbCxdsZo0tUAsEbZS6ZJOReQYo8JrdxPghxCgUMGGPr_5IQ8MXKZH75H3Hd6l3B4opOP5EXRNS5MeQDnzTd9kuUQp9R8MvP3vxll219BX97I9T9r6s3sqXbL19XpWLdRZAqJSpXBggI10tKXdkDEDtXNGi14hKk1Ktt1A429ZkEBt0aox4BDAWmlrjlN2de4P3fv8zhO9xxP7_MJ4AEq9LLw</recordid><startdate>20231029</startdate><enddate>20231029</enddate><creator>Yang, Chenye</creator><creator>Liu, Guanlin</creator><creator>Lai, Lifeng</creator><general>IEEE</general><scope>6IE</scope><scope>6IH</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIO</scope></search><sort><creationdate>20231029</creationdate><title>Reward Attack on Stochastic Bandits with Non-Stationary Rewards</title><author>Yang, Chenye ; Liu, Guanlin ; Lai, Lifeng</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i204t-46072a71cb1a6ca7722bcc8f3e53345a44fe928c9fba733d3c4722e322792db53</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2023</creationdate><topic>attack cost</topic><topic>bandit</topic><topic>Computers</topic><topic>Costs</topic><topic>Force</topic><topic>Multi-armed bandit 
problem</topic><topic>non-stationary reward</topic><topic>Simulation</topic><topic>Stochastic processes</topic><topic>Uncertainty</topic><toplevel>online_resources</toplevel><creatorcontrib>Yang, Chenye</creatorcontrib><creatorcontrib>Liu, Guanlin</creatorcontrib><creatorcontrib>Lai, Lifeng</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan (POP) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP) 1998-present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Yang, Chenye</au><au>Liu, Guanlin</au><au>Lai, Lifeng</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Reward Attack on Stochastic Bandits with Non-Stationary Rewards</atitle><btitle>2023 57th Asilomar Conference on Signals, Systems, and Computers</btitle><stitle>IEEECONF</stitle><date>2023-10-29</date><risdate>2023</risdate><spage>1387</spage><epage>1393</epage><pages>1387-1393</pages><eissn>2576-2303</eissn><eisbn>9798350325744</eisbn><abstract>In this paper, we investigate rewards attacks on stochastic multi-armed bandit algorithms with non-stationary environment. The attacker's goal is to force the victim algorithm to choose a suboptimal arm most of the time while incurring a small attack cost. Three main attack scenarios are considered: easy attack scenario, general attack scenario, and general attack scenario with limited information of victim algorithm. These scenarios have different assumptions about the environment and accessible information. We propose three attack strategies, one for each considered scenario, and prove that they are successful in terms of expected target arm selection and attack cost. 
The simulation results validate our theoretical analysis.</abstract><pub>IEEE</pub><doi>10.1109/IEEECONF59524.2023.10476992</doi><tpages>7</tpages></addata></record>
fulltext	fulltext_linktorsrc
identifier	EISSN: 2576-2303
ispartof	2023 57th Asilomar Conference on Signals, Systems, and Computers, 2023, p.1387-1393
issn	2576-2303
language	eng
recordid	cdi_ieee_primary_10476992
source	IEEE Xplore All Conference Series
subjects	attack cost; bandit; Computers; Costs; Force; Multi-armed bandit problem; non-stationary reward; Simulation; Stochastic processes; Uncertainty
title	Reward Attack on Stochastic Bandits with Non-Stationary Rewards
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-02T03%3A03%3A39IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Reward%20Attack%20on%20Stochastic%20Bandits%20with%20Non-Stationary%20Rewards&rft.btitle=2023%2057th%20Asilomar%20Conference%20on%20Signals,%20Systems,%20and%20Computers&rft.au=Yang,%20Chenye&rft.date=2023-10-29&rft.spage=1387&rft.epage=1393&rft.pages=1387-1393&rft.eissn=2576-2303&rft_id=info:doi/10.1109/IEEECONF59524.2023.10476992&rft.eisbn=9798350325744&rft_dat=%3Cieee_CHZPO%3E10476992%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i204t-46072a71cb1a6ca7722bcc8f3e53345a44fe928c9fba733d3c4722e322792db53%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10476992&rfr_iscdi=true