
A Cognitive Jamming Decision-Making Method Based on Heuristic Improved A2C Algorithm


Bibliographic Details
Published in: IEEE transactions on vehicular technology, 2024-10, p.1-14
Main Authors: Zhang, Chudi; Yang, Biao; Wang, Lei; Ji, Wenshuai; Wang, Lulu; Xu, Shiyou
Format: Article
Language: English
cited_by
cites
container_end_page 14
container_issue
container_start_page 1
container_title IEEE transactions on vehicular technology
container_volume
creator Zhang, Chudi
Yang, Biao
Wang, Lei
Ji, Wenshuai
Wang, Lulu
Xu, Shiyou
description Cognitive electronic warfare (CEW) has received increasing attention and is widely expected to play a significant role. Cognitive jamming decision-making, one of the critical technologies of CEW, dramatically impacts the global battlefield situation. In this paper, we introduce the A2C algorithm into cognitive jamming decision-making and propose a heuristic improved A2C algorithm. The Actor-Critic algorithm fuses DQN and Policy Gradient: compared with DQN, it offers iterative policy updates and high efficiency in complex spaces; compared with Policy Gradient, it converges quickly. However, it suffers from high variance, which makes convergence difficult. To address these issues, we first establish a cognitive jamming decision-making model. Then, we develop an improved A2C algorithm by introducing a baseline and dueling networks. The baseline reduces the variance of the critic network, while the dueling networks further decrease variance, enhancing the convergence of the A2C algorithm. Additionally, the improved A2C algorithm does not rely on prior information and enhances the adaptive capability of the jammer when interacting with the target radar. We conducted numerical simulations based on the designed cognitive jamming decision-making model. The results show that, compared with four algorithms (DQN, Policy Gradient, Actor-Critic, and A2C), the improved A2C algorithm increases convergence speed by 50%, 57.8%, 34.5%, and 13.64%, respectively, verifying its performance. Finally, we introduce a heuristic reward function and propose the heuristic improved A2C algorithm, which improves convergence speed by 31.58% over the improved A2C algorithm. The simulation results show that the algorithm can greatly improve our advantage in electronic countermeasures.
doi_str_mv 10.1109/TVT.2024.3470832
format article
fulltext fulltext
identifier ISSN: 0018-9545
ispartof IEEE transactions on vehicular technology, 2024-10, p.1-14
issn 0018-9545
1939-9359
language eng
recordid cdi_ieee_primary_10730796
source IEEE Xplore (Online service)
subjects Actor-Critic
Adaptation models
Airborne radar
heuristic improved A2C
Artificial intelligence
Cognitive electronic warfare
Cognitive radar
Convergence
Decision making
Heuristic algorithms
Jamming
jamming decision-making
Q-learning
Radar
reinforcement learning
title A Cognitive Jamming Decision-Making Method Based on Heuristic Improved A2C Algorithm
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T12%3A30%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Cognitive%20Jamming%20Decision-Making%20Method%20Based%20on%20Heuristic%20Improved%20A2C%20Algorithm&rft.jtitle=IEEE%20transactions%20on%20vehicular%20technology&rft.au=Zhang,%20Chudi&rft.date=2024-10-22&rft.spage=1&rft.epage=14&rft.pages=1-14&rft.issn=0018-9545&rft.eissn=1939-9359&rft.coden=ITVTAB&rft_id=info:doi/10.1109/TVT.2024.3470832&rft_dat=%3Ccrossref_ieee_%3E10_1109_TVT_2024_3470832%3C/crossref_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c626-b3c31a5c2217e0e93e9c516d78aba490e546813322c997bbfb74dd0e6a7258f93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10730796&rfr_iscdi=true