A Cognitive Jamming Decision-Making Method Based on Heuristic Improved A2C Algorithm
Published in: | IEEE transactions on vehicular technology 2024-10, p.1-14 |
---|---|
Main Authors: | Zhang, Chudi; Yang, Biao; Wang, Lei; Ji, Wenshuai; Wang, Lulu; Xu, Shiyou |
Format: | Article |
Language: | English |
Subjects: | |
cited_by | |
---|---|
cites | |
container_end_page | 14 |
container_issue | |
container_start_page | 1 |
container_title | IEEE transactions on vehicular technology |
container_volume | |
creator | Zhang, Chudi; Yang, Biao; Wang, Lei; Ji, Wenshuai; Wang, Lulu; Xu, Shiyou |
description | Cognitive electronic warfare (CEW) is receiving increasing attention and is widely expected to play a significant role. Cognitive jamming decision-making, one of the critical technologies of CEW, strongly affects the overall battlefield situation. In this paper, we introduce the A2C algorithm into cognitive jamming decision-making and propose a heuristic improved A2C algorithm. The Actor-Critic algorithm fuses DQN and Policy Gradient: compared with DQN, it updates policies iteratively and is highly efficient in complex spaces; compared with Policy Gradient, it converges quickly. However, it suffers from high variance, which makes convergence difficult. To address this issue, we first establish a cognitive jamming decision-making model. We then develop an improved A2C algorithm by introducing a baseline and dueling networks. The baseline reduces the variance of the critic network, and the dueling networks reduce it further, which improves the convergence of the A2C algorithm. In addition, the improved A2C algorithm does not rely on prior information and enhances the adaptive capability of the jammer when it interacts with the target radar. We conducted numerical simulations based on the designed cognitive jamming decision-making model. The results show that, compared with DQN, Policy Gradient, Actor-Critic, and A2C, the improved A2C algorithm improves convergence speed by 50%, 57.8%, 34.5%, and 13.64%, respectively, which verifies its performance. Finally, we introduce a heuristic reward function and propose the heuristic improved A2C algorithm, which improves convergence speed by a further 31.58% over the improved A2C algorithm. The simulation results show that the algorithm can greatly improve our advantage in electronic countermeasures. |
doi_str_mv | 10.1109/TVT.2024.3470832 |
format | article |
fulltext | fulltext |
identifier | ISSN: 0018-9545 |
ispartof | IEEE transactions on vehicular technology, 2024-10, p.1-14 |
issn | 0018-9545 1939-9359 |
language | eng |
recordid | cdi_ieee_primary_10730796 |
source | IEEE Xplore (Online service) |
subjects | Actor-Critic; Adaptation models; Airborne radar; heuristic improved A2C; Artificial intelligence; Cognitive electronic warfare; Cognitive radar; Convergence; Decision making; Heuristic algorithms; Jamming; jamming decision-making; Q-learning; Radar; reinforcement learning |
title | A Cognitive Jamming Decision-Making Method Based on Heuristic Improved A2C Algorithm |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-28T12%3A30%3A56IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-crossref_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=A%20Cognitive%20Jamming%20Decision-Making%20Method%20Based%20on%20Heuristic%20Improved%20A2C%20Algorithm&rft.jtitle=IEEE%20transactions%20on%20vehicular%20technology&rft.au=Zhang,%20Chudi&rft.date=2024-10-22&rft.spage=1&rft.epage=14&rft.pages=1-14&rft.issn=0018-9545&rft.eissn=1939-9359&rft.coden=ITVTAB&rft_id=info:doi/10.1109/TVT.2024.3470832&rft_dat=%3Ccrossref_ieee_%3E10_1109_TVT_2024_3470832%3C/crossref_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c626-b3c31a5c2217e0e93e9c516d78aba490e546813322c997bbfb74dd0e6a7258f93%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10730796&rfr_iscdi=true |
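
The description names two variance-reduction ingredients, a baseline and dueling networks, without giving formulas. The sketch below illustrates the textbook forms of both ideas in plain Python: the advantage A_t = G_t - V(s_t), where the state value serves as the baseline, and the dueling aggregation Q(s, a) = V(s) + A(s, a) - mean A(s, ·). All function names, the discount factor, and the sample numbers are assumptions made for illustration; the record does not say how the authors combine these pieces, so this should not be read as their implementation.

```python
from statistics import mean


def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each step reuses the next step's return.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def advantages(returns, values):
    """Advantage A_t = G_t - V(s_t); V(s_t) plays the role of the baseline."""
    return [g - v for g, v in zip(returns, values)]


def dueling_q(state_value, action_advantages):
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a of A(s, a)."""
    centre = mean(action_advantages)
    return [state_value + a - centre for a in action_advantages]


# Hypothetical three-step episode; the numbers are illustrative only.
episode_returns = discounted_returns([1.0, 0.0, 2.0], gamma=0.5)
episode_advantages = advantages(episode_returns, [1.0, 1.0, 1.5])
q_values = dueling_q(2.0, [1.0, 0.0, -1.0])
```

Subtracting the state value leaves the expected policy gradient unchanged while shrinking the magnitude of the learning signal, which is the usual argument for why a baseline lowers variance. The mean-centring in `dueling_q` keeps the value and advantage streams identifiable.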