
Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts

In the rapidly advancing field of Artificial Intelligence (AI), this study presents a critical evaluation of the resilience and cybersecurity efficacy of leading AI models, including ChatGPT-4, Bard, Claude, and Microsoft Copilot. Central to this research are innovative adversarial prompts designed to rigorously test the content moderation capabilities of these AI systems. This study introduces new adversarial tests and the Response Quality Score (RQS), a metric specifically developed to assess the nuances of AI responses. Additionally, the research spotlights FreedomGPT, an AI tool engineered to optimize the alignment between user intent and AI interpretation. The empirical results from this investigation are pivotal for assessing AI models’ current robustness and security. They highlight the necessity for ongoing development and meticulous testing to bolster AI defenses against various adversarial challenges. Notably, this study also delves into the ethical and societal implications of employing advanced “jailbreak” techniques in AI testing. The findings are significant for understanding AI vulnerabilities and formulating strategies to enhance AI technologies’ reliability and ethical soundness, paving the way for safer and more secure AI applications.

Bibliographic Details
Published in: Electronics (Basel), 2024-03, Vol. 13 (5), p. 842
Main Authors: Hannon, Brendan; Kumar, Yulia; Gayle, Dejaun; Li, J. Jenny; Morreale, Patricia
Format: Article
Language: English
Publisher: Basel: MDPI AG
DOI: 10.3390/electronics13050842
ISSN: 2079-9292
EISSN: 2079-9292
Subjects: Algorithms; Analysis; Artificial intelligence; Chatbots; Cybersecurity; Data security; Ethics; Genre; Internet; Language; Law enforcement; Motion pictures; Reliability engineering; Resilience; Safety and security measures; Scripts; Testing