Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts
In the rapidly advancing field of Artificial Intelligence (AI), this study presents a critical evaluation of the resilience and cybersecurity efficacy of leading AI models, including ChatGPT-4, Bard, Claude, and Microsoft Copilot. Central to this research are innovative adversarial prompts designed to rigorously test the content moderation capabilities of these AI systems. This study introduces new adversarial tests and the Response Quality Score (RQS), a metric specifically developed to assess the nuances of AI responses. Additionally, the research spotlights FreedomGPT, an AI tool engineered to optimize the alignment between user intent and AI interpretation. The empirical results from this investigation are pivotal for assessing AI models’ current robustness and security. They highlight the necessity for ongoing development and meticulous testing to bolster AI defenses against various adversarial challenges. Notably, this study also delves into the ethical and societal implications of employing advanced “jailbreak” techniques in AI testing. The findings are significant for understanding AI vulnerabilities and formulating strategies to enhance AI technologies’ reliability and ethical soundness, paving the way for safer and more secure AI applications.
Published in: | Electronics (Basel) 2024-03, Vol.13 (5), p.842 |
---|---|
Main Authors: | Hannon, Brendan; Kumar, Yulia; Gayle, Dejaun; Li, J. Jenny; Morreale, Patricia |
Format: | Article |
Language: | English |
DOI: | 10.3390/electronics13050842 |
Subjects: | Algorithms; Analysis; Artificial intelligence; Chatbots; Cybersecurity; Data security; Ethics; Genre; Internet; Language; Law enforcement; Motion pictures; Reliability engineering; Resilience; Safety and security measures; Scripts; Testing |
creator | Hannon, Brendan; Kumar, Yulia; Gayle, Dejaun; Li, J. Jenny; Morreale, Patricia |
description | In the rapidly advancing field of Artificial Intelligence (AI), this study presents a critical evaluation of the resilience and cybersecurity efficacy of leading AI models, including ChatGPT-4, Bard, Claude, and Microsoft Copilot. Central to this research are innovative adversarial prompts designed to rigorously test the content moderation capabilities of these AI systems. This study introduces new adversarial tests and the Response Quality Score (RQS), a metric specifically developed to assess the nuances of AI responses. Additionally, the research spotlights FreedomGPT, an AI tool engineered to optimize the alignment between user intent and AI interpretation. The empirical results from this investigation are pivotal for assessing AI models’ current robustness and security. They highlight the necessity for ongoing development and meticulous testing to bolster AI defenses against various adversarial challenges. Notably, this study also delves into the ethical and societal implications of employing advanced “jailbreak” techniques in AI testing. The findings are significant for understanding AI vulnerabilities and formulating strategies to enhance AI technologies’ reliability and ethical soundness, paving the way for safer and more secure AI applications. |
doi_str_mv | 10.3390/electronics13050842 |
format | article |
publisher | Basel: MDPI AG |
rights | COPYRIGHT 2024 MDPI AG; 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
orcidid | 0000-0002-7621-2734; 0000-0002-7954-2122 |
fulltext | fulltext |
identifier | ISSN: 2079-9292 |
ispartof | Electronics (Basel), 2024-03, Vol.13 (5), p.842 |
issn | 2079-9292; 2079-9292 |
language | eng |
recordid | cdi_proquest_journals_2955509973 |
source | Publicly Available Content Database (Proquest) (PQ_SDU_P3) |
subjects | Algorithms; Analysis; Artificial intelligence; Chatbots; Cybersecurity; Data security; Ethics; Genre; Internet; Language; Law enforcement; Motion pictures; Reliability engineering; Resilience; Safety and security measures; Scripts; Testing |
title | Robust Testing of AI Language Model Resiliency with Novel Adversarial Prompts |