ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination
While large language models (LLMs) exhibit significant utility across various domains, they simultaneously are susceptible to exploitation for unethical purposes, including academic misconduct and dissemination of misinformation. Consequently, AI-generated text detection systems have emerged as a countermeasure.
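The back-translation manipulation described in this record's abstract (translate AI-generated English text through several pivot languages, then back to English, and combine the variants) can be sketched roughly as follows. This is a minimal structural sketch, not the paper's implementation: the `translate` function is a hypothetical stand-in for a real machine-translation model or API, and the function names are illustrative.

```python
# Sketch of the back-translation evasion pipeline from the abstract.
# `translate` is a toy stand-in: it only tags the text with each
# translation hop so the pipeline structure is visible. A real pipeline
# would call a machine-translation model here.

def translate(text: str, src: str, tgt: str) -> str:
    return f"{text} [{src}->{tgt}]"

def back_translate(text: str, pivot: str) -> str:
    """English -> pivot language -> English."""
    forward = translate(text, "en", pivot)
    return translate(forward, pivot, "en")

def manipulate(text: str, pivots: list[str]) -> list[str]:
    """Produce one back-translated variant per pivot language; the paper
    then combines these variants into a single manipulated text."""
    return [back_translate(text, p) for p in pivots]

variants = manipulate("AI-generated sample.", ["fr", "de", "zh"])
```

With a real translator in place of the stub, each variant preserves the original semantics while perturbing surface wording, which is what degrades the detectors' true positive rate.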
Published in: | arXiv.org 2024-09 |
---|---|
Main Authors: | Ayoobi, Navid; Knab, Lily; Cheng, Wen; Pantoja, David; Alikhani, Hamidreza; Flamant, Sylvain; Kim, Jin; Mukherjee, Arjun |
Format: | Article |
Language: | English |
Subjects: | Datasets; Detectors; Large language models; Robustness; Semantics; Sensors; Texts; Translating |
Online Access: | Get full text |
container_title | arXiv.org |
---|---|
creator | Ayoobi, Navid; Knab, Lily; Cheng, Wen; Pantoja, David; Alikhani, Hamidreza; Flamant, Sylvain; Kim, Jin; Mukherjee, Arjun |
description | While large language models (LLMs) exhibit significant utility across various domains, they simultaneously are susceptible to exploitation for unethical purposes, including academic misconduct and dissemination of misinformation. Consequently, AI-generated text detection systems have emerged as a countermeasure. However, these detection mechanisms demonstrate vulnerability to evasion techniques and lack robustness against textual manipulations. This paper introduces back-translation as a novel technique for evading detection, underscoring the need to enhance the robustness of current detection systems. The proposed method involves translating AI-generated text through multiple languages before back-translating to English. We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text. Our findings demonstrate that the manipulated text retains the original semantics while significantly reducing the true positive rate (TPR) of existing detection methods. We evaluate this technique on nine AI detectors, including six open-source and three proprietary systems, revealing their susceptibility to back-translation manipulation. In response to the identified shortcomings of existing AI text detectors, we present a countermeasure to improve the robustness against this form of manipulation. Our results indicate that the TPR of the proposed method declines by only 1.85% after back-translation manipulation. Furthermore, we build a large dataset of 720k texts using eight different LLMs. Our dataset contains both human-authored and LLM-generated texts in various domains and writing styles to assess the performance of our method and existing detectors. This dataset is publicly shared for the benefit of the research community. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-09 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3108869875 |
source | Publicly Available Content Database |
subjects | Datasets; Detectors; Large language models; Robustness; Semantics; Sensors; Texts; Translating |
title | ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination |