ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination
While large language models (LLMs) exhibit significant utility across various domains, they simultaneously are susceptible to exploitation for unethical purposes, including academic misconduct and dissemination of misinformation. Consequently, AI-generated text detection systems have emerged as a countermeasure.
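The back-translation manipulation described in this record's abstract (translate AI-generated English text through several pivot languages, then back to English, and combine the variants) can be sketched roughly as follows. This is a minimal structural sketch, not the paper's implementation: the `translate` function is a hypothetical stand-in for a real machine-translation model or API, and the function names are illustrative.

```python
# Sketch of the back-translation evasion pipeline from the abstract.
# `translate` is a toy stand-in: it only tags the text with each
# translation hop so the pipeline structure is visible. A real pipeline
# would call a machine-translation model here.

def translate(text: str, src: str, tgt: str) -> str:
    return f"{text} [{src}->{tgt}]"

def back_translate(text: str, pivot: str) -> str:
    """English -> pivot language -> English."""
    forward = translate(text, "en", pivot)
    return translate(forward, pivot, "en")

def manipulate(text: str, pivots: list[str]) -> list[str]:
    """Produce one back-translated variant per pivot language; the paper
    then combines these variants into a single manipulated text."""
    return [back_translate(text, p) for p in pivots]

variants = manipulate("AI-generated sample.", ["fr", "de", "zh"])
```

With a real translator in place of the stub, each variant preserves the original semantics while perturbing surface wording, which is what degrades the detectors' true positive rate.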
Published in: | arXiv.org 2024-09 |
---|---|
Main Authors: | Ayoobi, Navid; Knab, Lily; Cheng, Wen; Pantoja, David; Alikhani, Hamidreza; Flamant, Sylvain; Kim, Jin; Mukherjee, Arjun |
Format: | Article |
Language: | English |
Subjects: | Datasets; Detectors; Large language models; Robustness; Semantics; Sensors; Texts; Translating |
Online Access: | Get full text |
container_title | arXiv.org |
---|---|
creator | Ayoobi, Navid; Knab, Lily; Cheng, Wen; Pantoja, David; Alikhani, Hamidreza; Flamant, Sylvain; Kim, Jin; Mukherjee, Arjun |
description | While large language models (LLMs) exhibit significant utility across various domains, they simultaneously are susceptible to exploitation for unethical purposes, including academic misconduct and dissemination of misinformation. Consequently, AI-generated text detection systems have emerged as a countermeasure. However, these detection mechanisms demonstrate vulnerability to evasion techniques and lack robustness against textual manipulations. This paper introduces back-translation as a novel technique for evading detection, underscoring the need to enhance the robustness of current detection systems. The proposed method involves translating AI-generated text through multiple languages before back-translating to English. We present a model that combines these back-translated texts to produce a manipulated version of the original AI-generated text. Our findings demonstrate that the manipulated text retains the original semantics while significantly reducing the true positive rate (TPR) of existing detection methods. We evaluate this technique on nine AI detectors, including six open-source and three proprietary systems, revealing their susceptibility to back-translation manipulation. In response to the identified shortcomings of existing AI text detectors, we present a countermeasure to improve the robustness against this form of manipulation. Our results indicate that the TPR of the proposed method declines by only 1.85% after back-translation manipulation. Furthermore, we build a large dataset of 720k texts using eight different LLMs. Our dataset contains both human-authored and LLM-generated texts in various domains and writing styles to assess the performance of our method and existing detectors. This dataset is publicly shared for the benefit of the research community. |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2024-09 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_3108869875 |
source | Publicly Available Content Database |
subjects | Datasets; Detectors; Large language models; Robustness; Semantics; Sensors; Texts; Translating |
title | ESPERANTO: Evaluating Synthesized Phrases to Enhance Robustness in AI Detection for Text Origination |