A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities

Hate speech online remains an understudied issue for marginalized communities, and has seen rising relevance, especially in the Global South, which includes developing societies with increasing internet penetration. In this paper, we aim to provide marginalized communities living in societies where...

Bibliographic Details
Published in: arXiv.org 2024-12
Main Authors: Ye, Haotian, Wisiorek, Axel, Maronikolakis, Antonis, Alaçam, Özge, Schütze, Hinrich
Format: Article
Language: English
Subjects: Datasets; Federated learning; Hate speech; Internet; Languages; Privacy; Target detection
container_title arXiv.org
creator Ye, Haotian
Wisiorek, Axel
Maronikolakis, Antonis
Alaçam, Özge
Schütze, Hinrich
description Hate speech online remains an understudied issue for marginalized communities, and has seen rising relevance, especially in the Global South, which includes developing societies with increasing internet penetration. In this paper, we aim to provide marginalized communities living in societies where the dominant language is low-resource with a privacy-preserving tool to protect themselves from hate speech on the internet by filtering offensive content in their native languages. Our contribution in this paper is twofold: 1) we release REACT (REsponsive hate speech datasets Across ConTexts), a collection of high-quality, culture-specific hate speech detection datasets comprising seven distinct target groups in eight low-resource languages, curated by experienced data collectors; 2) we propose a solution to few-shot hate speech detection utilizing federated learning (FL), a privacy-preserving and collaborative learning approach, to continuously improve a central model that exhibits robustness when tackling different target groups and languages. By keeping the training local to the users' devices, we ensure the privacy of the users' data while benefitting from the efficiency of federated learning. Furthermore, we personalize client models to target-specific training data and evaluate their performance. Our results indicate the effectiveness of FL across different target groups, whereas the benefits of personalization on few-shot learning are not clear.
format article
date 2024-12-06
oa free_for_read
publisher Ithaca: Cornell University Library, arXiv.org
rights 2024. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
fulltext fulltext
identifier EISSN: 2331-8422
ispartof arXiv.org, 2024-12
issn 2331-8422
language eng
recordid cdi_proquest_journals_3142373769
source Publicly Available Content Database
subjects Datasets
Federated learning
Hate speech
Internet
Languages
Privacy
Target detection
title A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T18%3A23%3A17IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=document&rft.atitle=A%20Federated%20Approach%20to%20Few-Shot%20Hate%20Speech%20Detection%20for%20Marginalized%20Communities&rft.jtitle=arXiv.org&rft.au=Ye,%20Haotian&rft.date=2024-12-06&rft.eissn=2331-8422&rft_id=info:doi/&rft_dat=%3Cproquest%3E3142373769%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-proquest_journals_31423737693%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3142373769&rft_id=info:pmid/&rfr_iscdi=true