
Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets

Bibliographic Details
Published in:	ACM transactions on Asian and low-resource language information processing 2024-06
Main Authors:	Daouadi, Kheir Eddine; Boualleg, Yaakoub; Guehairia, Oussama
Format:	Article
Language:	English

cited_by
cites	cdi_FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3
container_end_page
container_issue
container_start_page
container_title	ACM transactions on Asian and low-resource language information processing
container_volume
creator	Daouadi, Kheir Eddine ; Boualleg, Yaakoub ; Guehairia, Oussama
description	Today, hate speech classification from Arabic tweets has gained significant interest among global researchers. Different techniques and systems are harnessed to overcome this classification task. However, two main challenges are confronted, the use of handcrafted features and the fact that their performance rate is still limited. We address the hate speech identification from Arabic tweets while providing a deeper comprehension of the capability of a new technique based on transfer learning. Specifically, the accuracy result of traditional machine learning (ML) models is compared with Pre-trained Language Models (PLMs) as well as Deep Learning (DL) models. Experiments on a benchmark dataset show that (1) the multidialectal PLMs outperform monolingual and multilingual ones; (2) the fine-tuning of recent PLMs enhances the performance results of hate speech classification from Arabic tweets. The major contribution of this work lies in achieving promising accuracy results in the Arabic hate speech classification task.
doi_str_mv	10.1145/3674970
format	article
fullrecord	<record><control><sourceid>acm_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1145_3674970</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3674970</sourcerecordid><originalsourceid>FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3</originalsourceid><addsrcrecordid>eNo9kEtPwzAQhC0EElWpuHPyjVNgHTt2c6zKo5WCQLT3aOOs26A2qWwD6r8n0JbT7Gi_mcMwdi3gTgiV3UttVG7gjA1SabJEGUjPT7fO80s2CuEDAIQyWoMYsPViHyJtMTaWz9svCrFZ9aZreef4O1lqI3_zlESPTUs1L7BdfeKK-EtX04a7zvMZRuKLHZFd8weKZP_iTcsnHqu-dvlNFMMVu3C4CTQ66pAtnx6X01lSvD7Pp5MiwbGCREJVS42pACtMTlVlcykdEpIUjmyNMqvGFlISWU0aUMtcWafIpNQHa5JDdnuotb4LwZMrd77Zot-XAsrficrjRD15cyDRbv-h0_MH7q1iEw</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets</title><source>Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)</source><creator>Daouadi, Kheir Eddine ; Boualleg, Yaakoub ; Guehairia, Oussama</creator><creatorcontrib>Daouadi, Kheir Eddine ; Boualleg, Yaakoub ; Guehairia, Oussama</creatorcontrib><description>Today, hate speech classification from Arabic tweets has gained significant interest among global researchers. Different techniques and systems are harnessed to overcome this classification task. However, two main challenges are confronted, the use of handcrafted features and the fact that their performance rate is still limited. We address the hate speech identification from Arabic tweets while providing a deeper comprehension of the capability of a new technique based on transfer learning. Specifically, the accuracy result of traditional machine learning (ML) models is compared with Pre-trained Language Models (PLMs) as well as Deep Learning (DL) models. 
Experiments on a benchmark dataset show that (1) the multidialectal PLMs outperform monolingual and multilingual ones; (2) the fine-tuning of recent PLMs enhances the performance results of hate speech classification from Arabic tweets. The major contribution of this work lies in achieving promising accuracy results in the Arabic hate speech classification task.</description><identifier>ISSN: 2375-4699</identifier><identifier>EISSN: 2375-4702</identifier><identifier>DOI: 10.1145/3674970</identifier><language>eng</language><publisher>New York, NY: ACM</publisher><subject>Do Not Use This Code, Generate the Correct Terms for Your Paper</subject><ispartof>ACM transactions on Asian and low-resource language information processing, 2024-06</ispartof><rights>Copyright held by the owner/author(s). Publication rights licensed to ACM.</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Daouadi, Kheir Eddine</creatorcontrib><creatorcontrib>Boualleg, Yaakoub</creatorcontrib><creatorcontrib>Guehairia, Oussama</creatorcontrib><title>Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets</title><title>ACM transactions on Asian and low-resource language information processing</title><addtitle>ACM TALLIP</addtitle><description>Today, hate speech classification from Arabic tweets has gained significant interest among global researchers. Different techniques and systems are harnessed to overcome this classification task. 
However, two main challenges are confronted, the use of handcrafted features and the fact that their performance rate is still limited. We address the hate speech identification from Arabic tweets while providing a deeper comprehension of the capability of a new technique based on transfer learning. Specifically, the accuracy result of traditional machine learning (ML) models is compared with Pre-trained Language Models (PLMs) as well as Deep Learning (DL) models. Experiments on a benchmark dataset show that (1) the multidialectal PLMs outperform monolingual and multilingual ones; (2) the fine-tuning of recent PLMs enhances the performance results of hate speech classification from Arabic tweets. The major contribution of this work lies in achieving promising accuracy results in the Arabic hate speech classification task.</description><subject>Do Not Use This Code, Generate the Correct Terms for Your Paper</subject><issn>2375-4699</issn><issn>2375-4702</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNo9kEtPwzAQhC0EElWpuHPyjVNgHTt2c6zKo5WCQLT3aOOs26A2qWwD6r8n0JbT7Gi_mcMwdi3gTgiV3UttVG7gjA1SabJEGUjPT7fO80s2CuEDAIQyWoMYsPViHyJtMTaWz9svCrFZ9aZreef4O1lqI3_zlESPTUs1L7BdfeKK-EtX04a7zvMZRuKLHZFd8weKZP_iTcsnHqu-dvlNFMMVu3C4CTQ66pAtnx6X01lSvD7Pp5MiwbGCREJVS42pACtMTlVlcykdEpIUjmyNMqvGFlISWU0aUMtcWafIpNQHa5JDdnuotb4LwZMrd77Zot-XAsrficrjRD15cyDRbv-h0_MH7q1iEw</recordid><startdate>20240625</startdate><enddate>20240625</enddate><creator>Daouadi, Kheir Eddine</creator><creator>Boualleg, Yaakoub</creator><creator>Guehairia, Oussama</creator><general>ACM</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20240625</creationdate><title>Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets</title><author>Daouadi, Kheir Eddine ; Boualleg, Yaakoub ; Guehairia, 
Oussama</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Do Not Use This Code, Generate the Correct Terms for Your Paper</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Daouadi, Kheir Eddine</creatorcontrib><creatorcontrib>Boualleg, Yaakoub</creatorcontrib><creatorcontrib>Guehairia, Oussama</creatorcontrib><collection>CrossRef</collection><jtitle>ACM transactions on Asian and low-resource language information processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Daouadi, Kheir Eddine</au><au>Boualleg, Yaakoub</au><au>Guehairia, Oussama</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets</atitle><jtitle>ACM transactions on Asian and low-resource language information processing</jtitle><stitle>ACM TALLIP</stitle><date>2024-06-25</date><risdate>2024</risdate><issn>2375-4699</issn><eissn>2375-4702</eissn><abstract>Today, hate speech classification from Arabic tweets has gained significant interest among global researchers. Different techniques and systems are harnessed to overcome this classification task. However, two main challenges are confronted, the use of handcrafted features and the fact that their performance rate is still limited. We address the hate speech identification from Arabic tweets while providing a deeper comprehension of the capability of a new technique based on transfer learning. Specifically, the accuracy result of traditional machine learning (ML) models is compared with Pre-trained Language Models (PLMs) as well as Deep Learning (DL) models. 
Experiments on a benchmark dataset show that (1) the multidialectal PLMs outperform monolingual and multilingual ones; (2) the fine-tuning of recent PLMs enhances the performance results of hate speech classification from Arabic tweets. The major contribution of this work lies in achieving promising accuracy results in the Arabic hate speech classification task.</abstract><cop>New York, NY</cop><pub>ACM</pub><doi>10.1145/3674970</doi><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 2375-4699
ispartof	ACM transactions on Asian and low-resource language information processing, 2024-06
issn	2375-4699 2375-4702
language	eng
recordid	cdi_crossref_primary_10_1145_3674970
source	Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
title	Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T05%3A01%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Systematic%20Investigation%20of%20Recent%20Pre-trained%20Language%20Model%20for%20Hate%20Speech%20Detection%20in%20Arabic%20Tweets&rft.jtitle=ACM%20transactions%20on%20Asian%20and%20low-resource%20language%20information%20processing&rft.au=Daouadi,%20Kheir%20Eddine&rft.date=2024-06-25&rft.issn=2375-4699&rft.eissn=2375-4702&rft_id=info:doi/10.1145/3674970&rft_dat=%3Cacm_cross%3E3674970%3C/acm_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true