Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets

Hate speech classification in Arabic tweets has attracted significant interest among researchers worldwide. Various techniques and systems have been applied to this classification task. However, two main challenges remain: the reliance on handcrafted features and the fact that their per...

Bibliographic Details
Published in: ACM transactions on Asian and low-resource language information processing 2024-06
Main Authors: Daouadi, Kheir Eddine; Boualleg, Yaakoub; Guehairia, Oussama
Format: Article
Language: English
cites cdi_FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3
container_title ACM transactions on Asian and low-resource language information processing
creator Daouadi, Kheir Eddine
Boualleg, Yaakoub
Guehairia, Oussama
description Hate speech classification in Arabic tweets has attracted significant interest among researchers worldwide. Various techniques and systems have been applied to this classification task. However, two main challenges remain: the reliance on handcrafted features and the still-limited performance of existing approaches. We address hate speech identification in Arabic tweets while providing a deeper understanding of the capability of a recent technique based on transfer learning. Specifically, the accuracy of traditional machine learning (ML) models is compared with that of Pre-trained Language Models (PLMs) and Deep Learning (DL) models. Experiments on a benchmark dataset show that (1) multidialectal PLMs outperform monolingual and multilingual ones; and (2) fine-tuning recent PLMs improves hate speech classification performance on Arabic tweets. The major contribution of this work lies in achieving promising accuracy on the Arabic hate speech classification task.
doi_str_mv 10.1145/3674970
format article
publisher New York, NY: ACM
date 2024-06-25
eissn 2375-4702
addtitle ACM TALLIP
rights Copyright held by the owner/author(s). Publication rights licensed to ACM.
lds50 peer_reviewed
oa free_for_read
fulltext fulltext
identifier ISSN: 2375-4699
ispartof ACM transactions on Asian and low-resource language information processing, 2024-06
issn 2375-4699
2375-4702
language eng
recordid cdi_crossref_primary_10_1145_3674970
source Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list)
title Systematic Investigation of Recent Pre-trained Language Model for Hate Speech Detection in Arabic Tweets
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-08T05%3A01%3A42IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-acm_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Systematic%20Investigation%20of%20Recent%20Pre-trained%20Language%20Model%20for%20Hate%20Speech%20Detection%20in%20Arabic%20Tweets&rft.jtitle=ACM%20transactions%20on%20Asian%20and%20low-resource%20language%20information%20processing&rft.au=Daouadi,%20Kheir%20Eddine&rft.date=2024-06-25&rft.issn=2375-4699&rft.eissn=2375-4702&rft_id=info:doi/10.1145/3674970&rft_dat=%3Cacm_cross%3E3674970%3C/acm_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a840-30bd36a210c179ebbc933faeae31fecda35b8c02e15de60a6394cf4e72ebd3de3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rfr_iscdi=true