Resources and benchmark corpora for hate speech detection: a systematic review

Bibliographic Details
Published in: Language Resources and Evaluation, 2021-06, Vol. 55 (2), p. 477-523
Main Authors: Poletto, Fabio; Basile, Valerio; Sanguinetti, Manuela; Bosco, Cristina; Patti, Viviana
Format: Article
Language: English
Description
Summary: Hate Speech in social media is a complex phenomenon, whose detection has recently gained significant traction in the Natural Language Processing community, as attested by several recent review works. Annotated corpora and benchmarks are key resources, considering the vast number of supervised approaches that have been proposed. Lexica play an important role as well for the development of hate speech detection systems. In this review, we systematically analyze the resources made available by the community at large, including their development methodology, topical focus, language coverage, and other factors. The results of our analysis highlight a heterogeneous, growing landscape, marked by several issues and venues for improvement.
ISSN: 1574-020X, 1574-0218
DOI: 10.1007/s10579-020-09502-8