Offensive language in user-generated comments in Lithuanian

Bibliographic Details
Published in: Lodz Papers in Pragmatics, 2023-12, Vol. 19 (2), p. 239-254
Main Authors: Valūnaitė-Oleškevičienė, Giedrė, Selmistraitis, Linas, Utka, Andrius, Gudelis, Dangis
Format: Article
Language: English
Description
Summary: The aim of the current research is to investigate the feasibility of identifying offensive language in Lithuanian by utilising the Simplified Offensive Language Taxonomy (SOLT). The key principle behind this taxonomy is its ability to complement existing offensive language ontologies and tagset systems, with the ultimate goal of integrating it into publicly accessible Linguistic Linked Open Data (LLOD) resources. The dataset used in the current study is a publicly available corpus of user-generated comments collected from a Lithuanian portal (Amilevičius et al. 2016). The study found that offensive language predominantly targets groups rather than individuals. The most common category of offensive language is related to physical and mental disabilities, followed by ideological offenses, xenophobic and sexist remarks, and less frequent categories such as ageism, classism, homophobia, and religious discrimination. These results highlight the diverse range of offensive language online and underscore the need to combat discrimination and promote respectful discourse, particularly concerning marginalised groups.
ISSN: 1895-6106, 1898-4436
DOI: 10.1515/lpp-2023-0013