
NECOS: An annotated corpus to identify constructive news comments in Spanish

Bibliographic Details
Published in: Procesamiento del Lenguaje Natural, 2021-03, Vol. 66, p. 41
Main Authors: López-Úbeda, Pilar; Plaza-del-Arco, Flor Miriam; Díaz-Galiano, Manuel Carlos; Martín-Valdivia, M. Teresa
Format: Article
Language: English
Description
Summary: In this paper, we present the NEws and COmments in Spanish (NECOS) corpus, a collection of Spanish comments posted in response to newspaper articles. The articles were published in the newspaper El Mundo between April 3rd and April 30th, 2018. The corpus comprises 10 news articles and 1,419 comments. Following a robust annotation scheme, three annotators manually labeled the comments as constructive or non-constructive, with an average Cohen's kappa of 78.97. Our current focus is the study of constructiveness and the evaluation of the Spanish NECOS corpus. To this end, we propose a benchmark that tests different machine learning systems based on Natural Language Processing: a traditional system and novel Transformer-based models. Specifically, we compare multilingual models with a monolingual model trained on Spanish in order to highlight the need for resources trained on a specific language. The monolingual model fine-tuned on NECOS obtains the best result, achieving a macro-averaged F1 score of 77.24%.
ISSN: 1135-5948, 1989-7553
DOI: 10.26342/2021-66-3
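
The summary reports two figures: an average Cohen's kappa over three annotators and a macro-averaged F1 score. The following is a minimal sketch of how such figures are commonly computed, not the authors' evaluation code. It assumes the average kappa is the mean over all annotator pairs, which the summary does not state. The function names and the toy labels (1 for constructive, 0 for non-constructive) are hypothetical and are not taken from NECOS.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of each annotator's label proportions.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def average_pairwise_kappa(annotations):
    """Mean kappa over all annotator pairs (an assumed averaging scheme)."""
    pairs = list(combinations(annotations, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def macro_f1(gold, pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(gold) | set(pred)):
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only (hypothetical, not corpus data).
ann1 = [1, 1, 0, 0, 1, 0]
ann2 = [1, 1, 0, 0, 0, 0]
ann3 = [1, 0, 0, 0, 1, 0]

print(f"average pairwise kappa: {average_pairwise_kappa([ann1, ann2, ann3]):.4f}")
print(f"macro F1 (ann2 scored against ann1): {macro_f1(ann1, ann2):.4f}")
```

Macro averaging gives both classes equal weight regardless of how many comments fall in each, which matters when the constructive and non-constructive classes are unevenly sized.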