An Analysis of Human Judgements on Semantic Classification of Catalan Adjectives

Bibliographic Details
Published in: Research on Language and Computation, 2008-12, Vol. 6 (3-4), p. 247-271
Main Authors: Boleda, Gemma; Schulte im Walde, Sabine; Badia, Toni
Format: Article
Language: English
Description
Summary: This article reports on a large-scale experiment for gathering human judgements with respect to a semantic classification of Catalan adjectives. The goal of our experiment was to classify 210 Catalan adjectives as basic, event-related, or object-related adjectives, allowing for multiple class assignments to account for polysemy. The experiment was directed at non-expert native speakers and administered via the Web, collecting data from 322 participants. We assess the degree of inter-annotator agreement through an innovative methodology based on observed agreement and kappa, and use weighted versions of these measures to account for partial agreement in polysemous assignments. Because the obtained scores (kappa 0.20–0.34) are too low to establish a reliably labelled dataset, we then perform a series of post-hoc analyses on the human judgements to investigate the sources of disagreement, by comparing the participants’ classifications with a classification obtained from experts. Our analysis shows that polysemous items and event-related adjectives are more problematic than other types of adjectives. Furthermore, the analysis helps to distinguish disagreement caused by the task as opposed to that caused by the experimental design, thus pointing to specific difficulties in both aspects of the research. The methodology developed for this analysis might therefore prove useful for the design of experiments for related tasks.
ISSN: 1570-7075, 1572-8706
DOI: 10.1007/s11168-008-9056-4