Constructing a Classification Scheme - and its Consequences: A Field Study of Learning to Label Data for Computer Vision in a Hospital Intensive Care Unit
Published in: | Proceedings of the ACM on human-computer interaction 2024-11, Vol.8 (CSCW2), p.1-29, Article 490 |
---|---|
Format: | Article |
Language: | English |
Summary: | Research on data annotation for artificial intelligence (AI) has demonstrated that biases, power, and culture impact the ways that annotators apply labels to data and subsequently affect downstream AI systems. However, annotators can only apply labels that are available to them in the annotation classification scheme. Drawing on a 3-year ethnographic study of an R&D collaboration between medical and AI researchers, we argue that the construction of the classification scheme itself (decisions about what kinds of data can and cannot be collected, what activities can and cannot be detected in the data, what the possible annotation classes ought to be, and the rules by which an item ought to be classified into each class) dramatically shapes the annotation process and, through it, the AI. We draw on Bowker and Star's [9] classification theory to detail how the creation of a training data codebook for a computer vision algorithm in hospital intensive care units (ICUs) evolved from its original, clinically driven goal of classifying complex clinical activities into a narrower goal of identifying physical objects and simpler activities in the ICU. This work reinforces how trade-offs and decisions made long before annotators begin labeling data are highly consequential to the resulting AI system. |
ISSN: | 2573-0142 |
DOI: | 10.1145/3687029 |