Loading…

Back to the Future: On Potential Histories in NLP

Machine learning and NLP require the construction of datasets to train and fine-tune models. In this context, previous work has demonstrated the sensitivity of these data sets. For instance, potential societal biases in this data are likely to be encoded and to be amplified in the models we deploy....

Full description

Saved in:

Bibliographic Details
Published in:	arXiv.org 2022-10
Main Authors:	Zeerak Talat, Lauscher, Anne
Format:	Article
Language:	English
Subjects:	Datasets Machine learning
Online Access:	Get full text
Tags:	Add Tag No Tags, Be the first to tag this record!

Description
Summary:	Machine learning and NLP require the construction of datasets to train and fine-tune models. In this context, previous work has demonstrated the sensitivity of these data sets. For instance, potential societal biases in this data are likely to be encoded and to be amplified in the models we deploy. In this work, we draw from developments in the field of history and take a novel perspective on these problems: considering datasets and models through the lens of historical fiction surfaces their political nature, and affords re-configuring how we view the past, such that marginalized discourses are surfaced. Building on such insights, we argue that contemporary methods for machine learning are prejudiced towards dominant and hegemonic histories. Employing the example of neopronouns, we show that by surfacing marginalized histories within contemporary conditions, we can create models that better represent the lived realities of traditionally marginalized and excluded communities.
ISSN:	2331-8422