
Using Language Models on Low-end Hardware

Bibliographic Details
Published in: arXiv.org, 2023-05
Main Authors: Ziegner, Fabian, Borst, Janos, Niekler, Andreas, Potthast, Martin
Format: Article
Language: English
Description
Summary: This paper evaluates the viability of using fixed language models for training text classification networks on low-end hardware. We combine language models with a CNN architecture and put together a comprehensive benchmark with 8 datasets covering single-label and multi-label classification of topic, sentiment, and genre. Our observations are distilled into a list of trade-offs, concluding that there are scenarios where not fine-tuning a language model yields competitive effectiveness with faster training, requiring only a quarter of the memory compared to fine-tuning.
ISSN: 2331-8422
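
The summary above describes classifying text by feeding the output of a fixed (frozen) language model into a CNN head, so that only the CNN is trained. Below is a minimal sketch of that setup, not the authors' code: the model name, kernel sizes, filter counts, and number of classes are illustrative assumptions.

```python
# Sketch: frozen language model embeddings feeding a small text CNN.
# Only the CNN head and classifier are trained; the LM stays fixed.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class FrozenLMTextCNN(nn.Module):
    def __init__(self, lm_name="distilbert-base-uncased", num_classes=4,
                 kernel_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        self.lm = AutoModel.from_pretrained(lm_name)
        # Freeze the language model: no gradients flow into it.
        for p in self.lm.parameters():
            p.requires_grad = False
        hidden = self.lm.config.hidden_size
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden, num_filters, k) for k in kernel_sizes
        )
        self.classifier = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, input_ids, attention_mask):
        with torch.no_grad():  # frozen LM: skip gradient bookkeeping entirely
            emb = self.lm(input_ids=input_ids,
                          attention_mask=attention_mask).last_hidden_state
        x = emb.transpose(1, 2)  # (batch, hidden_size, seq_len) for Conv1d
        pooled = [torch.relu(c(x)).max(dim=2).values for c in self.convs]
        return self.classifier(torch.cat(pooled, dim=1))  # class logits


tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = FrozenLMTextCNN()
batch = tokenizer(["a short example document"], return_tensors="pt",
                  padding=True, truncation=True)
logits = model(batch["input_ids"], batch["attention_mask"])
```

Because the language model is wrapped in `torch.no_grad()` and its parameters are excluded from the optimizer, only the convolutional filters and the linear classifier are updated, which is what keeps the training memory footprint small relative to full fine-tuning.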