On-Device Personalization of Automatic Speech Recognition Models for Disordered Speech

Bibliographic Details
Published in:	arXiv.org 2021-06
Main Authors:	Tomanek, Katrin; Beaufays, Françoise; Cattiau, Julie; Chandorkar, Angad; Sim, Khe Chai
Format:	Article
Language:	English
Subjects:	Automatic speech recognition; Copying; Customization; Electronic devices; Performance degradation; Speech; Voice communication; Voice control; Voice recognition

Description
Summary:	While current state-of-the-art Automatic Speech Recognition (ASR) systems achieve high accuracy on typical speech, their performance degrades significantly on disordered speech and other atypical speech patterns. Personalization of ASR models, a commonly applied solution to this problem, is usually performed in a server-based training environment, which poses problems around data privacy, delayed model-update times, and the communication cost of copying data and models between the mobile device and server infrastructure. In this paper, we present an approach to on-device ASR personalization with very small amounts of speaker-specific data. We test our approach on a diverse set of 100 speakers with disordered speech and find a median relative word error rate improvement of 71% with only 50 short utterances required per speaker. When tested on a voice-controlled home automation platform, on-device personalized models show a median task success rate of 81%, compared to only 40% for the unadapted models.
ISSN:	2331-8422
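
The summary reports a median relative word error rate (WER) improvement across speakers. As a reading aid, the following is a minimal sketch of how such a figure is commonly computed, assuming the usual definitions: WER as word-level edit distance divided by reference length, and relative improvement as (baseline WER - personalized WER) / baseline WER. The record does not state the paper's exact formula, so the function names and the sample values below are hypothetical illustrations, not data or code from the paper.

```python
from statistics import median


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, personalized_wer: float) -> float:
    """Assumed definition: fraction of the baseline WER that was removed."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - personalized_wer) / baseline_wer


def median_relative_improvement(wer_pairs):
    """Median over speakers of (baseline, personalized) WER pairs."""
    return median(relative_improvement(b, p) for b, p in wer_pairs)


if __name__ == "__main__":
    # Hypothetical per-speaker WERs (baseline, personalized); not from the paper.
    speakers = [(0.5, 0.25), (0.5, 0.125), (1.0, 0.25)]
    print(f"{median_relative_improvement(speakers):.0%}")
```

Under these assumptions, a speaker whose WER drops from 0.5 to 0.125 shows a 75% relative improvement. Taking the median rather than the mean keeps a few speakers with unusually large or small gains from dominating the summary figure.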