A Study of Mispronunciation Detection and Diagnosis Based on Meta-Learning

Bibliographic Details
Main Authors:	Wan, Yukai; Shi, Yuqi; Lin, Binghuai; Xie, Yanlu
Format:	Conference Proceeding
Language:	English
Subjects:	Acoustics; Adaptation models; Data models; fast adaptation; Metalearning; mispronunciation detection and diagnosis; model-agnostic meta-learning; second language learner; Signal processing; Task analysis; Training

Description
Summary:	Most current mispronunciation detection and diagnosis (MD&D) methods rely on manually annotated data for model training. However, annotating mispronunciations produced by second language (L2) learners is costly, so data scarcity is a significant challenge in MD&D tasks. In this paper, we employ model-agnostic meta-learning (MAML) to train a phoneme recognition model for MD&D. We conduct experiments with varied meta-learning task partitioning and training strategies to give the model the ability to adapt rapidly to unfamiliar speakers. Our best-performing method achieves an F-measure of 61.45%, surpassing two related approaches that also aim to address data scarcity in MD&D: fine-tuning the pre-trained wav2vec2.0 model, and incorporating reference text during training. Notably, with few-shot fine-tuning our model still yields strong F-measure results, which suggests that meta-learning is effective in MD&D tasks.
ISSN:	2379-190X
DOI:	10.1109/ICASSP48485.2024.10447007
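
Illustration:	The summary names model-agnostic meta-learning (MAML) and fast adaptation to unfamiliar speakers without showing the training loop. The sketch below is a minimal, hypothetical illustration of that loop and is not the authors' implementation. It assumes a toy setting in which each "task" (standing in for a speaker) is a scalar regression y = a * x with its own slope a, and the model is y_hat = w * x. The names `sample_task`, `maml_train`, and `adapt`, and the values of `alpha` and `beta`, are assumptions made for this example only.

```python
import random


def grad(w, xs, ys):
    """Gradient of the mean squared error of y_hat = w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def hessian(xs):
    """Second derivative of the same loss with respect to w (does not depend on w)."""
    return sum(2.0 * x * x for x in xs) / len(xs)


def sample_task(rng, n=5):
    """Hypothetical task: a slope a plays the role of one speaker.

    Returns a (support, query) pair, each a tuple (xs, ys) drawn from y = a * x.
    """
    a = rng.uniform(0.5, 1.5)

    def draw():
        xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        return xs, [a * x for x in xs]

    return draw(), draw()


def maml_train(steps=500, tasks_per_step=4, alpha=0.1, beta=0.05, seed=0):
    """Meta-train the initial weight w with one inner gradient step per task."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for _ in range(tasks_per_step):
            (xs_s, ys_s), (xs_q, ys_q) = sample_task(rng)
            # Inner loop: adapt to the task using its support set.
            w_adapted = w - alpha * grad(w, xs_s, ys_s)
            # Outer gradient: differentiate the query loss through the inner
            # step. For this scalar model, d(w_adapted)/dw = 1 - alpha * hessian.
            meta_grad += grad(w_adapted, xs_q, ys_q) * (1.0 - alpha * hessian(xs_s))
        # Outer loop: update the shared initialization.
        w -= beta * meta_grad / tasks_per_step
    return w


def adapt(w, xs, ys, alpha=0.1, steps=1):
    """Few-shot fine-tuning: a few gradient steps on data from a new task."""
    for _ in range(steps):
        w -= alpha * grad(w, xs, ys)
    return w


if __name__ == "__main__":
    w0 = maml_train()
    # Adapt the meta-learned initialization to an unseen task with slope 1.4.
    w1 = adapt(w0, [1.0, -1.0], [1.4, -1.4])
    print("meta-learned init:", w0, "after one adaptation step:", w1)
```

In this toy setting the meta-learned initialization settles near the middle of the task distribution, so a single gradient step moves it toward any new slope. A real MD&D system would replace the scalar model with a phoneme recognizer and would typically rely on automatic differentiation rather than a hand-derived Hessian.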