
A Study of Mispronunciation Detection and Diagnosis Based on Meta-Learning


Bibliographic Details
Main Authors: Wan, Yukai; Shi, Yuqi; Lin, Binghuai; Xie, Yanlu
Format: Conference Proceeding
Language: English
Description
Summary: The majority of current mispronunciation detection and diagnosis (MD&D) methods rely on manually annotated data for model training. However, annotating mispronunciations produced by second language (L2) learners is costly. Consequently, data scarcity emerges as a significant challenge in MD&D tasks. In this paper, we employ model-agnostic meta-learning (MAML) to train a phoneme recognition model for MD&D. We conduct experiments using varied meta-learning task partitioning and training strategies to endow the model with the ability to rapidly adapt to unfamiliar speakers. Our best-performing method achieves an F-measure of 61.45%, surpassing both a method that fine-tunes the pre-trained wav2vec2.0 model and an approach that incorporates reference text during training, two related works that also aim to address data scarcity in MD&D. Notably, with few-shot fine-tuning, our model still yields remarkable F-measure results, which suggests that meta-learning is indeed effective in MD&D tasks.
ISSN: 2379-190X
DOI: 10.1109/ICASSP48485.2024.10447007