Class-GE2E: Speaker Verification Using Self-Attention and Transfer Learning with Loss Combination
Published in: Electronics (Basel) 2022-03, Vol. 11(6), p. 893
Main Authors: ,
Format: Article
Language: English
Subjects:
Summary: Recent studies show that speaker verification performance improves when an attention mechanism is employed instead of temporal or statistical pooling techniques. This paper proposes an advanced multi-head attention method that utilizes a sorted vector of the frame-level features to capture higher correlations among them. The study also proposes a transfer learning scheme that maximizes the effectiveness of two loss functions, the classifier-based cross-entropy loss and the metric-based GE2E loss, in learning the distance between embeddings. The sorted multi-head attention (SMHA) method outperforms conventional attention methods, achieving a 4.55% equal error rate (EER). The proposed transfer learning scheme with the Class-GE2E loss function significantly improved the attention-based systems; in particular, the EER of the SMHA decreased to 4.39% when transfer learning with the Class-GE2E loss was employed. The experimental results demonstrate that incorporating greater correlation between frame-level features into multi-head attention processing, and combining the two loss functions through transfer learning, is highly effective for improving speaker verification performance.
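The record does not include the paper's exact formulation, but the Class-GE2E idea, combining a classification cross-entropy loss with the metric-based GE2E softmax loss over speaker embeddings, can be sketched in NumPy. Everything below (function names, the `alpha` weighting, and the `w`/`b` similarity scaling) is an illustrative assumption, not the authors' implementation:

```python
import numpy as np

def ge2e_softmax_loss(emb, w=10.0, b=-5.0):
    """GE2E-style softmax loss for emb of shape (N speakers, M utterances, D).
    Embeddings are L2-normalized; the centroid of the true speaker excludes
    the query utterance (leave-one-out), as in the original GE2E loss."""
    N, M, _ = emb.shape
    emb = emb / np.linalg.norm(emb, axis=-1, keepdims=True)
    centroids = emb.mean(axis=1)
    centroids /= np.linalg.norm(centroids, axis=-1, keepdims=True)
    total = 0.0
    for j in range(N):
        for i in range(M):
            # leave-one-out centroid for the query's own speaker
            c_self = (emb[j].sum(axis=0) - emb[j, i]) / (M - 1)
            c_self /= np.linalg.norm(c_self)
            cos = emb[j, i] @ centroids.T          # cosine to each centroid
            cos[j] = emb[j, i] @ c_self
            s = w * cos + b                        # scaled similarity scores
            total += -s[j] + np.log(np.exp(s).sum())  # softmax loss term
    return total / (N * M)

def cross_entropy_loss(logits, labels):
    """Standard speaker-classification cross entropy (numerically stable)."""
    z = logits - logits.max(axis=1, keepdims=True)
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def class_ge2e_loss(emb, logits, labels, alpha=0.5):
    """Hypothetical weighted combination of the two objectives."""
    return alpha * cross_entropy_loss(logits, labels) \
        + (1.0 - alpha) * ge2e_softmax_loss(emb)
```

Both component losses are non-negative, so the combined objective is as well; in the paper's transfer learning scheme, the combination is applied when fine-tuning a network pre-trained with the classification loss alone.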
ISSN: 2079-9292
DOI: 10.3390/electronics11060893