
Multi-Sample Subband WaveRNN via Multivariate Gaussian


Bibliographic Details
Main Authors: Kanagawa, Hiroki; Ijima, Yusuke
Format: Conference Proceeding
Language: English
Description
Summary: This paper proposes a high-speed neural vocoder for CPU implementation. Two approaches to speeding up autoregressive neural vocoders have been proposed: 1) simultaneous generation of multiple samples and 2) subband signal-based vocoding; so far they have been employed independently. Our neural vocoder is extremely fast because it generates multiple samples of subband signals simultaneously. Although the subband signals are associated with one another, the conventional subband-based vocoder can degrade quality because each subband signal is generated from an independent probability distribution. To overcome this problem, we also introduce waveform generation that takes the association between subbands into account by employing a multivariate Gaussian. Experiments show that 1) the proposed method is 1.81 times as fast as the conventional subband WaveRNN on a single-threaded CPU, and 2) it outperforms the conventional method in a subjective evaluation of naturalness, achieving a mean opinion score (MOS) of 4.08 on a text-to-speech task.
ISSN: 2379-190X
DOI: 10.1109/ICASSP43922.2022.9747898
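
The summary contrasts drawing each subband sample from an independent distribution with drawing all subbands jointly from a multivariate Gaussian. The sketch below illustrates only that statistical difference and is not the authors' implementation. The four-band setup, the covariance values, the Cholesky-factor parameterization, and the function names are all assumptions made for illustration; the record does not say how the paper's network parameterizes the covariance.

```python
import numpy as np


def sample_independent(mean, std, n, rng):
    """Draw n frames where every subband has its own univariate Gaussian."""
    eps = rng.standard_normal((n, mean.shape[0]))
    return mean + eps * std


def sample_joint(mean, chol_lower, n, rng):
    """Draw n frames from N(mean, L @ L.T), so subbands are sampled jointly.

    chol_lower is a lower-triangular factor L of the covariance matrix
    (an assumed parameterization, chosen because it keeps the covariance valid).
    """
    eps = rng.standard_normal((n, mean.shape[0]))
    return mean + eps @ chol_lower.T


rng = np.random.default_rng(0)
num_bands = 4  # hypothetical number of subbands
mean = np.zeros(num_bands)

# Hypothetical covariance: neighbouring subbands are positively correlated.
cov = np.array([
    [1.0, 0.6, 0.2, 0.0],
    [0.6, 1.0, 0.6, 0.2],
    [0.2, 0.6, 1.0, 0.6],
    [0.0, 0.2, 0.6, 1.0],
])
chol = np.linalg.cholesky(cov)

n_frames = 20000
joint = sample_joint(mean, chol, n_frames, rng)
indep = sample_independent(mean, np.sqrt(np.diag(cov)), n_frames, rng)

# Empirical covariance across subbands for each sampling scheme.
joint_cov = np.cov(joint, rowvar=False)
indep_cov = np.cov(indep, rowvar=False)

print("joint sampling, cov(band 0, band 1):", round(joint_cov[0, 1], 2))
print("independent sampling, cov(band 0, band 1):", round(indep_cov[0, 1], 2))
```

Under these assumptions, the jointly sampled frames reproduce the cross-band covariance, while the independently sampled frames have matching per-band variances but cross-band covariance near zero, which is the loss of inter-band association the summary describes.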