Loading…
GAN acoustic model for Kazakh speech synthesis
Recent studies on the application of generative adversarial networks (GAN) for speech synthesis have shown improvements in the naturalness of synthesized speech, compared to the conventional approaches. In this article, we present a new framework of GAN to train an acoustic model for speech synthesi...
Saved in:
Published in: | International journal of speech technology 2021-09, Vol.24 (3), p.729-735 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Recent studies on the application of generative adversarial networks (GAN) for speech synthesis have shown improvements in the naturalness of synthesized speech, compared to the conventional approaches. In this article, we present a new framework of GAN to train an acoustic model for speech synthesis. The proposed GAN consists of a generator and a pair of agent discriminators, where the generator produces acoustic parameters taking into account linguistic parameters; and the pair of agent discriminators are introduced to improve the naturalness of the synthesized speech. We feed the agents with acoustic and linguistic parameters, thereby the agents do not only examine the acoustic distribution, but also the relationship between linguistic and acoustic parameters. Training and testing were conducted on the Kazakh speech corpus. According to the results of this research, the proposed framework of GAN improves the accuracy of the acoustic model for the Kazakh text-to-speech system. |
---|---|
ISSN: | 1381-2416 1572-8110 |
DOI: | 10.1007/s10772-021-09840-0 |