Automatic online text selection for constructing text corpus with custom phonetic distribution
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Summary: | The performance of Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) systems depends on an appropriate text corpus. This article describes an automated method for generating a text corpus with a custom phonetic distribution. The distribution is defined by the phoneme types, the corpus size, the minimum required number of phonemes, and the target phonetic distribution. The system gathers text from the Internet by continuously downloading it with a web crawler. A greedy algorithm then selects the sentences that best fit the target phonetic distribution until an appropriate text corpus is established. In the experiment, text from the Large Vocabulary Continuous Speech Recognition (LVCSR) corpus for the Thai language [1] is used to generate the target phonetic distribution. The results show that, as more data are drawn from the Internet, the selected corpus reaches the target phonetic distribution and achieves 99.13% diphone coverage. The resulting text corpus can then be used to build a speech corpus efficiently. |
---|---|
DOI: | 10.1109/JCSSE.2012.6261916 |
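
The greedy selection step mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state the distance measure or the stopping rule, so the sketch assumes an L1 distance between normalized phoneme distributions and stops once a fixed number of sentences has been chosen. The names `distribution`, `distance`, and `greedy_select` are hypothetical, candidate sentences are assumed to be already converted to phoneme lists, and the minimum phoneme count criterion is left out.

```python
from collections import Counter


def distribution(counts, phonemes):
    """Normalize raw phoneme counts into relative frequencies."""
    total = sum(counts[p] for p in phonemes)
    if total == 0:
        return {p: 0.0 for p in phonemes}
    return {p: counts[p] / total for p in phonemes}


def distance(counts, target, phonemes):
    """L1 distance to the target distribution (an assumed measure)."""
    current = distribution(counts, phonemes)
    return sum(abs(current[p] - target[p]) for p in phonemes)


def greedy_select(candidates, target, corpus_size):
    """Pick sentences one at a time, always taking the candidate that
    brings the running phoneme distribution closest to the target.

    candidates: list of (sentence, phoneme_list) pairs
    target: dict mapping each phoneme to its desired relative frequency
    corpus_size: number of sentences to select (assumed stopping rule)
    """
    phonemes = list(target)
    remaining = list(candidates)
    selected = []
    counts = Counter()
    while remaining and len(selected) < corpus_size:
        # Score every remaining candidate by the distance that would
        # result from adding it to the sentences selected so far.
        best = min(
            range(len(remaining)),
            key=lambda i: distance(
                counts + Counter(remaining[i][1]), target, phonemes
            ),
        )
        sentence, phones = remaining.pop(best)
        selected.append(sentence)
        counts.update(phones)
    return selected, counts
```

In a crawler-driven setting such as the one the summary describes, `candidates` would be refilled as new pages are downloaded, and the loop would continue until the distance falls below a chosen threshold rather than stopping at a fixed size; both of those details are assumptions here.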