CSTD-Telugu Corpus: Crowd-Sourced Approach for Large-Scale Speech data collection

Bibliographic Details
Main Authors: Mirishkar, Ganesh S; V, Vishnu Vidyadhara Raju; Naroju, Meher Dinesh; Maity, Sudhamay; Yalla, Prakash; Vuppala, Anil Kumar
Format: Conference Proceeding
Language: English
Description
Summary: Speech is a natural mode of communication among all beings. India is a densely populated country, and people are diverse throughout the globe. The spoken language is the medium of instruction to interact among the people. The majority of Indian languages are spoken globally. The unavailability of larger volumes of transcribed and annotated speech data is often a hurdle for building reliable automatic speech recognition (ASR) systems for Indian languages. Crowdsourcing strategies are effective in collaboratively collecting speech data resources. This paper describes the experience of large-scale speech data collection for the Telugu language through mobile and web-based applications. With this crowd-contributed speech, the performance of the baseline ASR system is shown for clean speech. ASR performance for pink and white noises is also compared for various deep neural network (DNN) based acoustic models. The details regarding the usage of frameworks and their challenges during their implementation are part of this paper. The framework adopted for collecting the speech data is rapid, cost-saving, and offers the advantage of extending it to all the other Indian languages.
ISSN: 2640-0103