Data collection and evaluation of AURORA-2 Japanese corpus [speech recognition applications]

Bibliographic Details
Main Authors: Nakamura, S., Yamamoto, K., Takeda, K., Kuroiwa, S., Kitaoka, N., Yamada, T., Mizumachi, M., Nishiura, T., Fujimoto, M., Saso, A., Endo, T.
Format: Conference Proceeding
Language: English
Description
Summary: Speech recognition systems still need improvement when they are exposed to noisy environments. For this improvement, the development of standard evaluation corpora and assessment technologies is essential. Recently, the AURORA-2 and AURORA-3 corpora and their evaluation scenarios have had a significant impact on noisy speech recognition research. This paper introduces a Japanese noisy speech corpus and its evaluation scripts, called AURORA-2J. AURORA-2J is a Japanese connected digits corpus. The data collection and evaluation scenarios are designed in the same way as AURORA-2, with the help of the ETSI AURORA group. Furthermore, we have collected an in-car speech corpus similar to AURORA-3. The in-car speech corpus includes Japanese connected digits and command words collected in a moving car. This paper describes the data collection, the baseline scripts, and the baseline performance.
DOI: 10.1109/ASRU.2003.1318511