Re-ranking spoken term detection with acoustic exemplars of keywords


Bibliographic Details
Published in: Speech Communication, 2018-11, Vol. 104, p. 12-23
Main Authors: Pham, Van Tung; Xu, Haihua; Xiao, Xiong; Chen, Nancy F.; Chng, Eng Siong; Li, Haizhou
Format: Article
Language: English
Description
Summary: Spoken term detection (STD) systems rank hypothesized detections by scores that indicate how confident the system is that a hypothesized detection is a true instance of the keyword. Many STD systems rely on automatic speech recognition (ASR) to transcribe the speech content into a lattice representation. In such STD systems, the detection scores are usually estimated as the posterior probabilities of the keyword in the decoding lattices. Such scores may be inaccurate, e.g. due to imperfect modeling of speech and noise. To improve the ranking of hypothesized detections, we propose to directly use the acoustic similarity scores between the speech signal of hypothesized detections and that of keyword exemplars. A keyword exemplar is a true instance of the keyword obtained from an annotated speech corpus. When no exemplar is available, we propose to synthesize exemplars from the annotated speech corpus. Given the acoustic similarity between the hypothesized detections and keyword exemplars, two re-ranking methods are proposed: re-ranking by score fusion and re-ranking by similarity graph. Experimental results on the NIST OpenKWS14 and OpenKWS15 datasets show that the proposed re-ranking framework significantly outperforms ranking based only on ASR confidence scores, as well as other re-ranking methods that do not use keyword exemplars.
ISSN: 0167-6393, 1872-7182
DOI: 10.1016/j.specom.2018.09.004
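
Illustration: the score-fusion idea in the summary can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' method: the summary does not say how acoustic similarity is computed or how scores are fused, so the dynamic time warping (DTW) distance, the exp(-distance) mapping, the linear interpolation, and the names `dtw_distance`, `similarity`, `fuse`, `rerank`, and `weight` are all hypothetical choices made for the example. The similarity-graph method is not covered.

```python
import math

def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors.
    Assumption: DTW with Euclidean local cost stands in for the acoustic measure."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame-to-frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)

def similarity(detection, exemplars):
    """Map the smallest DTW distance over all exemplars into (0, 1].
    Assumption: exp(-distance) is an arbitrary monotone mapping."""
    best = min(dtw_distance(detection, ex) for ex in exemplars)
    return math.exp(-best)

def fuse(asr_score, sim, weight=0.5):
    """Hypothetical fusion rule: linear interpolation of ASR posterior and similarity."""
    return (1.0 - weight) * asr_score + weight * sim

def rerank(detections, exemplars, weight=0.5):
    """detections: list of (detection_id, asr_score, features).
    Returns detection ids sorted by fused score, highest first."""
    scored = [(fuse(score, similarity(feats, exemplars), weight), det_id)
              for det_id, score, feats in detections]
    return [det_id for _, det_id in sorted(scored, reverse=True)]

# Toy data: one exemplar, two hypothesized detections with 1-D "features".
exemplars = [[[0.0], [1.0], [2.0]]]
detections = [
    ("A", 0.5, [[0.0], [1.0], [2.0]]),  # acoustically identical to the exemplar
    ("B", 0.6, [[5.0], [5.0], [5.0]]),  # higher ASR score, poor acoustic match
]
ranking_fused = rerank(detections, exemplars, weight=0.5)
ranking_asr_only = rerank(detections, exemplars, weight=0.0)
```

With `weight=0.0` the ranking falls back to the ASR scores alone; raising `weight` lets a detection that closely matches an exemplar overtake one with a higher ASR score, which is the intuition behind exemplar-based re-ranking described in the summary.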