Speaker diarization method of telemarketer and client for improving speech dictation performance
Published in: | The Journal of supercomputing 2016-05, Vol.72 (5), p.1757-1769 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Summary: | Financial institutions employ speech dictation systems that convert recorded conversations between telemarketers and clients into text. Dictation is necessary for detecting incomplete sales, in which a telemarketer fails to provide important sales information to a client. However, manual dictation takes too much time and effort, so automatic speech dictation systems are being adopted as an alternative. We suggest that, in such an automatic speech dictation system, speaker diarization be performed prior to speech recognition. In this paper, we propose a diarization method based on pitch detection, which is well suited to the given condition, in which two speakers, a telemarketer and a client, converse in a telephone recording. We suggest a method based on an average short-time spectral feature and an unsupervised learning scheme. In the experiments, actual telephone recordings for insurance contracts were used. We obtained a Diarization Error Rate (DER) of about 6 % on average. |
ISSN: | 0920-8542 1573-0484 |
DOI: | 10.1007/s11227-015-1470-4 |
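
The summary reports results as a Diarization Error Rate (DER). As a rough illustration of what that figure measures, the sketch below computes a simplified frame-level DER for a two-speaker call: missed speech, false-alarm speech, and speaker confusion, divided by the number of reference speech frames. It is not the evaluation procedure used in the article, which the record does not describe. The function name `frame_der`, the label convention (0 for non-speech, positive integers for speakers), and the absence of a forgiveness collar or overlapped-speech handling are all assumptions made for this example.

```python
from itertools import permutations

NON_SPEECH = 0  # assumed label for silence / non-speech frames


def frame_der(reference, hypothesis):
    """Simplified frame-level DER (hypothetical helper, not from the article).

    DER = (missed + false alarm + speaker confusion) / reference speech frames.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("label sequences must have the same length")
    ref_speech = sum(1 for r in reference if r != NON_SPEECH)
    if ref_speech == 0:
        raise ValueError("reference contains no speech frames")

    pairs = list(zip(reference, hypothesis))

    # These two error types do not depend on how cluster labels are matched.
    missed = sum(1 for r, h in pairs if r != NON_SPEECH and h == NON_SPEECH)
    false_alarm = sum(1 for r, h in pairs if r == NON_SPEECH and h != NON_SPEECH)

    ref_ids = sorted({r for r in reference if r != NON_SPEECH})
    hyp_ids = sorted({h for h in hypothesis if h != NON_SPEECH})
    # Unsupervised clusters carry arbitrary labels, so try every assignment of
    # hypothesis clusters to reference speakers (None = left unmatched).
    targets = ref_ids + [None] * len(hyp_ids)

    best_confusion = None
    for perm in permutations(targets, len(hyp_ids)):
        mapping = dict(zip(hyp_ids, perm))
        confusion = sum(
            1 for r, h in pairs
            if r != NON_SPEECH and h != NON_SPEECH and mapping[h] != r
        )
        if best_confusion is None or confusion < best_confusion:
            best_confusion = confusion

    return (missed + false_alarm + best_confusion) / ref_speech


# Toy example: 1 = telemarketer, 2 = client in the reference; the hypothesis
# uses swapped cluster labels, confuses one frame, and misses one frame.
reference = [1, 1, 1, 1, 0, 2, 2, 2, 2, 2]
hypothesis = [2, 2, 2, 1, 0, 1, 1, 1, 1, 0]
print(round(frame_der(reference, hypothesis), 3))  # 0.222
```

In the toy example, two of the nine reference speech frames are in error (one missed, one confused), giving 2/9. Trying every label assignment is affordable here because a telemarketer and client call has only two speakers; scoring tools for many speakers typically use an optimal assignment algorithm instead.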