Development of context dependent sequential K-nearest neighbor classifier for usable speech classification

Bibliographic Details
Published in: The Journal of the Acoustical Society of America, 2004-05, Vol. 115 (5_Supplement), p. 2427
Main Authors: Shah, J. K., Iyer, A. N., Smolenski, B. Y., Yantorno, R. E.
Format: Article
Language: English
Description
Summary: The accuracy of speech processing applications degrades when operating in a co-channel environment. Co-channel speech occurs when more than one person is talking at the same time. The idea of usable speech segmentation is to identify and extract those portions of co-channel speech that are minimally degraded but still useful for speech processing applications (such as speaker identification or speech recognition) that do not work in co-channel environments. Usable speech measures are features extracted from the co-channel signal to distinguish between usable and unusable speech. Several usable speech extraction methods have recently been developed, each based on a single feature of the speech signal. In this paper, however, a new usable speech extraction technique is investigated, which sequentially and contextually selects several features of the given signal using the K-nearest neighbor classifier. This new approach considers periodicity-based and structure-based features simultaneously in order to achieve the maximum classification rate and, by observing all incoming frames, avoids the problem of deciding how much data is needed to make accurate decisions. Speech processing applications can achieve 100% accuracy when operating on the extracted usable speech segments.
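The abstract does not give implementation details, but the frame-wise K-nearest-neighbor idea can be sketched as follows. This is a minimal illustration, assuming 8 kHz speech and 20 ms frames; the periodicity feature (autocorrelation peak in the pitch range), the spectral-flatness "structure" feature, the function names, and the plain majority-vote KNN are all hypothetical stand-ins, since the paper's actual usable speech measures and its context-dependent sequential feature selection are not specified in the abstract.

import numpy as np

def frame_features(frame, sr=8000):
    """Two illustrative features per frame: a periodicity score
    (normalized autocorrelation peak in a 60-400 Hz pitch range) and a
    structure score derived from spectral flatness. Both are stand-ins
    for the paper's actual usable speech measures."""
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    ac = ac / (ac[0] + 1e-12)                # normalize by lag-0 energy
    lo, hi = sr // 400, sr // 60             # lag range for 60-400 Hz pitch
    periodicity = ac[lo:hi].max()
    spec = np.abs(np.fft.rfft(frame)) + 1e-12
    flatness = np.exp(np.log(spec).mean()) / spec.mean()
    return np.array([periodicity, 1.0 - flatness])

def knn_classify(train_X, train_y, x, k=5):
    """Plain K-nearest-neighbor majority vote in feature space;
    labels are 1 = usable, 0 = unusable."""
    d = np.linalg.norm(train_X - x, axis=1)
    votes = train_y[np.argsort(d)[:k]]
    return int(votes.sum() * 2 > k)

def segment_usable(signal, train_X, train_y, frame_len=160):
    """Label every incoming frame (20 ms at 8 kHz), mirroring the
    'observe all incoming frames' idea; contiguous frames labeled 1
    form the extracted usable speech segments."""
    n = len(signal) // frame_len
    return [knn_classify(train_X, train_y,
                         frame_features(signal[i * frame_len:(i + 1) * frame_len]))
            for i in range(n)]

# Example with synthetic training data (purely illustrative):
# rng = np.random.default_rng(0)
# train_X = rng.random((100, 2)); train_y = (train_X[:, 0] > 0.5).astype(int)
# labels = segment_usable(rng.standard_normal(8000), train_X, train_y)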
ISSN: 0001-4966, 1520-8524
DOI: 10.1121/1.1669313