Twin identification from speech: linear and non-linear cepstral features and models
Published in: | International journal of speech technology 2020-03, Vol.23 (1), p.183-189 |
---|---|
Main Authors: | , , |
Format: | Article |
Language: | English |
Summary: | Identifying a speaker from speech using computer-based learning techniques has been an active research topic for many years. Identifying a twin from visual features is challenging, particularly within identical twin pairs, because the twins look alike and share manners, habits and so on. Speech can be used as a biometric for twin identification, given adequate training, because the twins' vocal tract structures would differ in producing speech sounds. This work highlights the use of linear and non-linear frequency-based cepstral features and modelling methods to estimate the performance of a twin identification system. The set of speech utterances is separated into two sets, training and testing. The training utterances are concatenated, and the proposed features are extracted after conventional pre-processing using appropriate feature extraction techniques. The feature vectors are normalized and passed to the modelling techniques to create templates. The proposed features extracted from the test utterances are applied to the models and, based on the classification criteria, the twin is identified and the accuracy is categorized as sub-optimal and true success rates. Perceptual features with filters spaced on the Equivalent Rectangular Bandwidth (ERB) scale for the critical bands provide better accuracy for both individual classification and decision-level fusion classification. |
---|---|
ISSN: | 1381-2416 1572-8110 |
DOI: | 10.1007/s10772-020-09668-0 |
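
The Summary refers to filters spaced on the Equivalent Rectangular Bandwidth (ERB) scale, which is a non-linear frequency scale. As a rough illustration only, the sketch below computes ERB-spaced filter centre frequencies using the widely cited Glasberg and Moore approximation. This record does not state which ERB formula, frequency range, or number of filters the authors used, so the function names, the 100 Hz to 4000 Hz range, and the count of 20 filters are all assumptions made for the example.

```python
import math


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg and Moore approximation)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb):
    """Inverse of hz_to_erb_rate: map an ERB-rate value back to Hz."""
    return (10.0 ** (erb / 21.4) - 1.0) / 0.00437


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz of the auditory filter centred at f_hz."""
    return 24.7 * (0.00437 * f_hz + 1.0)


def erb_center_frequencies(f_min, f_max, n_filters):
    """Return n_filters centre frequencies equally spaced on the ERB-rate scale.

    Hypothetical helper: the range and filter count are illustrative and are
    not taken from the article.
    """
    if n_filters < 2:
        raise ValueError("n_filters must be at least 2")
    lo, hi = hz_to_erb_rate(f_min), hz_to_erb_rate(f_max)
    step = (hi - lo) / (n_filters - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_filters)]


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the article).
    for fc in erb_center_frequencies(100.0, 4000.0, 20):
        print(f"centre = {fc:8.1f} Hz, ERB = {erb_bandwidth(fc):6.1f} Hz")
```

Because the spacing is uniform on the ERB-rate scale rather than in Hz, the centre frequencies sit close together at low frequencies and spread out at high frequencies. Cepstral features in this family are typically obtained by taking the log energies of such a filter bank and applying a discrete cosine transform, though the exact steps used in the article are not given in this record.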