
Visual speech recognition: a solution from feature extraction to words classification


Bibliographic Details
Main Authors: Da Silveira, L.G., Facon, J., Borges, D.L.
Format: Conference Proceeding
Language: English
Description
Summary: Audio-visual speech recognition has been an active area of research lately. A large and still unsolved part of this problem is visual-only recognition, or lip reading. Given an image sequence of a person pronouncing a word, a full image analysis solution would have to segment the mouth area, extract relevant features, and classify the word from those visual features. We approach this problem by proposing a segmentation technique for the lip contours, together with a set of features based on the extracted contours, which is able to perform lip reading with promising results. We collected visual speech sequences in our lab and show results for a set of ten words in Brazilian Portuguese, spoken by different speakers in more than 150 samples. The approach can be extended and applied to other spoken languages as well.
ISSN: 1530-1834, 2377-5416
DOI: 10.1109/SIBGRA.2003.1241036