
Saliency-driven unstructured acoustic scene classification using latent perceptual indexing

Bibliographic Details
Main Authors: Kalinli, O., Sundaram, S., Narayanan, S.
Format: Conference Proceeding
Language: English
Description
Summary: Automatic acoustic scene classification of real-life, complex, and unstructured acoustic scenes is a challenging task because the number of acoustic sources present in the audio stream is unknown and the sources overlap in time. In this work, we present a novel approach to classifying such unstructured acoustic scenes. Motivated by the bottom-up attention model of the human auditory system, salient events of an audio clip are extracted in an unsupervised manner and presented to the classification system. Similar to latent semantic indexing of text documents, the classification system uses a unit-document frequency measure to index the clip in a continuous, latent space. This allows for developing a completely class-independent approach to audio classification. Our results on the BBC sound effects library indicate that, using the saliency-driven attention selection approach presented in this paper, a 17.5% relative improvement can be obtained in frame-based classification and a 25% relative improvement using the latent audio indexing approach.
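
To make the indexing idea concrete, below is a minimal Python sketch of the pipeline the abstract describes: select salient frames, quantize them into acoustic "units", build a unit-document frequency matrix, and project clips into a continuous latent space with a truncated SVD, where a query clip is matched by cosine similarity. Everything here is an assumption for illustration: the synthetic features, the novelty-based saliency proxy (the paper uses a biologically inspired auditory attention model), the k-means vocabulary, and names such as n_units, select_salient, and unit_histogram are stand-ins, not the authors' implementation.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import normalize

rng = np.random.default_rng(0)

# Toy stand-in for per-frame features (e.g., MFCCs) from 20 training clips.
clips = [rng.normal(size=(rng.integers(50, 100), 13)) for _ in range(20)]

def select_salient(frames, keep=0.5):
    # Crude proxy for bottom-up attention: rank frames by frame-to-frame
    # spectral change and keep the most novel half. The paper uses a
    # biologically inspired auditory saliency model instead.
    novelty = np.linalg.norm(np.diff(frames, axis=0, prepend=frames[:1]), axis=1)
    k = max(1, int(keep * len(frames)))
    return frames[np.argsort(novelty)[-k:]]

salient = [select_salient(c) for c in clips]

# 1. Acoustic "vocabulary": quantize all salient frames into units,
#    the audio analogue of words in latent semantic indexing.
n_units = 64
kmeans = KMeans(n_clusters=n_units, n_init=4, random_state=0).fit(np.vstack(salient))

def unit_histogram(frames):
    # Unit-document frequency vector for one clip.
    vec = np.zeros(n_units)
    units, freq = np.unique(kmeans.predict(frames), return_counts=True)
    vec[units] = freq
    return vec

# 2. Unit-document matrix (clips x units); a truncated SVD then projects
#    each clip into a continuous latent perceptual space.
counts = np.array([unit_histogram(f) for f in salient])
svd = TruncatedSVD(n_components=8, random_state=0)
latent = normalize(svd.fit_transform(counts))  # unit-length rows

# 3. A query clip is indexed the same way and matched by cosine
#    similarity; its class can be read off its nearest indexed neighbors.
query_hist = unit_histogram(select_salient(rng.normal(size=(60, 13))))
query = normalize(svd.transform(query_hist[None, :]))[0]
print("nearest indexed clip:", int(np.argmax(latent @ query)))

As with latent semantic indexing of text, the SVD-based index is class-independent: new scene categories can be supported simply by indexing labeled clips, without training a per-class model.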
DOI: 10.1109/MMSP.2009.5293267