Improving Musical Tag Annotation with Stacking and Convolutional Neural Networks

Bibliographic Details
Main Authors: da Silva, Juliano Donini; da Costa, Yandre Maldonado Gomes; Domingues, Marcos Aurelio
Format: Conference Proceeding
Language: English
Description
Summary: Personalized music systems usually rely on manual song annotations (tags) as a mechanism for querying and navigating large music collections. However, manual annotation is a hard task given the large amount of music available today. Automatic song annotation based on content analysis is a potential solution to this problem and has recently been gaining attention. In this work, we propose extending the Stacking prediction framework to use Convolutional Neural Networks (CNNs) in order to improve the music tag annotation task. In general, Stacked prediction is a technique in which the output of the first learning stage is used as input to the second stage. We tried two extensions. The first uses the weights learned by the CNN in the first training stage as input to the second stage. The second applies an autoencoder, together with the weights learned by the CNN in the first stage, to generate images 50% smaller than those used as input to the first stage; the weights and the images obtained in the first stage are then used as input to the second stage. We evaluated our proposals with five different CNN models on three datasets well known in the literature (FMA, MillionSong, and MagnaTagATune), obtaining interesting results.
ISSN: 2157-8702
DOI: 10.1109/IWSSIP48289.2020.9145192
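
The generic two-stage idea mentioned in the summary (first-stage outputs fed as second-stage inputs) can be illustrated with a minimal sketch. This is not the authors' method: the CNNs, the reuse of learned weights, and the autoencoder step are not reproduced here. As an assumption made purely for illustration, each stage is a tiny logistic-regression tagger written in plain Python, and the names `train_logistic`, `fit_stacked`, and `predict_stacked` are hypothetical.

```python
import math


def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow on extreme scores.
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit one binary logistic-regression model with stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss w.r.t. the score
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_stacked(X, Y):
    """Train a two-stage tagger. Y[i][t] is 1 if song i carries tag t, else 0.

    Stand-in for the paper's CNN stages (assumption): one logistic model per tag.
    """
    n_tags = len(Y[0])
    # Stage 1: one model per tag, trained on the raw features.
    stage1 = [train_logistic(X, [row[t] for row in Y]) for t in range(n_tags)]
    # Stage-1 tag scores for every song become the stage-2 input features.
    Z = [[predict_proba(m, x) for m in stage1] for x in X]
    # Stage 2: one model per tag, trained on the stage-1 scores of all tags.
    stage2 = [train_logistic(Z, [row[t] for row in Y]) for t in range(n_tags)]
    return stage1, stage2


def predict_stacked(models, x):
    """Return one probability per tag for a single feature vector x."""
    stage1, stage2 = models
    z = [predict_proba(m, x) for m in stage1]
    return [predict_proba(m, z) for m in stage2]
```

A call such as `predict_stacked(fit_stacked(X, Y), x)` returns one probability per tag. Because the second stage sees the first-stage scores for all tags, it can exploit co-occurrence between tags. In practice, second-stage inputs are usually built from held-out (out-of-fold) predictions to limit leakage, which this sketch omits for brevity.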