Loading…

Understanding dance semantics using spatio-temporal features coupled GRU networks

[Display omitted] •Designed a framework for understanding dance semantics from live videos which can be used for real-time annotation.•Novel method is proposed to handle the spatio-temporal dynamics of the dance using key-points coupled with GRU networks.•A key point normalization method is proposed...

Full description

Saved in:
Bibliographic Details
Published in:Entertainment computing 2022-05, Vol.42, p.100484, Article 100484
Main Authors: S., Shailesh, M.V., Judy
Format: Article
Language:English
Subjects:
Citations: Items that this one cites
Items that cite this one
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:[Display omitted] •Designed a framework for understanding dance semantics from live videos which can be used for real-time annotation.•Novel method is proposed to handle the spatio-temporal dynamics of the dance using key-points coupled with GRU networks.•A key point normalization method is proposed to handle spatial dependency and translation variance.•Created a dance video repository by recording live performances of dancers and videos from the internet. The efforts are taken for cultural heritage preservation lead to many digitization initiatives. Indian Classical Dance and its tradition has proven historical importance. Computer-aided archiving and preservation of Indian classical dance resources open enormous opportunities for computational analysis. With the help of recent computational advancements in the field of Computer Vision, these archives can be transformed into intelligent information retrieval systems. In this work, we propose a novel method for understanding the dance semantics by making use of the Spatio-temporal variations of dance features. A video archive is created as a part of this work, from live recordings of different trained dancers and clippings from the internet. The Deep Pose Estimator coupled GRU Model deals with the spatial aspects with a deep learning pose estimator and handles the temporal perspective with GRU Network. The efficiency of the proposed method was compared with benchmark methods such as A 3D-Convolutional Neural Network-based Model, Time Distributed CNN-LSTM Model, and Hybrid Transfer Learning - LSTM Model and the results show the proposed method outperformed others even with different video resolutions.
ISSN:1875-9521
1875-953X
DOI:10.1016/j.entcom.2022.100484