Loading…
Human action recognition using short-time motion energy template images and PCANet features
Human action recognition has received significant attention because of its wide applications in human–machine interaction, visual surveillance, and video indexing. Recent progress in deep learning algorithms and convolutional neural networks (CNNs) significantly affects the performance of many actio...
Saved in:
Published in: | Neural computing & applications 2020-08, Vol.32 (16), p.12561-12574 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Human action recognition has received significant attention because of its wide applications in human–machine interaction, visual surveillance, and video indexing. Recent progress in deep learning algorithms and convolutional neural networks (CNNs) significantly affects the performance of many action recognition systems. However, CNNs are designed to learn features from 2D spatial images, while action videos are considered as 3D spatiotemporal signals. In addition, the complex structure of most deep networks and their dependency on backpropagation learning algorithm have a negative impact on the efficiency of real-time human action recognition systems. To avoid these limitations, we propose a new human action recognition method based on principal component analysis network (PCANet) which is a simple architecture of CNN exploiting unsupervised learning instead of the commonly used supervised algorithms. However, PCANet is originally designed to solve 2D image classification problems. In order to make it suitable for solving action recognition problem in videos, the temporal information of the input video is appropriately represented using motion energy templates. Multiple short-time motion energy image (ST-MEI) templates are computed to capture human motion information. The deep structure of PCANet helps to learn hierarchical local motion features from the input ST-MEI templates. The dimension of the feature vector learned from PCANet is reduced using whitening principal component analysis algorithm and finally fed to linear support vector machines classifier for classification. The proposed method is evaluated using three different benchmark datasets, namely KTH, Weizmann, and UCF sports action. Experimental results using leave-one-out strategy demonstrate the effectiveness of the proposed method compared with other state-of-the-art methods. |
---|---|
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-020-04712-1 |