Multi-teacher knowledge distillation for compressed video action recognition based on deep learning

Bibliographic Details
Published in: Journal of Systems Architecture, 2020-02, Vol. 103, p. 101695, Article 101695
Main Authors: Wu, Meng-Chieh, Chiu, Ching-Te
Format: Article
Language: English
Description
Summary: Convolutional networks have recently made great progress in image classification. Action recognition, however, differs from still-image classification: video data contains temporal information that plays an important role in video understanding. Most current CNN-based approaches to action recognition incur excessive computational cost, with an explosion in the number of parameters and in computation time. The most efficient current method trains a deep network directly on compressed video, which already contains motion information. However, this method still has a large number of parameters. We propose a multi-teacher knowledge distillation framework for compressed video action recognition to compress this model. With this framework, the model is compressed by transferring knowledge from multiple teachers to a single small student model. With multi-teacher knowledge distillation, the student learns better than with single-teacher knowledge distillation. Experiments show that we can reach a 2.4× compression rate in the number of parameters and a 1.2× reduction in computation, with a 1.79% loss of accuracy on the UCF-101 dataset and a 0.35% loss of accuracy on the HMDB51 dataset.
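The summary does not say how the knowledge of several teachers is combined, so the following is only a minimal sketch of one common formulation, not the authors' method. It assumes the teachers' temperature-softened class distributions are averaged into a single soft target, and that the student is trained on a weighted sum of a KL-divergence term against that target and an ordinary cross-entropy term against the true label. The function names, the temperature of 4.0, and the weight `alpha` are all hypothetical choices made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(student_logits, teacher_logits_list, label,
                          temperature=4.0, alpha=0.5):
    """Assumed loss: alpha * T^2 * KL(avg teachers || student) + (1 - alpha) * CE.

    This averaging scheme is an illustrative assumption; the summary does
    not specify how the teachers are combined.
    """
    # Soft target: average the teachers' softened class distributions.
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    n = len(teacher_probs)
    avg_teacher = [sum(p[c] for p in teacher_probs) / n
                   for c in range(len(student_logits))]

    # KL divergence from the averaged teacher to the softened student.
    student_soft = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q)
             for p, q in zip(avg_teacher, student_soft) if p > 0)

    # Ordinary cross-entropy against the ground-truth label (temperature 1).
    ce = -math.log(softmax(student_logits)[label])

    # T^2 keeps the soft-target gradient scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Example with two hypothetical teachers and three classes.
teachers = [[2.0, 0.5, -1.0], [1.5, 1.0, -0.5]]
loss = multi_teacher_kd_loss([1.0, 0.2, -0.3], teachers, label=0)
print(loss)
```

With a single teacher in the list, this reduces to standard single-teacher distillation, which is the baseline the summary compares against.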
ISSN: 1383-7621, 1873-6165
DOI: 10.1016/j.sysarc.2019.101695