Multi-teacher knowledge distillation for compressed video action recognition based on deep learning
Published in: | Journal of systems architecture 2020-02, Vol.103, p.101695, Article 101695 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Convolutional networks have recently made great progress in image classification. Action recognition differs from still-image classification, however, because video data contains temporal information that plays an important role in video understanding. Most current CNN-based approaches to action recognition incur excessive computational cost, with an explosion of parameters and computation time. The most efficient current method trains a deep network directly on compressed video, which already contains motion information. However, this method still has a large number of parameters. We propose a multi-teacher knowledge distillation framework for compressed video action recognition to compress this model. In this framework, the model is compressed by transferring knowledge from multiple teachers to a single small student model. With multi-teacher knowledge distillation, the student learns better than with single-teacher knowledge distillation. Experiments show that we can reach a 2.4× compression rate in the number of parameters and a 1.2× reduction in computation, with a 1.79% loss of accuracy on the UCF-101 dataset and a 0.35% loss of accuracy on the HMDB51 dataset. |
---|---|
ISSN: | 1383-7621 1873-6165 |
DOI: | 10.1016/j.sysarc.2019.101695 |
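
The summary describes transferring knowledge from several teachers to one small student. The record does not give the paper's loss function, so the following is only a minimal sketch of one common formulation, not the authors' method: the teachers' temperature-softened predictions are averaged into a single soft target, and the student is trained on a weighted sum of a KL-divergence term against that target and ordinary cross-entropy against the labels. The function name `multi_teacher_kd_loss` and the parameters `temperature`, `alpha`, and `teacher_weights` are assumptions introduced for illustration.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature scaling (numerically stabilised)."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def multi_teacher_kd_loss(student_logits, teacher_logits_list, labels,
                          temperature=4.0, alpha=0.5, teacher_weights=None):
    """Hypothetical multi-teacher distillation loss (assumed formulation).

    student_logits:      array of shape (batch, classes)
    teacher_logits_list: list of arrays, each of shape (batch, classes)
    labels:              integer class indices of shape (batch,)
    """
    n_teachers = len(teacher_logits_list)
    if teacher_weights is None:
        # Assumption: all teachers contribute equally.
        teacher_weights = np.full(n_teachers, 1.0 / n_teachers)

    # Weighted average of the teachers' softened class distributions.
    soft_targets = sum(w * softmax(t, temperature)
                       for w, t in zip(teacher_weights, teacher_logits_list))
    student_soft = softmax(student_logits, temperature)

    eps = 1e-12
    # KL(soft_targets || student_soft), averaged over the batch.
    kd = np.sum(soft_targets * (np.log(soft_targets + eps)
                                - np.log(student_soft + eps)), axis=-1).mean()

    # Standard cross-entropy of the student against the hard labels.
    student_prob = softmax(student_logits)
    ce = -np.log(student_prob[np.arange(len(labels)), labels] + eps).mean()

    # The temperature**2 factor keeps the soft-target term on a scale
    # comparable to the hard-label term.
    return alpha * temperature ** 2 * kd + (1.0 - alpha) * ce
```

In this sketch, a student whose logits already match every teacher incurs no distillation penalty, so only the hard-label term remains. How the paper actually weights or combines its teachers may differ.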