Loading…
Weakly-supervised temporal action localization: a survey
Temporal Action Localization (TAL) is an important task of various computer vision topics such as video understanding, summarization, and analysis. In the real world, the videos are long untrimmed and contain multiple actions, where the temporal boundaries annotations are required in the fully-super...
Saved in:
Published in: | Neural computing & applications 2022-06, Vol.34 (11), p.8479-8499 |
---|---|
Main Authors: | , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites Items that cite this one |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Temporal Action Localization (TAL) is an important task of various computer vision topics such as video understanding, summarization, and analysis. In the real world, the videos are long untrimmed and contain multiple actions, where the temporal boundaries annotations are required in the fully-supervised learning setting for classification and localization tasks. Since the annotation task is costly and time-consuming, the trend is moving toward the weakly-supervised setting, which depends on the video-level labels only without any additional information, and this approach is called weakly-supervised Temporal Action Localization (WTAL). In this survey, we review the concepts, strategies, and techniques related to the WTAL in order to clarify all aspects of the problem and review the state-of-the-art frameworks of WTAL according to their challenges. Furthermore, a comparison of models’ performance and results based on benchmark datasets is presented. Finally, we summarize the future works to allow the researchers to improve the model's performance. |
---|---|
ISSN: | 0941-0643 1433-3058 |
DOI: | 10.1007/s00521-022-07102-x |