An efficient action proposal processing approach for temporal action detection
Published in: | Neurocomputing (Amsterdam) 2025-03, Vol.623, p.129294, Article 129294 |
---|---|
Main Authors: | , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Summary: | Temporal action detection is a fundamental yet challenging task in video understanding. Processing the action proposals is important for both action classification and temporal boundary localization. Some methods process action proposals by exploiting the relations between them. However, learning the relations between numerous action proposals is time-consuming and demands substantial computation and memory. Each proposal already contains contextual information extracted from video segments, and aggregating redundant information has a negative impact on the final detection performance. In this paper, we exploit an efficient model that processes each proposal individually and learns intra-proposal features adequately, avoiding the interference of redundant information to achieve more effective detection. We also design relation learning models based on mean pooling, self-attention, and temporal convolution to compare with the intra-proposal learning model. Extensive experiments show that our method outperforms the relation learning models and achieves competitive performance on two standard benchmarks. Efficiency experiments also verify that our model is more efficient than the relation learning methods. |
---|---|
ISSN: | 0925-2312 |
DOI: | 10.1016/j.neucom.2024.129294 |
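
The contrast the summary draws between processing each proposal individually and learning relations across proposals can be illustrated with a toy sketch. This is not the paper's model: the function names, the scalar "features", and the centering and mixing operations are all hypothetical stand-ins, chosen only to show which outputs depend on the other proposals.

```python
from statistics import fmean

def intra_proposal(proposal):
    # Hypothetical stand-in for intra-proposal learning: the output uses
    # only this proposal's own snippet features (here, centering on its mean).
    own_mean = fmean(proposal)
    return [x - own_mean for x in proposal]

def mean_pool_relation(proposals):
    # Hypothetical stand-in for a mean-pooling relation baseline: every
    # proposal is mixed with a context value pooled over ALL proposals.
    context = fmean(fmean(p) for p in proposals)
    return [[x + context for x in p] for p in proposals]

# Two toy proposal sets that differ only in the third proposal.
proposals_a = [[1.0, 3.0], [2.0, 4.0], [10.0, 20.0]]
proposals_b = [[1.0, 3.0], [2.0, 4.0], [0.0, 0.0]]

intra_a = [intra_proposal(p) for p in proposals_a]
intra_b = [intra_proposal(p) for p in proposals_b]
rel_a = mean_pool_relation(proposals_a)
rel_b = mean_pool_relation(proposals_b)

# The first proposal's intra-proposal output is unaffected by the change
# to the third proposal; its relation-based output is not.
print(intra_a[0] == intra_b[0])
print(rel_a[0] == rel_b[0])
```

In this sketch the per-proposal path touches each proposal once and can run independently per proposal, whereas the relation path needs the whole proposal set before it can produce any output. Pairwise relation modules such as self-attention go further and scale quadratically with the number of proposals.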