Multi-scale Motion Feature Integration for Action Recognition
Main Authors: | , , |
---|---|
Format: | Conference Proceeding |
Language: | English |
Subjects: | |
Online Access: | Request full text |
Summary: | Analyzing video data with intricate temporal structures and extracting comprehensive motion information remains a significant challenge. In this work, we introduce the multi-scale motion feature integration (MMFI) network, which leverages two key modules for motion analysis: the progressive local context aggregation (PLCA) module and the multi-scale motion excitation (MSME) module. The PLCA module captures frame-level motion details by incrementally processing frame-wise differences near the input frame in the early stages of the network. The MSME module provides motion-attentive channel weights in deeper layers with higher dimensions, incorporating short- and long-range segment-level motion information. These modules synergistically capture motion details across various scales. Our approach is evaluated on the large-scale video dataset Something-Something V1, yielding state-of-the-art performance with minimal computational overhead. |
---|---|
ISSN: | 2837-7109 |
DOI: | 10.1109/ICCC59590.2023.10507593 |
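
The summary describes two ideas: frame-wise differences near the input, and motion-attentive channel weights built from short- and long-range differences in deeper features. The NumPy sketch below illustrates those two ideas in their simplest generic form. It is not the authors' PLCA or MSME implementation, which this record does not specify. The function names, the `strides` parameter, the sigmoid gating, and the residual form `x + x * w` are all assumptions made for illustration.

```python
import numpy as np


def frame_differences(clip):
    """Frame-wise differences along time: (T, C, H, W) -> (T-1, C, H, W)."""
    return clip[1:] - clip[:-1]


def motion_channel_weights(features, stride):
    """Channel weights in (0, 1) from temporal feature differences.

    features: float array of shape (T, C, H, W).
    stride:   temporal gap; 1 approximates short-range motion, larger
              values approximate longer-range motion (hypothetical knob).
    """
    # Zero-pad the last `stride` steps so the output keeps T time steps.
    diff = np.zeros_like(features)
    diff[:-stride] = features[stride:] - features[:-stride]
    # Global average pooling over the spatial axes gives one value per channel.
    pooled = diff.mean(axis=(2, 3))          # shape (T, C)
    return 1.0 / (1.0 + np.exp(-pooled))     # sigmoid


def multi_scale_excitation(features, strides=(1, 2)):
    """Average the weights over several strides and re-weight the channels."""
    weights = np.mean(
        [motion_channel_weights(features, s) for s in strides], axis=0
    )
    # Residual excitation (an assumption): keep the input, add the gated copy.
    return features + features * weights[:, :, None, None]
```

For a static clip every difference is zero, so each weight is sigmoid(0) = 0.5 and the output is the input scaled by 1.5. Channels whose activations change over time receive weights that move away from 0.5.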