DeCo: Decomposition and Reconstruction for Compositional Temporal Grounding via Coarse-to-Fine Contrastive Ranking

Understanding dense action in videos is a fundamental challenge towards the generalization of vision models. Several works show that compositionality is key to achieving generalization by combining known primitive elements, especially for handling novel composited structures. Compositional temporal...

Bibliographic Details
Main Authors:	Yang, Lijin; Kong, Quan; Yang, Hsuan-Kung; Kehl, Wadim; Sato, Yoichi; Kobori, Norimasa
Format:	Conference Proceeding
Language:	eng ; jpn
Subjects:	Cognition; Computer vision; Grounding; Pattern recognition; Semantics; Task analysis; Videos; Vision, language, and reasoning