Collaborative Local-Global Learning for Temporal Action Proposal
Temporal action proposal generation is an essential and challenging task in video understanding, which aims to locate the temporal intervals that likely contain the actions of interest. Although great progress has been made, the problem is still far from being well solved. In particular, prevalent methods can handle well only the local dependencies (i.e., short-term dependencies) among adjacent frames but are generally powerless in dealing with the global dependencies (i.e., long-term dependencies) between distant frames. To tackle this issue, we propose CLGNet, a novel Collaborative Local-Global Learning Network for temporal action proposal. The majority of CLGNet is an integration of Temporal Convolution Network and Bidirectional Long Short-Term Memory, in which Temporal Convolution Network is responsible for local dependencies while Bidirectional Long Short-Term Memory takes charge of handling the global dependencies. Furthermore, an attention mechanism called the background suppression module is designed to guide our model to focus more on the actions. Extensive experiments on two benchmark datasets, THUMOS’14 and ActivityNet-1.3, show that the proposed method can outperform state-of-the-art methods, demonstrating the strong capability of modeling the actions with varying temporal durations.
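The abstract describes a division of labor: a temporal convolution branch for short-range (local) dependencies among adjacent frames, a bidirectional recurrent branch for long-range (global) dependencies between distant frames, and a background suppression gate that down-weights non-action frames. A minimal toy sketch of these three ingredients is below; it is not the authors' implementation, and all function names, weights, and shapes are illustrative assumptions.

```python
import math

def temporal_conv(x, kernel):
    # Local branch: a 1-D convolution over time mixes only a short
    # window of adjacent frames (short-term / local dependencies).
    k, pad = len(kernel), len(kernel) // 2
    xp = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(xp[t + i] * kernel[i] for i in range(k)) for t in range(len(x))]

def bidirectional_rnn(x, w_in=0.5, w_rec=0.5):
    # Global branch: a forward and a backward recurrent pass; each hidden
    # state summarises everything before (or after) frame t, so distant
    # frames can influence each other (long-term / global dependencies).
    fwd, bwd, h = [], [], 0.0
    for v in x:
        h = math.tanh(w_in * v + w_rec * h)
        fwd.append(h)
    h = 0.0
    for v in reversed(x):
        h = math.tanh(w_in * v + w_rec * h)
        bwd.append(h)
    bwd.reverse()
    return [f + b for f, b in zip(fwd, bwd)]

def background_suppression(features, attn_logits):
    # Sigmoid attention gate in (0, 1): frames whose logit is very
    # negative (likely background) are suppressed towards zero.
    return [f / (1.0 + math.exp(-a)) for f, a in zip(features, attn_logits)]
```

For instance, `temporal_conv(x, [0.25, 0.5, 0.25])` blends each frame with its immediate neighbours only, whereas `bidirectional_rnn(x)` lets the first and last frames of the sequence influence every position, which is the complementarity the paper exploits.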
Published in: | ACM Transactions on Intelligent Systems and Technology, 2021-12, Vol. 12 (5), p. 1-14 |
---|---|
Main Authors: | Zhu, Yisheng; Han, Hu; Liu, Guangcan; Liu, Qingshan |
Format: | Article |
Language: | English |
container_end_page | 14 |
container_issue | 5 |
container_start_page | 1 |
container_title | ACM transactions on intelligent systems and technology |
container_volume | 12 |
creator | Zhu, Yisheng; Han, Hu; Liu, Guangcan; Liu, Qingshan |
description | Temporal action proposal generation is an essential and challenging task in video understanding, which aims to locate the temporal intervals that likely contain the actions of interest. Although great progress has been made, the problem is still far from being well solved. In particular, prevalent methods can handle well only the local dependencies (i.e., short-term dependencies) among adjacent frames but are generally powerless in dealing with the global dependencies (i.e., long-term dependencies) between distant frames. To tackle this issue, we propose CLGNet, a novel Collaborative Local-Global Learning Network for temporal action proposal. The majority of CLGNet is an integration of Temporal Convolution Network and Bidirectional Long Short-Term Memory, in which Temporal Convolution Network is responsible for local dependencies while Bidirectional Long Short-Term Memory takes charge of handling the global dependencies. Furthermore, an attention mechanism called the background suppression module is designed to guide our model to focus more on the actions. Extensive experiments on two benchmark datasets, THUMOS’14 and ActivityNet-1.3, show that the proposed method can outperform state-of-the-art methods, demonstrating the strong capability of modeling the actions with varying temporal durations. |
doi_str_mv | 10.1145/3466181 |
format | article |
fulltext | fulltext |
identifier | ISSN: 2157-6904 |
ispartof | ACM transactions on intelligent systems and technology, 2021-12, Vol.12 (5), p.1-14 |
issn | 2157-6904; 2157-6912 |
language | eng |
recordid | cdi_crossref_primary_10_1145_3466181 |
source | Association for Computing Machinery:Jisc Collections:ACM OPEN Journals 2023-2025 (reading list) |
title | Collaborative Local-Global Learning for Temporal Action Proposal |