stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition

In real life, group activity recognition plays a significant and fundamental role in a variety of applications, e.g. sports video analysis, abnormal behavior detection, and intelligent surveillance. In a complex dynamic scene, a crucial yet challenging issue is how to better model the spatio-tempora...

Bibliographic Details
Published in:	IEEE transactions on circuits and systems for video technology 2020-02, Vol.30 (2), p.549-565
Main Authors:	Qi, Mengshi; Wang, Yunhong; Qin, Jie; Li, Annan; Luo, Jiebo; Van Gool, Luc
Format:	Article
Language:	English
Subjects:	Action Recognition; Activity recognition; Adaptation models; Group Activity Recognition; Hidden Markov models; Message passing; Performance evaluation; Recurrent neural networks; RNN; Scene Understanding; Semantic Graph; Semantics; Spatio-temporal Attention; Sports; Task analysis

cited_by	cdi_FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373
cites	cdi_FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373
container_end_page	565
container_issue	2
container_start_page	549
container_title	IEEE transactions on circuits and systems for video technology
container_volume	30
creator	Qi, Mengshi; Wang, Yunhong; Qin, Jie; Li, Annan; Luo, Jiebo; Van Gool, Luc
description	In real life, group activity recognition plays a significant and fundamental role in a variety of applications, e.g. sports video analysis, abnormal behavior detection, and intelligent surveillance. In a complex dynamic scene, a crucial yet challenging issue is how to better model the spatio-temporal contextual information and inter-person relationship. In this paper, we present a novel attentive semantic recurrent neural network (RNN), namely, stagNet, for understanding group activities and individual actions in videos, by combining the spatio-temporal attention mechanism and semantic graph modeling. Specifically, a structured semantic graph is explicitly modeled to express the spatial contextual content of the whole scene, which is further incorporated with the temporal factor through structural-RNN. By virtue of the "factor sharing" and "message passing" mechanisms, our stagNet is capable of extracting discriminative and informative spatio-temporal representations and capturing inter-person relationships. Moreover, we adopt a spatio-temporal attention model to focus on key persons/frames for improved recognition performance. Besides, a body-region attention and a global-part feature pooling strategy are devised for individual action recognition. In experiments, four widely-used public datasets are adopted for performance evaluation, and the extensive results demonstrate the superiority and effectiveness of our method.
doi_str_mv	10.1109/TCSVT.2019.2894161
format	article
fullrecord	<record><control><sourceid>proquest_ieee_</sourceid><recordid>TN_cdi_ieee_primary_8621027</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>8621027</ieee_id><sourcerecordid>2352190053</sourcerecordid><originalsourceid>FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373</originalsourceid><addsrcrecordid>eNo9UMtOwzAQtBBIlMIPwMUS5xSvH03MLaqgVKqK1EZwtBzHqVK1TnEcpP497kOcdnZ3ZnY1CD0CGQEQ-VJMVl_FiBKQI5pJDmO4QgMQIksoJeI6YiIgySiIW3TXdRtCgGc8HaDvLuj1woZXnDuch2BdaH4tXtmdjsjg5WKB69bjqW_7Pc5N3DbhgLWr8MxVsal6vT3NW4eX1rRr1xzxPbqp9bazD5c6RMX7WzH5SOaf09kknyeGMRkSSDMSP9NpqiUTJXBhLU9FOeYlVAxqbTiTpjSEWQay4rRidakpVBaEZikbouez7d63P73tgtq0vXfxoqJMUJDRnUUWPbOMb7vO21rtfbPT_qCAqGN-6pSfOuanLvlF0dNZ1Fhr_wXZmAKhKfsDF9Jrfg</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>2352190053</pqid></control><display><type>article</type><title>stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition</title><source>IEEE Xplore (Online service)</source><creator>Qi, Mengshi ; Wang, Yunhong ; Qin, Jie ; Li, Annan ; Luo, Jiebo ; Van Gool, Luc</creator><creatorcontrib>Qi, Mengshi ; Wang, Yunhong ; Qin, Jie ; Li, Annan ; Luo, Jiebo ; Van Gool, Luc</creatorcontrib><description>In real life, group activity recognition plays a significant and fundamental role in a variety of applications, e.g. sports video analysis, abnormal behavior detection, and intelligent surveillance. In a complex dynamic scene, a crucial yet challenging issue is how to better model the spatio-temporal contextual information and inter-person relationship. In this paper, we present a novel attentive semantic recurrent neural network (RNN), namely, stagNet, for understanding group activities and individual actions in videos, by combining the spatio-temporal attention mechanism and semantic graph modeling. 
Specifically, a structured semantic graph is explicitly modeled to express the spatial contextual content of the whole scene, which is further incorporated with the temporal factor through structural-RNN. By virtue of the "factor sharing" and "message passing" mechanisms, our stagNet is capable of extracting discriminative and informative spatio-temporal representations and capturing inter-person relationships. Moreover, we adopt a spatio-temporal attention model to focus on key persons/frames for improved recognition performance. Besides, a body-region attention and a global-part feature pooling strategy are devised for individual action recognition. In experiments, four widely-used public datasets are adopted for performance evaluation, and the extensive results demonstrate the superiority and effectiveness of our method.</description><identifier>ISSN: 1051-8215</identifier><identifier>EISSN: 1558-2205</identifier><identifier>DOI: 10.1109/TCSVT.2019.2894161</identifier><identifier>CODEN: ITCTEM</identifier><language>eng</language><publisher>New York: IEEE</publisher><subject>Action Recognition ; Activity recognition ; Adaptation models ; Group Activity Recognition ; Hidden Markov models ; Message passing ; Performance evaluation ; Recurrent neural networks ; RNN ; Scene Understanding ; Semantic Graph ; Semantics ; Spatio-temporal Attention ; Sports ; Task analysis</subject><ispartof>IEEE transactions on circuits and systems for video technology, 2020-02, Vol.30 (2), p.549-565</ispartof><rights>Copyright The Institute of Electrical and Electronics Engineers, Inc. 
(IEEE) 2020</rights><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><citedby>FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373</citedby><cites>FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373</cites><orcidid>0000-0002-0306-534X ; 0000-0001-8001-2703 ; 0000-0002-4516-9729 ; 0000-0002-6955-6635</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/8621027$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>314,780,784,27924,27925,54796</link.rule.ids></links><search><creatorcontrib>Qi, Mengshi</creatorcontrib><creatorcontrib>Wang, Yunhong</creatorcontrib><creatorcontrib>Qin, Jie</creatorcontrib><creatorcontrib>Li, Annan</creatorcontrib><creatorcontrib>Luo, Jiebo</creatorcontrib><creatorcontrib>Van Gool, Luc</creatorcontrib><title>stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition</title><title>IEEE transactions on circuits and systems for video technology</title><addtitle>TCSVT</addtitle><description>In real life, group activity recognition plays a significant and fundamental role in a variety of applications, e.g. sports video analysis, abnormal behavior detection, and intelligent surveillance. In a complex dynamic scene, a crucial yet challenging issue is how to better model the spatio-temporal contextual information and inter-person relationship. In this paper, we present a novel attentive semantic recurrent neural network (RNN), namely, stagNet, for understanding group activities and individual actions in videos, by combining the spatio-temporal attention mechanism and semantic graph modeling. 
Specifically, a structured semantic graph is explicitly modeled to express the spatial contextual content of the whole scene, which is further incorporated with the temporal factor through structural-RNN. By virtue of the "factor sharing" and "message passing" mechanisms, our stagNet is capable of extracting discriminative and informative spatio-temporal representations and capturing inter-person relationships. Moreover, we adopt a spatio-temporal attention model to focus on key persons/frames for improved recognition performance. Besides, a body-region attention and a global-part feature pooling strategy are devised for individual action recognition. In experiments, four widely-used public datasets are adopted for performance evaluation, and the extensive results demonstrate the superiority and effectiveness of our method.</description><subject>Action Recognition</subject><subject>Activity recognition</subject><subject>Adaptation models</subject><subject>Group Activity Recognition</subject><subject>Hidden Markov models</subject><subject>Message passing</subject><subject>Performance evaluation</subject><subject>Recurrent neural networks</subject><subject>RNN</subject><subject>Scene Understanding</subject><subject>Semantic Graph</subject><subject>Semantics</subject><subject>Spatio-temporal Attention</subject><subject>Sports</subject><subject>Task 
analysis</subject><issn>1051-8215</issn><issn>1558-2205</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2020</creationdate><recordtype>article</recordtype><recordid>eNo9UMtOwzAQtBBIlMIPwMUS5xSvH03MLaqgVKqK1EZwtBzHqVK1TnEcpP497kOcdnZ3ZnY1CD0CGQEQ-VJMVl_FiBKQI5pJDmO4QgMQIksoJeI6YiIgySiIW3TXdRtCgGc8HaDvLuj1woZXnDuch2BdaH4tXtmdjsjg5WKB69bjqW_7Pc5N3DbhgLWr8MxVsal6vT3NW4eX1rRr1xzxPbqp9bazD5c6RMX7WzH5SOaf09kknyeGMRkSSDMSP9NpqiUTJXBhLU9FOeYlVAxqbTiTpjSEWQay4rRidakpVBaEZikbouez7d63P73tgtq0vXfxoqJMUJDRnUUWPbOMb7vO21rtfbPT_qCAqGN-6pSfOuanLvlF0dNZ1Fhr_wXZmAKhKfsDF9Jrfg</recordid><startdate>20200201</startdate><enddate>20200201</enddate><creator>Qi, Mengshi</creator><creator>Wang, Yunhong</creator><creator>Qin, Jie</creator><creator>Li, Annan</creator><creator>Luo, Jiebo</creator><creator>Van Gool, Luc</creator><general>IEEE</general><general>The Institute of Electrical and Electronics Engineers, Inc. (IEEE)</general><scope>97E</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>7SC</scope><scope>7SP</scope><scope>8FD</scope><scope>JQ2</scope><scope>L7M</scope><scope>L~C</scope><scope>L~D</scope><orcidid>https://orcid.org/0000-0002-0306-534X</orcidid><orcidid>https://orcid.org/0000-0001-8001-2703</orcidid><orcidid>https://orcid.org/0000-0002-4516-9729</orcidid><orcidid>https://orcid.org/0000-0002-6955-6635</orcidid></search><sort><creationdate>20200201</creationdate><title>stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition</title><author>Qi, Mengshi ; Wang, Yunhong ; Qin, Jie ; Li, Annan ; Luo, Jiebo ; Van Gool, Luc</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2020</creationdate><topic>Action Recognition</topic><topic>Activity recognition</topic><topic>Adaptation 
models</topic><topic>Group Activity Recognition</topic><topic>Hidden Markov models</topic><topic>Message passing</topic><topic>Performance evaluation</topic><topic>Recurrent neural networks</topic><topic>RNN</topic><topic>Scene Understanding</topic><topic>Semantic Graph</topic><topic>Semantics</topic><topic>Spatio-temporal Attention</topic><topic>Sports</topic><topic>Task analysis</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Qi, Mengshi</creatorcontrib><creatorcontrib>Wang, Yunhong</creatorcontrib><creatorcontrib>Qin, Jie</creatorcontrib><creatorcontrib>Li, Annan</creatorcontrib><creatorcontrib>Luo, Jiebo</creatorcontrib><creatorcontrib>Van Gool, Luc</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005-present</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998-Present</collection><collection>IEEE Xplore (IEEE/IET Electronic Library - IEL)</collection><collection>CrossRef</collection><collection>Computer and Information Systems Abstracts</collection><collection>Electronics & Communications Abstracts</collection><collection>Technology Research Database</collection><collection>ProQuest Computer Science Collection</collection><collection>Advanced Technologies Database with Aerospace</collection><collection>Computer and Information Systems Abstracts Academic</collection><collection>Computer and Information Systems Abstracts Professional</collection><jtitle>IEEE transactions on circuits and systems for video technology</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Qi, Mengshi</au><au>Wang, Yunhong</au><au>Qin, Jie</au><au>Li, Annan</au><au>Luo, Jiebo</au><au>Van Gool, Luc</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition</atitle><jtitle>IEEE transactions on circuits and systems 
for video technology</jtitle><stitle>TCSVT</stitle><date>2020-02-01</date><risdate>2020</risdate><volume>30</volume><issue>2</issue><spage>549</spage><epage>565</epage><pages>549-565</pages><issn>1051-8215</issn><eissn>1558-2205</eissn><coden>ITCTEM</coden><abstract>In real life, group activity recognition plays a significant and fundamental role in a variety of applications, e.g. sports video analysis, abnormal behavior detection, and intelligent surveillance. In a complex dynamic scene, a crucial yet challenging issue is how to better model the spatio-temporal contextual information and inter-person relationship. In this paper, we present a novel attentive semantic recurrent neural network (RNN), namely, stagNet, for understanding group activities and individual actions in videos, by combining the spatio-temporal attention mechanism and semantic graph modeling. Specifically, a structured semantic graph is explicitly modeled to express the spatial contextual content of the whole scene, which is further incorporated with the temporal factor through structural-RNN. By virtue of the "factor sharing" and "message passing" mechanisms, our stagNet is capable of extracting discriminative and informative spatio-temporal representations and capturing inter-person relationships. Moreover, we adopt a spatio-temporal attention model to focus on key persons/frames for improved recognition performance. Besides, a body-region attention and a global-part feature pooling strategy are devised for individual action recognition. 
In experiments, four widely-used public datasets are adopted for performance evaluation, and the extensive results demonstrate the superiority and effectiveness of our method.</abstract><cop>New York</cop><pub>IEEE</pub><doi>10.1109/TCSVT.2019.2894161</doi><tpages>17</tpages><orcidid>https://orcid.org/0000-0002-0306-534X</orcidid><orcidid>https://orcid.org/0000-0001-8001-2703</orcidid><orcidid>https://orcid.org/0000-0002-4516-9729</orcidid><orcidid>https://orcid.org/0000-0002-6955-6635</orcidid><oa>free_for_read</oa></addata></record>
fulltext	fulltext
identifier	ISSN: 1051-8215
ispartof	IEEE transactions on circuits and systems for video technology, 2020-02, Vol.30 (2), p.549-565
issn	1051-8215 1558-2205
language	eng
recordid	cdi_ieee_primary_8621027
source	IEEE Xplore (Online service)
subjects	Action Recognition; Activity recognition; Adaptation models; Group Activity Recognition; Hidden Markov models; Message passing; Performance evaluation; Recurrent neural networks; RNN; Scene Understanding; Semantic Graph; Semantics; Spatio-temporal Attention; Sports; Task analysis
title	stagNet: An Attentive Semantic RNN for Group Activity and Individual Action Recognition
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2024-12-29T19%3A58%3A59IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_ieee_&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=stagNet:%20An%20Attentive%20Semantic%20RNN%20for%20Group%20Activity%20and%20Individual%20Action%20Recognition&rft.jtitle=IEEE%20transactions%20on%20circuits%20and%20systems%20for%20video%20technology&rft.au=Qi,%20Mengshi&rft.date=2020-02-01&rft.volume=30&rft.issue=2&rft.spage=549&rft.epage=565&rft.pages=549-565&rft.issn=1051-8215&rft.eissn=1558-2205&rft.coden=ITCTEM&rft_id=info:doi/10.1109/TCSVT.2019.2894161&rft_dat=%3Cproquest_ieee_%3E2352190053%3C/proquest_ieee_%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c339t-1780051a77a935b145ee475b64b1d31fac439cbc03e319d42d3fba21de15a373%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2352190053&rft_id=info:pmid/&rft_ieee_id=8621027&rfr_iscdi=true