Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection
In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach.
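The record's description learns its "action units" with graph regularized nonnegative matrix factorization (GNMF). As a rough illustration of that building block, here is a minimal sketch of GNMF with the standard multiplicative updates; the function names (`gnmf`, `objective`), the affinity-graph construction, and all parameter values are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gnmf(X, W, k, lam=0.1, iters=200, seed=0):
    """Sketch of graph regularized NMF: X ~ U @ V.T with U, V >= 0,
    plus lam * Tr(V.T @ L @ V) where L = D - W is the graph Laplacian.
    (Assumed generic multiplicative-update solver, not the paper's code.)"""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + 1e-3   # basis ("action units")
    V = rng.random((n, k)) + 1e-3   # nonnegative coefficients
    D = np.diag(W.sum(axis=1))
    eps = 1e-12                     # guard against division by zero
    for _ in range(iters):
        # Multiplicative updates keep both factors nonnegative,
        # which is what yields the part-based representation.
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
    return U, V

def objective(X, U, V, W, lam=0.1):
    """Reconstruction error plus the graph-smoothness penalty."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T) ** 2 + lam * np.trace(V.T @ L @ V)
```

In use, `X` would hold nonnegative descriptor statistics (one column per sample) and `W` a k-nearest-neighbor affinity graph over the samples; the updates monotonically decrease the objective while preserving nonnegativity.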
Published in: | IEEE transactions on image processing 2014-02, Vol.23 (2), p.570-581 |
---|---|
Main Authors: | Wang, Haoran, Yuan, Chunfeng, Hu, Weiming, Ling, Haibin, Yang, Wankou, Sun, Changyin |
Format: | Article |
Language: | English |
cited_by | cdi_FETCH-LOGICAL-c415t-e6b75c18faef757ae62a5d5106b22283dec783f1fe7261b0675ff744c2e537373 |
---|---|
cites | cdi_FETCH-LOGICAL-c415t-e6b75c18faef757ae62a5d5106b22283dec783f1fe7261b0675ff744c2e537373 |
container_end_page | 581 |
container_issue | 2 |
container_start_page | 570 |
container_title | IEEE transactions on image processing |
container_volume | 23 |
creator | Wang, Haoran Yuan, Chunfeng Hu, Weiming Ling, Haibin Yang, Wankou Sun, Changyin |
description | In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach. |
doi_str_mv | 10.1109/TIP.2013.2292550 |
format | article |
fullrecord | Wang, Haoran; Yuan, Chunfeng; Hu, Weiming; Ling, Haibin; Yang, Wankou; Sun, Changyin. "Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection." IEEE transactions on image processing, 2014-02-01, Vol.23 (2), pp.570-581. ISSN: 1057-7149; EISSN: 1941-0042; DOI: 10.1109/TIP.2013.2292550; PMID: 26270909; CODEN: IIPRE4. Publisher: IEEE, New York, NY. |
fulltext | fulltext |
identifier | ISSN: 1057-7149 |
ispartof | IEEE transactions on image processing, 2014-02, Vol.23 (2), p.570-581 |
issn | 1057-7149 1941-0042 |
language | eng |
recordid | cdi_proquest_miscellaneous_1704355088 |
source | IEEE Xplore (Online service) |
subjects | action recognition; Action unit; Algorithms; Applied sciences; Context; Detection, estimation, filtering, equalization, prediction; Dictionaries; Exact sciences and technology; Feature extraction; Humans; Image Enhancement - methods; Image Interpretation, Computer-Assisted - methods; Image processing; Imaging, Three-Dimensional - methods; Information, signal and communications theory; Motor Activity - physiology; Movement - physiology; nonnegative matrix factorization; Pattern Recognition, Automated - methods; Photography - methods; Reproducibility of Results; Semantics; Sensitivity and Specificity; Signal and communications theory; Signal processing; Signal, noise; sparse representation; Subtraction Technique; Telecommunications and information theory; Vectors; Videos; Visualization; Whole Body Imaging - methods |
title | Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-19T20%3A06%3A10IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_pubme&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Action%20Recognition%20Using%20Nonnegative%20Action%20Component%20Representation%20and%20Sparse%20Basis%20Selection&rft.jtitle=IEEE%20transactions%20on%20image%20processing&rft.au=Wang,%20Haoran&rft.date=2014-02-01&rft.volume=23&rft.issue=2&rft.spage=570&rft.epage=581&rft.pages=570-581&rft.issn=1057-7149&rft.eissn=1941-0042&rft.coden=IIPRE4&rft_id=info:doi/10.1109/TIP.2013.2292550&rft_dat=%3Cproquest_pubme%3E1704355088%3C/proquest_pubme%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c415t-e6b75c18faef757ae62a5d5106b22283dec783f1fe7261b0675ff744c2e537373%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1704355088&rft_id=info:pmid/26270909&rft_ieee_id=6675065&rfr_iscdi=true |
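The record's description also relies on a joint l2,1-norm to select a sparse shared basis: summing the l2 norms of a coefficient matrix's rows drives entire rows (entire dictionary atoms) to zero, so actions of one class end up reusing the same few units. A minimal sketch of that norm and of the generic row-wise shrinkage step used by proximal solvers for it (an assumed standard solver step, not the paper's algorithm; `l21_norm` and `row_shrink` are hypothetical names):

```python
import numpy as np

def l21_norm(A):
    """Joint l2,1-norm: sum over rows of each row's l2 norm."""
    return np.sqrt((A ** 2).sum(axis=1)).sum()

def row_shrink(A, tau):
    """Proximal operator of tau * l21_norm: scale each row toward zero,
    zeroing it entirely once its l2 norm falls below tau (group sparsity)."""
    norms = np.sqrt((A ** 2).sum(axis=1, keepdims=True))
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return A * scale
```

For example, the matrix [[3, 4], [0, 0]] has l2,1-norm 5, and shrinking with tau >= 5 suppresses the first row completely, which is exactly the "discard unrepresentative atoms" effect the description attributes to the joint norm.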