
Action Recognition Using Nonnegative Action Component Representation and Sparse Basis Selection

In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach.

Saved in:
Bibliographic Details
Published in: IEEE Transactions on Image Processing, 2014-02, Vol. 23 (2), p. 570-581
Main Authors: Wang, Haoran, Yuan, Chunfeng, Hu, Weiming, Ling, Haibin, Yang, Wankou, Sun, Changyin
Format: Article
Language:English
Abstract: In this paper, we propose using high-level action units to represent human actions in videos and, based on such units, a novel sparse model is developed for human action recognition. There are three interconnected components in our approach. First, we propose a new context-aware spatial-temporal descriptor, named locally weighted word context, to improve the discriminability of the traditionally used local spatial-temporal descriptors. Second, from the statistics of the context-aware descriptors, we learn action units using the graph regularized nonnegative matrix factorization, which leads to a part-based representation and encodes the geometrical information. These units effectively bridge the semantic gap in action recognition. Third, we propose a sparse model based on a joint l2,1-norm to preserve the representative items and suppress noise in the action units. Intuitively, when learning the dictionary for action representation, the sparse model captures the fact that actions from the same class share similar units. The proposed approach is evaluated on several publicly available data sets. The experimental results and analysis clearly demonstrate the effectiveness of the proposed approach.
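The abstract's sparse model is built on the joint l2,1-norm, which sums the l2 norms of a matrix's rows; minimizing it drives whole rows of the coefficient matrix to zero, so the same small subset of action units is selected across all samples of a class. The sketch below (not the authors' optimization procedure, just the norm itself) illustrates why the l2,1-norm favors row-sparse solutions over dense ones of equal energy:

```python
import numpy as np

def l21_norm(W):
    """Joint l2,1-norm: sum of the l2 norms of the rows of W.

    Penalizing this quantity encourages entire rows of W to vanish,
    i.e. it selects a shared subset of basis items (action units)
    across all columns (samples)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))

# A row-sparse matrix has a smaller l2,1-norm than a dense matrix
# with the same total energy (Frobenius norm):
dense = np.full((4, 4), 0.5)        # energy spread over every row
row_sparse = np.zeros((4, 4))
row_sparse[0] = 1.0                 # same Frobenius norm (2.0), one active row
assert np.isclose(np.linalg.norm(dense), np.linalg.norm(row_sparse))
print(l21_norm(row_sparse), "<", l21_norm(dense))  # 2.0 < 4.0
```

This is why a joint l2,1 penalty on the dictionary coefficients, as described above, keeps the representative units and suppresses the noisy ones: energy concentrated in a few shared rows is cheaper under this norm than energy spread across many rows.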
DOI: 10.1109/TIP.2013.2292550
PMID: 26270909
ISSN: 1057-7149
EISSN: 1941-0042
Source: IEEE Xplore (Online service)
Subjects: action recognition
Action unit
Algorithms
Applied sciences
Context
Detection, estimation, filtering, equalization, prediction
Dictionaries
Exact sciences and technology
Feature extraction
Humans
Image Enhancement - methods
Image Interpretation, Computer-Assisted - methods
Image processing
Imaging, Three-Dimensional - methods
Information, signal and communications theory
Motor Activity - physiology
Movement - physiology
nonnegative matrix factorization
Pattern Recognition, Automated - methods
Photography - methods
Reproducibility of Results
Semantics
Sensitivity and Specificity
Signal and communications theory
Signal processing
Signal, noise
sparse representation
Subtraction Technique
Telecommunications and information theory
Vectors
Videos
Visualization
Whole Body Imaging - methods