Learning Generative Models for Multi-Activity Body Pose Estimation
We present a method to simultaneously estimate 3D body pose and action categories from monocular video sequences. Our approach learns a generative model of the relationship of body pose and image appearance using a sparse kernel regressor. Body poses are modelled on a low-dimensional manifold obtain...
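
The generative mapping described above, from a low-dimensional pose coordinate to an appearance descriptor, can be pictured as a kernel expansion over a small set of retained basis points. The sketch below is illustrative only: the RBF kernel, the names `rbf` and `predict_appearance`, the basis points, and the weights are all assumptions, not choices or values taken from the article.

```python
import math


def rbf(x, c, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-gamma * sq_dist)


def predict_appearance(pose, basis, weights, bias, gamma=1.0):
    """Evaluate y = bias + sum_i k(pose, basis[i]) * weights[i].

    pose    : low-dimensional pose coordinate (list of floats)
    basis   : the few retained training poses (a short list = sparse model)
    weights : one output-sized weight vector per basis point
    bias    : output-sized offset vector
    """
    out = list(bias)
    for centre, w in zip(basis, weights):
        k = rbf(pose, centre, gamma)
        for d in range(len(out)):
            out[d] += k * w[d]
    return out


# Hypothetical toy model: 2-D pose manifold, 3-D appearance descriptor.
basis = [[0.0, 0.0], [1.0, 0.0]]
weights = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]]
bias = [0.5, 0.5, 0.5]
appearance = predict_appearance([0.0, 0.0], basis, weights, bias)
```

Keeping only a few basis points is what makes such a regressor cheap to evaluate for every sampled pose hypothesis; how those points and weights are selected during training is not covered by this sketch.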
Published in: | International journal of computer vision 2009-06, Vol.83 (2), p.121-134 |
---|---|
Main Authors: | Jaeggli, Tobias ; Koller-Meier, Esther ; Van Gool, Luc |
Format: | Article |
Language: | English |
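Subjects: | Applied sciences ; Artificial Intelligence ; Computer Imaging ; Computer Science ; Computer science; control theory; systems ; Exact sciences and technology ; Image Processing and Computer Vision ; Pattern Recognition ; Pattern Recognition and Graphics ; Pattern recognition. Digital image processing. Computational geometry ; Studies ; Vision |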
cited_by | cdi_FETCH-LOGICAL-c377t-3f2bc741b4b5a85745526c78d63abafa9c3ae44c20da05e1ca929a904724e4623 |
---|---|
cites | cdi_FETCH-LOGICAL-c377t-3f2bc741b4b5a85745526c78d63abafa9c3ae44c20da05e1ca929a904724e4623 |
container_end_page | 134 |
container_issue | 2 |
container_start_page | 121 |
container_title | International journal of computer vision |
container_volume | 83 |
creator | Jaeggli, Tobias ; Koller-Meier, Esther ; Van Gool, Luc |
description | We present a method to simultaneously estimate 3D body pose and action categories from monocular video sequences. Our approach learns a generative model of the relationship of body pose and image appearance using a sparse kernel regressor. Body poses are modelled on a low-dimensional manifold obtained by Locally Linear Embedding dimensionality reduction. In addition, we learn a prior model of likely body poses and a dynamical model in this pose manifold. Sparse kernel regressors capture the nonlinearities of this mapping efficiently. Within a Recursive Bayesian Sampling framework, the potentially multimodal posterior probability distributions can then be inferred. An activity-switching mechanism based on learned transfer functions allows for inference of the performed activity class, along with the estimation of body pose and 2D image location of the subject. Using a rough foreground segmentation, we compare Binary PCA and distance transforms to encode the appearance. As a postprocessing step, the globally optimal trajectory through the entire sequence is estimated, yielding a single pose estimate per frame that is consistent throughout the sequence. We evaluate the algorithm on challenging sequences with subjects that are alternating between running and walking movements. Our experiments show how the dynamical model helps to track through poorly segmented low-resolution image sequences where tracking otherwise fails, while at the same time reliably classifying the activity type. |
doi_str_mv | 10.1007/s11263-008-0158-0 |
format | article |
publisher | Boston: Springer US |
rights | Springer Science+Business Media, LLC 2008 ; 2009 INIST-CNRS ; Springer Science+Business Media, LLC 2009 |
stitle | Int J Comput Vis |
date | 2009-06-01 |
tpages | 14 |
oa | free_for_read |
toplevel | peer_reviewed ; online_resources |
backlink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=21432411 (View record in Pascal Francis) |
fulltext | fulltext |
identifier | ISSN: 0920-5691 |
ispartof | International journal of computer vision, 2009-06, Vol.83 (2), p.121-134 |
issn | 0920-5691 1573-1405 |
language | eng |
recordid | cdi_proquest_miscellaneous_33753885 |
source | ABI/INFORM Collection; Springer Nature |
subjects | Applied sciences ; Artificial Intelligence ; Computer Imaging ; Computer Science ; Computer science; control theory; systems ; Exact sciences and technology ; Image Processing and Computer Vision ; Pattern Recognition ; Pattern Recognition and Graphics ; Pattern recognition. Digital image processing. Computational geometry ; Studies ; Vision |
title | Learning Generative Models for Multi-Activity Body Pose Estimation |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-13T00%3A21%3A37IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Learning%20Generative%20Models%20for%20Multi-Activity%20Body%20Pose%20Estimation&rft.jtitle=International%20journal%20of%20computer%20vision&rft.au=Jaeggli,%20Tobias&rft.date=2009-06-01&rft.volume=83&rft.issue=2&rft.spage=121&rft.epage=134&rft.pages=121-134&rft.issn=0920-5691&rft.eissn=1573-1405&rft_id=info:doi/10.1007/s11263-008-0158-0&rft_dat=%3Cproquest_cross%3E33753885%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c377t-3f2bc741b4b5a85745526c78d63abafa9c3ae44c20da05e1ca929a904724e4623%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=1113759700&rft_id=info:pmid/&rfr_iscdi=true |
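
The postprocessing step mentioned in the description, estimating one globally consistent trajectory through the whole sequence, is commonly done with dynamic programming over per-frame candidates. The sketch below is a generic Viterbi-style pass, assuming each frame comes with a list of candidate poses, a log-weight per candidate, and a caller-supplied transition log-score. The function `best_trajectory`, the quadratic `smooth` penalty, and the toy data are hypothetical and not taken from the article.

```python
def best_trajectory(candidates, log_weights, log_transition):
    """Return one candidate index per frame for the highest-scoring path.

    candidates[t][i]     : i-th candidate pose at frame t
    log_weights[t][i]    : log observation score of that candidate
    log_transition(a, b) : log score of moving from pose a to pose b
    """
    n_frames = len(candidates)
    # score[t][j]: best total log score of any path ending in candidate j at frame t
    score = [list(log_weights[0])]
    # back[t][j]: index at frame t-1 that achieves score[t][j]
    back = [[-1] * len(candidates[0])]

    for t in range(1, n_frames):
        row_score, row_back = [], []
        for j, pose_j in enumerate(candidates[t]):
            totals = [
                score[t - 1][i] + log_transition(pose_i, pose_j)
                for i, pose_i in enumerate(candidates[t - 1])
            ]
            best_i = max(range(len(totals)), key=totals.__getitem__)
            row_score.append(totals[best_i] + log_weights[t][j])
            row_back.append(best_i)
        score.append(row_score)
        back.append(row_back)

    # Trace back from the best final candidate.
    j = max(range(len(score[-1])), key=score[-1].__getitem__)
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()
    return path


def smooth(a, b):
    """Hypothetical transition score: penalise large jumps between 1-D poses."""
    return -(a - b) ** 2


# Toy data: two candidates per frame, clustered near 0 and near 5.
candidates = [[0.0, 5.0], [0.2, 5.1], [0.1, 4.9]]
log_weights = [[0.0, -0.1], [-1.0, 0.0], [0.0, -0.1]]
path = best_trajectory(candidates, log_weights, smooth)
```

In this toy case, picking the best-weighted candidate in each frame independently would jump between the two clusters, whereas the dynamic-programming pass stays in one cluster for the whole sequence. That is the sense in which a single per-frame estimate can be made consistent across frames.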