Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion
Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (meta-verses) and robotics...
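Main Authors: | Guzov, Vladimir; Chibane, Julian; Marin, Riccardo; He, Yannan; Saracoglu, Yunus; Sattler, Torsten; Pons-Moll, Gerard |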
---|---|
Format: | Conference Proceeding |
Language: | English |
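Subjects: | contact prediction; Dynamics; egocentric vision; human-object interaction; Location awareness; pose estimation; Robot vision systems; Three-dimensional displays; Tracking; Training; Visualization; wearable sensors |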
Online Access: | Request full text |
cited_by | |
---|---|
cites | |
container_end_page | 1016 |
container_issue | |
container_start_page | 1006 |
container_title | |
container_volume | |
creator | Guzov, Vladimir; Chibane, Julian; Marin, Riccardo; He, Yannan; Saracoglu, Yunus; Sattler, Torsten; Pons-Moll, Gerard |
description | Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (meta-verses) and robotics. In order for widespread adoption of such emerging applications, the sensor setup used to capture the interactions needs to be inexpensive and easy-to-use for non-expert users. I.e., interactions should be captured and modeled by simple ego-centric sensors such as a combination of cameras and IMU sensors, not relying on any external cameras or object trackers. Yet, to the best of our knowledge, no work tackling the challenging problem of modeling human-scene interactions via such an ego-centric sensor setup exists. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we can show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans. Our code, data and model is available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/. |
doi_str_mv | 10.1109/3DV62453.2024.00072 |
format | conference_proceeding |
fullrecord | <record><control><sourceid>ieee_CHZPO</sourceid><recordid>TN_cdi_ieee_primary_10550772</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10550772</ieee_id><sourcerecordid>10550772</sourcerecordid><originalsourceid>FETCH-LOGICAL-i134t-227ce3e1bed504a49d29d78aa06a713131f3596e7c190d292f11d09740f5d1573</originalsourceid><addsrcrecordid>eNpNjMtOwzAURA0SElXJF8DCP5By_YptdihQWqmoEhQWbCrXvikujVMlYcHfk6os0CxGmnM0hFwzmDAG9lY8vBdcKjHhwOUEADQ_I5nV1ggF4ojsORlxqVWujTGXJOu63aBxI5kBOyIf89Rj63wfm0Rf8LCP3t3R1bB8xbSls-_apXy52aHv6X_VpUBfPSak5adLW-zotG3qk0-fm6NzRS4qt-8w--sxeZs-rspZvlg-zcv7RR6ZkH3OufYokG0wKJBO2sBt0MY5KJxmYkgllC1Qe2ZhYLxiLIDVEioVmNJiTG5OvxER14c21q79WTNQCrTm4hfpu1Ke</addsrcrecordid><sourcetype>Publisher</sourcetype><iscdi>true</iscdi><recordtype>conference_proceeding</recordtype></control><display><type>conference_proceeding</type><title>Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion</title><source>IEEE Xplore All Conference Series</source><creator>Guzov, Vladimir ; Chibane, Julian ; Marin, Riccardo ; He, Yannan ; Saracoglu, Yunus ; Sattler, Torsten ; Pons-Moll, Gerard</creator><creatorcontrib>Guzov, Vladimir ; Chibane, Julian ; Marin, Riccardo ; He, Yannan ; Saracoglu, Yunus ; Sattler, Torsten ; Pons-Moll, Gerard</creatorcontrib><description>Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (meta-verses) and robotics. In order for widespread adoption of such emerging applications, the sensor setup used to capture the interactions needs to be inexpensive and easy-to-use for non-expert users. 
I.e., interactions should be captured and modeled by simple ego-centric sensors such as a combination of cameras and IMU sensors, not relying on any external cameras or object trackers. Yet, to the best of our knowledge, no work tackling the challenging problem of modeling human-scene interactions via such an ego-centric sensor setup exists. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we can show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans. 
Our code, data and model is available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/.</description><identifier>EISSN: 2475-7888</identifier><identifier>EISBN: 9798350362459</identifier><identifier>DOI: 10.1109/3DV62453.2024.00072</identifier><identifier>CODEN: IEEPAD</identifier><language>eng</language><publisher>IEEE</publisher><subject>contact prediction ; Dynamics ; egocentric vision ; human-object interaction ; Location awareness ; pose estimation ; Robot vision systems ; Three-dimensional displays ; Tracking ; Training ; Visualization ; wearable sensors</subject><ispartof>2024 International Conference on 3D Vision (3DV), 2024, p.1006-1016</ispartof><woscitedreferencessubscribed>false</woscitedreferencessubscribed></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10550772$$EHTML$$P50$$Gieee$$H</linktohtml><link.rule.ids>309,310,776,780,785,786,27904,54534,54911</link.rule.ids><linktorsrc>$$Uhttps://ieeexplore.ieee.org/document/10550772$$EView_record_in_IEEE$$FView_record_in_$$GIEEE</linktorsrc></links><search><creatorcontrib>Guzov, Vladimir</creatorcontrib><creatorcontrib>Chibane, Julian</creatorcontrib><creatorcontrib>Marin, Riccardo</creatorcontrib><creatorcontrib>He, Yannan</creatorcontrib><creatorcontrib>Saracoglu, Yunus</creatorcontrib><creatorcontrib>Sattler, Torsten</creatorcontrib><creatorcontrib>Pons-Moll, Gerard</creatorcontrib><title>Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion</title><title>2024 International Conference on 3D Vision (3DV)</title><addtitle>3DV</addtitle><description>Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. 
Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (meta-verses) and robotics. In order for widespread adoption of such emerging applications, the sensor setup used to capture the interactions needs to be inexpensive and easy-to-use for non-expert users. I.e., interactions should be captured and modeled by simple ego-centric sensors such as a combination of cameras and IMU sensors, not relying on any external cameras or object trackers. Yet, to the best of our knowledge, no work tackling the challenging problem of modeling human-scene interactions via such an ego-centric sensor setup exists. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we can show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans. 
Our code, data and model is available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/.</description><subject>contact prediction</subject><subject>Dynamics</subject><subject>egocentric vision</subject><subject>human-object interaction</subject><subject>Location awareness</subject><subject>pose estimation</subject><subject>Robot vision systems</subject><subject>Three-dimensional displays</subject><subject>Tracking</subject><subject>Training</subject><subject>Visualization</subject><subject>wearable sensors</subject><issn>2475-7888</issn><isbn>9798350362459</isbn><fulltext>true</fulltext><rsrctype>conference_proceeding</rsrctype><creationdate>2024</creationdate><recordtype>conference_proceeding</recordtype><sourceid>6IE</sourceid><recordid>eNpNjMtOwzAURA0SElXJF8DCP5By_YptdihQWqmoEhQWbCrXvikujVMlYcHfk6os0CxGmnM0hFwzmDAG9lY8vBdcKjHhwOUEADQ_I5nV1ggF4ojsORlxqVWujTGXJOu63aBxI5kBOyIf89Rj63wfm0Rf8LCP3t3R1bB8xbSls-_apXy52aHv6X_VpUBfPSak5adLW-zotG3qk0-fm6NzRS4qt-8w--sxeZs-rspZvlg-zcv7RR6ZkH3OufYokG0wKJBO2sBt0MY5KJxmYkgllC1Qe2ZhYLxiLIDVEioVmNJiTG5OvxER14c21q79WTNQCrTm4hfpu1Ke</recordid><startdate>20240318</startdate><enddate>20240318</enddate><creator>Guzov, Vladimir</creator><creator>Chibane, Julian</creator><creator>Marin, Riccardo</creator><creator>He, Yannan</creator><creator>Saracoglu, Yunus</creator><creator>Sattler, Torsten</creator><creator>Pons-Moll, Gerard</creator><general>IEEE</general><scope>6IE</scope><scope>6IL</scope><scope>CBEJK</scope><scope>RIE</scope><scope>RIL</scope></search><sort><creationdate>20240318</creationdate><title>Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion</title><author>Guzov, Vladimir ; Chibane, Julian ; Marin, Riccardo ; He, Yannan ; Saracoglu, Yunus ; Sattler, Torsten ; Pons-Moll, 
Gerard</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-i134t-227ce3e1bed504a49d29d78aa06a713131f3596e7c190d292f11d09740f5d1573</frbrgroupid><rsrctype>conference_proceedings</rsrctype><prefilter>conference_proceedings</prefilter><language>eng</language><creationdate>2024</creationdate><topic>contact prediction</topic><topic>Dynamics</topic><topic>egocentric vision</topic><topic>human-object interaction</topic><topic>Location awareness</topic><topic>pose estimation</topic><topic>Robot vision systems</topic><topic>Three-dimensional displays</topic><topic>Tracking</topic><topic>Training</topic><topic>Visualization</topic><topic>wearable sensors</topic><toplevel>online_resources</toplevel><creatorcontrib>Guzov, Vladimir</creatorcontrib><creatorcontrib>Chibane, Julian</creatorcontrib><creatorcontrib>Marin, Riccardo</creatorcontrib><creatorcontrib>He, Yannan</creatorcontrib><creatorcontrib>Saracoglu, Yunus</creatorcontrib><creatorcontrib>Sattler, Torsten</creatorcontrib><creatorcontrib>Pons-Moll, Gerard</creatorcontrib><collection>IEEE Electronic Library (IEL) Conference Proceedings</collection><collection>IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume</collection><collection>IEEE Xplore All Conference Proceedings</collection><collection>IEEE/IET Electronic Library</collection><collection>IEEE Proceedings Order Plans (POP All) 1998-Present</collection></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext_linktorsrc</fulltext></delivery><addata><au>Guzov, Vladimir</au><au>Chibane, Julian</au><au>Marin, Riccardo</au><au>He, Yannan</au><au>Saracoglu, Yunus</au><au>Sattler, Torsten</au><au>Pons-Moll, Gerard</au><format>book</format><genre>proceeding</genre><ristype>CONF</ristype><atitle>Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion</atitle><btitle>2024 International Conference on 3D Vision 
(3DV)</btitle><stitle>3DV</stitle><date>2024-03-18</date><risdate>2024</risdate><spage>1006</spage><epage>1016</epage><pages>1006-1016</pages><eissn>2475-7888</eissn><eisbn>9798350362459</eisbn><coden>IEEPAD</coden><abstract>Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (meta-verses) and robotics. In order for widespread adoption of such emerging applications, the sensor setup used to capture the interactions needs to be inexpensive and easy-to-use for non-expert users. I.e., interactions should be captured and modeled by simple ego-centric sensors such as a combination of cameras and IMU sensors, not relying on any external cameras or object trackers. Yet, to the best of our knowledge, no work tackling the challenging problem of modeling human-scene interactions via such an ego-centric sensor setup exists. This paper closes this gap in the literature by developing a novel approach that combines visual localization of humans in the scene with contact-based reasoning about human-scene interactions from IMU data. Interestingly, we can show that even without visual observations of the interactions, human-scene contacts and interactions can be realistically predicted from human pose sequences. Our method, iReplica (Interaction Replica), is an essential first step towards the egocentric capture of human interactions and modeling of dynamic scenes, which is required for future AR/VR applications in immersive virtual universes and for training machines to behave like humans. Our code, data and model is available on our project page at http://virtualhumans.mpi-inf.mpg.de/ireplica/.</abstract><pub>IEEE</pub><doi>10.1109/3DV62453.2024.00072</doi><tpages>11</tpages></addata></record> |
fulltext | fulltext_linktorsrc |
identifier | EISSN: 2475-7888 |
ispartof | 2024 International Conference on 3D Vision (3DV), 2024, p.1006-1016 |
issn | 2475-7888 |
language | eng |
recordid | cdi_ieee_primary_10550772 |
source | IEEE Xplore All Conference Series |
subjects | contact prediction; Dynamics; egocentric vision; human-object interaction; Location awareness; pose estimation; Robot vision systems; Three-dimensional displays; Tracking; Training; Visualization; wearable sensors |
title | Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-23T05%3A22%3A13IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-ieee_CHZPO&rft_val_fmt=info:ofi/fmt:kev:mtx:book&rft.genre=proceeding&rft.atitle=Interaction%20Replica:%20Tracking%20Human-Object%20Interaction%20and%20Scene%20Changes%20From%20Human%20Motion&rft.btitle=2024%20International%20Conference%20on%203D%20Vision%20(3DV)&rft.au=Guzov,%20Vladimir&rft.date=2024-03-18&rft.spage=1006&rft.epage=1016&rft.pages=1006-1016&rft.eissn=2475-7888&rft.coden=IEEPAD&rft_id=info:doi/10.1109/3DV62453.2024.00072&rft.eisbn=9798350362459&rft_dat=%3Cieee_CHZPO%3E10550772%3C/ieee_CHZPO%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-i134t-227ce3e1bed504a49d29d78aa06a713131f3596e7c190d292f11d09740f5d1573%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10550772&rfr_iscdi=true |