MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain
Wearable cameras allow images and videos to be acquired from the user's perspective. These data can be processed to understand human behavior. Although human behavior analysis has been thoroughly investigated in third-person vision, it is still understudied in egocentric settings and in particular...
Published in: | arXiv.org 2022-09 |
---|---|
Main Authors: | Ragusa, Francesco; Furnari, Antonino; Farinella, Giovanni Maria |
Format: | Article |
Language: | English |
Subjects: | Algorithms; Datasets; Human behavior; Image acquisition; Moving object recognition; Video |
Online Access: | Get full text |
cited_by | |
---|---|
cites | |
container_end_page | |
container_issue | |
container_start_page | |
container_title | arXiv.org |
container_volume | |
creator | Ragusa, Francesco ; Furnari, Antonino ; Farinella, Giovanni Maria |
description | Wearable cameras allow images and videos to be acquired from the user's perspective. These data can be processed to understand human behavior. Although human behavior analysis has been thoroughly investigated in third-person vision, it is still understudied in egocentric settings and in particular in industrial scenarios. To encourage research in this field, we present MECCANO, a multimodal dataset of egocentric videos to study human behavior understanding in industrial-like settings. The multimodality is characterized by the presence of gaze signals, depth maps and RGB videos acquired simultaneously with a custom headset. The dataset has been explicitly labeled for fundamental tasks in the context of human behavior understanding from a first-person view, such as recognizing and anticipating human-object interactions. With the MECCANO dataset, we explored five different tasks: 1) Action Recognition, 2) Active Object Detection and Recognition, 3) Egocentric Human-Object Interaction Detection, 4) Action Anticipation and 5) Next-Active Object Detection. We propose a benchmark aimed at studying human behavior in the considered industrial-like scenario, which demonstrates that the investigated tasks and the considered scenario are challenging for state-of-the-art algorithms. To support research in this field, we publicly release the dataset at https://iplab.dmi.unict.it/MECCANO/. |
doi_str_mv | 10.48550/arxiv.2209.08691 |
format | article |
fulltext | fulltext |
identifier | EISSN: 2331-8422 |
ispartof | arXiv.org, 2022-09 |
issn | 2331-8422 |
language | eng |
recordid | cdi_proquest_journals_2715912101 |
source | Publicly Available Content Database |
subjects | Algorithms ; Datasets ; Human behavior ; Image acquisition ; Moving object recognition ; Video |
title | MECCANO: A Multimodal Egocentric Dataset for Humans Behavior Understanding in the Industrial-like Domain |
url | http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-02-25T09%3A20%3A27IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=MECCANO:%20A%20Multimodal%20Egocentric%20Dataset%20for%20Humans%20Behavior%20Understanding%20in%20the%20Industrial-like%20Domain&rft.jtitle=arXiv.org&rft.au=Ragusa,%20Francesco&rft.date=2022-09-19&rft.eissn=2331-8422&rft_id=info:doi/10.48550/arxiv.2209.08691&rft_dat=%3Cproquest%3E2715912101%3C/proquest%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-a951-f3c43dbc8797c78986d65bbc8ab7fdc9e91b4af1e5a944250b6b41300745452f3%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=2715912101&rft_id=info:pmid/&rfr_iscdi=true |
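The abstract notes that gaze signals, depth maps and RGB videos are acquired simultaneously with a custom headset. One common way to work with such streams is to align each video frame with the nearest sample from a higher-rate sensor stream by timestamp. The sketch below is purely illustrative and assumes nothing about MECCANO's actual file formats or schema; the function names and synthetic values are hypothetical.

```python
from bisect import bisect_left

def nearest_timestamp_index(timestamps, t):
    """Return the index of the entry in a sorted timestamp list closest to t."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Choose whichever neighbor is closer to t.
    return i - 1 if t - timestamps[i - 1] <= timestamps[i] - t else i

def align_gaze_to_frames(frame_ts, gaze_samples):
    """Pair each RGB frame timestamp with the nearest gaze sample.

    gaze_samples: list of (timestamp, x, y) tuples, sorted by timestamp.
    Returns one gaze tuple per frame timestamp.
    """
    gaze_ts = [g[0] for g in gaze_samples]
    return [gaze_samples[nearest_timestamp_index(gaze_ts, t)] for t in frame_ts]

# Synthetic example: ~30 fps frames vs. a denser gaze stream.
frames = [0.000, 0.033, 0.066]
gaze = [(0.000, 0.5, 0.5), (0.005, 0.5, 0.5), (0.031, 0.6, 0.4), (0.064, 0.7, 0.3)]
print(align_gaze_to_frames(frames, gaze))
# → [(0.0, 0.5, 0.5), (0.031, 0.6, 0.4), (0.064, 0.7, 0.3)]
```

The same nearest-timestamp pairing applies to depth maps or any other per-sample stream recorded alongside the video.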