Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines

Bibliographic Details
Published in:	Machine vision and applications 2025-01, Vol.36 (1), p.19, Article 19
Main Authors:	Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie
Format:	Article
Language:	English
Subjects:	Activity recognition ; Assembly lines ; Corporate learning ; Deep learning ; Hand (anatomy) ; Labels ; Machine learning ; Moving object recognition ; Regression models ; Representations ; Robustness ; Self-supervised learning

cited_by
cites	cdi_FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983
container_end_page
container_issue	1
container_start_page	19
container_title	Machine vision and applications
container_volume	36
creator	Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie
description	Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand a human's fine-grained hand movements in order to track the entire process. However, deep-learning-based hand action recognition methods are very label intensive, and not all industrial companies can provide the required labels because of the associated costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. The work outlines which well-known real-world datasets are best suited for representation learning of such hand actions in a regression task, and to what extent they optimize the subsequent supervised classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting models deployed in production more robust against concept drift and future changes in assembly movements.
doi_str_mv	10.1007/s00138-024-01638-9
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3143455575</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3143455575</sourcerecordid><originalsourceid>FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983</originalsourceid><addsrcrecordid>eNotkEtLAzEUhYMoWKt_wFXAdfTmNZkspfiCggt1HTIzSZsyzdRkRqi_3rQVLpyzOOdc-BC6pXBPAdRDBqC8JsAEAVoVp8_QjArOCFWVPkcz0MXXoNklusp5AwBCKTFDvx-u9yRPO5d-QnYdTm6XXHZxtGMYIu6dTTHEFfZDwmlopjxiH6Ijq2SLdHg9bW3Eaxs7bNtjJbl2WMVw9OFwXSmlYHtsc3bbpt_jvlTzNbrwts_u5l_n6Ov56XPxSpbvL2-LxyVpqaxGwq2SWres9owL0BJU5VgDglLZVpwpRRuppbCdd8KBr2vrpWaKNRXjstI1n6O70-4uDd-Ty6PZDFOK5aXhBZGQUipZUuyUatOQc3Le7FLY2rQ3FMyBsTkxNoWxOTI2mv8BtY1wqQ</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3143455575</pqid></control><display><type>article</type><title>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines</title><source>Springer Link</source><creator>Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie</creator><creatorcontrib>Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie</creatorcontrib><description>Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, these deep-learning-based hand action recognition methods are very label intensive, which cannot be offered by all industrial companies due to the associated costs. 
This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. Well-known real-world datasets best suited for representation learning of such hand actions in a regression tasks are outlined and to what extent they optimize the subsequent supervised trained classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting productively employed models more robust against concept drift and future changing assembly movements.</description><identifier>ISSN: 0932-8092</identifier><identifier>EISSN: 1432-1769</identifier><identifier>DOI: 10.1007/s00138-024-01638-9</identifier><language>eng</language><publisher>New York: Springer Nature B.V</publisher><subject>Activity recognition ; Assembly lines ; Corporate learning ; Deep learning ; Hand (anatomy) ; Labels ; Machine learning ; Moving object recognition ; Regression models ; Representations ; Robustness ; Self-supervised learning</subject><ispartof>Machine vision and applications, 2025-01, Vol.36 (1), p.19, Article 19</ispartof><rights>Copyright Springer Nature B.V. 
Jan 2025</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Sturm, Fabian</creatorcontrib><creatorcontrib>Trat, Martin</creatorcontrib><creatorcontrib>Sathiyababu, Rahul</creatorcontrib><creatorcontrib>Allipilli, Harshitha</creatorcontrib><creatorcontrib>Menz, Benjamin</creatorcontrib><creatorcontrib>Hergenroether, Elke</creatorcontrib><creatorcontrib>Siegel, Melanie</creatorcontrib><title>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines</title><title>Machine vision and applications</title><description>Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, these deep-learning-based hand action recognition methods are very label intensive, which cannot be offered by all industrial companies due to the associated costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. 
Well-known real-world datasets best suited for representation learning of such hand actions in a regression tasks are outlined and to what extent they optimize the subsequent supervised trained classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting productively employed models more robust against concept drift and future changing assembly movements.</description><subject>Activity recognition</subject><subject>Assembly lines</subject><subject>Corporate learning</subject><subject>Deep learning</subject><subject>Hand (anatomy)</subject><subject>Labels</subject><subject>Machine learning</subject><subject>Moving object recognition</subject><subject>Regression models</subject><subject>Representations</subject><subject>Robustness</subject><subject>Self-supervised learning</subject><issn>0932-8092</issn><issn>1432-1769</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2025</creationdate><recordtype>article</recordtype><recordid>eNotkEtLAzEUhYMoWKt_wFXAdfTmNZkspfiCggt1HTIzSZsyzdRkRqi_3rQVLpyzOOdc-BC6pXBPAdRDBqC8JsAEAVoVp8_QjArOCFWVPkcz0MXXoNklusp5AwBCKTFDvx-u9yRPO5d-QnYdTm6XXHZxtGMYIu6dTTHEFfZDwmlopjxiH6Ijq2SLdHg9bW3Eaxs7bNtjJbl2WMVw9OFwXSmlYHtsc3bbpt_jvlTzNbrwts_u5l_n6Ov56XPxSpbvL2-LxyVpqaxGwq2SWres9owL0BJU5VgDglLZVpwpRRuppbCdd8KBr2vrpWaKNRXjstI1n6O70-4uDd-Ty6PZDFOK5aXhBZGQUipZUuyUatOQc3Le7FLY2rQ3FMyBsTkxNoWxOTI2mv8BtY1wqQ</recordid><startdate>202501</startdate><enddate>202501</enddate><creator>Sturm, Fabian</creator><creator>Trat, Martin</creator><creator>Sathiyababu, Rahul</creator><creator>Allipilli, Harshitha</creator><creator>Menz, Benjamin</creator><creator>Hergenroether, Elke</creator><creator>Siegel, Melanie</creator><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>202501</creationdate><title>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly 
lines</title><author>Sturm, Fabian ; Trat, Martin ; Sathiyababu, Rahul ; Allipilli, Harshitha ; Menz, Benjamin ; Hergenroether, Elke ; Siegel, Melanie</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2025</creationdate><topic>Activity recognition</topic><topic>Assembly lines</topic><topic>Corporate learning</topic><topic>Deep learning</topic><topic>Hand (anatomy)</topic><topic>Labels</topic><topic>Machine learning</topic><topic>Moving object recognition</topic><topic>Regression models</topic><topic>Representations</topic><topic>Robustness</topic><topic>Self-supervised learning</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Sturm, Fabian</creatorcontrib><creatorcontrib>Trat, Martin</creatorcontrib><creatorcontrib>Sathiyababu, Rahul</creatorcontrib><creatorcontrib>Allipilli, Harshitha</creatorcontrib><creatorcontrib>Menz, Benjamin</creatorcontrib><creatorcontrib>Hergenroether, Elke</creatorcontrib><creatorcontrib>Siegel, Melanie</creatorcontrib><collection>CrossRef</collection><jtitle>Machine vision and applications</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Sturm, Fabian</au><au>Trat, Martin</au><au>Sathiyababu, Rahul</au><au>Allipilli, Harshitha</au><au>Menz, Benjamin</au><au>Hergenroether, Elke</au><au>Siegel, Melanie</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines</atitle><jtitle>Machine vision and 
applications</jtitle><date>2025-01</date><risdate>2025</risdate><volume>36</volume><issue>1</issue><spage>19</spage><pages>19-</pages><artnum>19</artnum><issn>0932-8092</issn><eissn>1432-1769</eissn><abstract>Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, these deep-learning-based hand action recognition methods are very label intensive, which cannot be offered by all industrial companies due to the associated costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. Well-known real-world datasets best suited for representation learning of such hand actions in a regression tasks are outlined and to what extent they optimize the subsequent supervised trained classification task. This subsequent fine-tuning is supplemented by concept drift detection, which makes the resulting productively employed models more robust against concept drift and future changing assembly movements.</abstract><cop>New York</cop><pub>Springer Nature B.V</pub><doi>10.1007/s00138-024-01638-9</doi></addata></record>
fulltext	fulltext
identifier	ISSN: 0932-8092
ispartof	Machine vision and applications, 2025-01, Vol.36 (1), p.19, Article 19
issn	0932-8092 1432-1769
language	eng
recordid	cdi_proquest_journals_3143455575
source	Springer Link
subjects	Activity recognition ; Assembly lines ; Corporate learning ; Deep learning ; Hand (anatomy) ; Labels ; Machine learning ; Moving object recognition ; Regression models ; Representations ; Robustness ; Self-supervised learning
title	Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T10%3A01%3A04IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Self-supervised%20representation%20learning%20for%20robust%20fine-grained%20human%20hand%20action%20recognition%20in%20industrial%20assembly%20lines&rft.jtitle=Machine%20vision%20and%20applications&rft.au=Sturm,%20Fabian&rft.date=2025-01&rft.volume=36&rft.issue=1&rft.spage=19&rft.pages=19-&rft.artnum=19&rft.issn=0932-8092&rft.eissn=1432-1769&rft_id=info:doi/10.1007/s00138-024-01638-9&rft_dat=%3Cproquest_cross%3E3143455575%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c156t-3a7599c28f234095076e2b04115c632771b5954adfe4e0f88af59272b62356983%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3143455575&rft_id=info:pmid/&rfr_iscdi=true