Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines
Published in: | Machine vision and applications, 2025-01, Vol.36 (1), p.19, Article 19 |
---|---|
Main Authors: | Sturm, Fabian; Trat, Martin; Sathiyababu, Rahul; Allipilli, Harshitha; Menz, Benjamin; Hergenroether, Elke; Siegel, Melanie |
Format: | Article |
Language: | English |
Subjects: | Activity recognition; Assembly lines; Corporate learning; Deep learning; Hand (anatomy); Labels; Machine learning; Moving object recognition; Regression models; Representations; Robustness; Self-supervised learning |
Field | Value |
---|---|
container_issue | 1 |
container_start_page | 19 |
container_title | Machine vision and applications |
container_volume | 36 |
creator | Sturm, Fabian; Trat, Martin; Sathiyababu, Rahul; Allipilli, Harshitha; Menz, Benjamin; Hergenroether, Elke; Siegel, Melanie |
description | Humans are still indispensable on industrial assembly lines, but in the event of an error, they need support from intelligent systems. In addition to the objects to be observed, it is equally important to understand the fine-grained hand movements of a human to be able to track the entire process. However, deep-learning-based hand action recognition methods are very label-intensive, which not all industrial companies can afford due to the associated costs. This work therefore presents a self-supervised learning approach for industrial assembly processes that allows a spatio-temporal transformer architecture to be pre-trained on a variety of information from real-world video footage of daily life. Subsequently, this deep learning model is adapted to the industrial assembly task at hand using only a few labels. Well-known real-world datasets best suited for representation learning of such hand actions in a regression task are outlined, along with the extent to which they optimize the subsequent supervised classification task. This fine-tuning is supplemented by concept drift detection, which makes the resulting productively deployed models more robust against concept drift and future changes in assembly movements. |
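The abstract describes pre-training a spatio-temporal transformer in a self-supervised fashion and then adapting it to the assembly task with only a few labels. The paper's actual architecture is not reproduced here; as a minimal, hypothetical sketch of the label-efficient fine-tuning idea, the snippet below freezes a stand-in "pretrained" encoder (a fixed random projection, not the paper's transformer) and trains only a small linear classification head on a handful of labeled examples:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a pretrained encoder: a fixed (frozen) linear
# projection from raw hand-pose features to an embedding space. In the paper
# this role is played by a self-supervised spatio-temporal transformer.
D_IN, D_EMB, N_CLASSES = 63, 16, 3  # e.g. 21 hand keypoints x 3 coordinates

W_frozen = rng.normal(size=(D_IN, D_EMB)) / np.sqrt(D_IN)

def encode(x):
    return np.tanh(x @ W_frozen)  # frozen features, never updated below

# Only a few labeled examples per class (the label-efficient setting).
n_per_class = 10
X = rng.normal(size=(N_CLASSES * n_per_class, D_IN))
y = np.repeat(np.arange(N_CLASSES), n_per_class)
X += y[:, None] * 0.8  # synthetic class separation for the demo

# Fine-tune: train only a linear softmax head on top of frozen embeddings.
Z = encode(X)
W_head = np.zeros((D_EMB, N_CLASSES))
b = np.zeros(N_CLASSES)
for _ in range(300):
    logits = Z @ W_head + b
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    grad = p.copy()
    grad[np.arange(len(y)), y] -= 1.0  # softmax cross-entropy gradient
    W_head -= 0.5 * Z.T @ grad / len(y)
    b -= 0.5 * grad.mean(axis=0)

acc = (np.argmax(Z @ W_head + b, axis=1) == y).mean()
print(f"training accuracy of the linear head: {acc:.2f}")
```

Only `W_head` and `b` are updated; the frozen `encode` plays the role of the self-supervised representation, which is the property that keeps the labeled-data requirement small.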
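The abstract also mentions supplementing fine-tuning with concept drift detection so that deployed models stay robust as assembly movements change. The paper's specific detector is not given here; as one standard illustration of the general idea, a minimal Page-Hinkley-style test can flag a sustained drop in a monitored statistic such as the model's prediction confidence (the `delta` and `threshold` values are hypothetical):

```python
class PageHinkley:
    """Minimal Page-Hinkley drift detector: flags a sustained drop in a
    monitored stream (e.g. per-sample prediction confidence)."""

    def __init__(self, delta=0.005, threshold=1.5):
        self.delta = delta          # tolerated magnitude of change
        self.threshold = threshold  # alarm level (hypothetical value)
        self.mean = 0.0             # running mean of the stream
        self.n = 0
        self.cum = 0.0              # cumulative deviation below the mean
        self.min_cum = 0.0

    def update(self, x):
        self.n += 1
        self.mean += (x - self.mean) / self.n
        # Accumulate how far the stream runs below its running mean.
        self.cum += self.mean - x - self.delta
        self.min_cum = min(self.min_cum, self.cum)
        return (self.cum - self.min_cum) > self.threshold  # True = drift

detector = PageHinkley()
# Stable confidences around 0.9, then drift: confidences fall to 0.5.
stream = [0.9] * 50 + [0.5] * 50
alarms = [i for i, x in enumerate(stream) if detector.update(x)]
print("first drift alarm at sample:", alarms[0] if alarms else None)
```

In deployment, an alarm of this kind would trigger relabeling or re-fine-tuning on recent data rather than stopping the line.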
doi_str_mv | 10.1007/s00138-024-01638-9 |
format | article |
identifier | ISSN: 0932-8092 |
ispartof | Machine vision and applications, 2025-01, Vol.36 (1), p.19, Article 19 |
issn | 0932-8092 1432-1769 |
language | eng |
recordid | cdi_proquest_journals_3143455575 |
source | Springer Link |
subjects | Activity recognition; Assembly lines; Corporate learning; Deep learning; Hand (anatomy); Labels; Machine learning; Moving object recognition; Regression models; Representations; Robustness; Self-supervised learning |
title | Self-supervised representation learning for robust fine-grained human hand action recognition in industrial assembly lines |