
TFRS: A task-level feature rectification and separation method for few-shot video action recognition

Few-shot video action recognition (FS-VAR) is a challenging task that requires models to have significant expressive power in order to identify previously unseen classes using only a few labeled examples. However, due to the limited number of support samples, the model's performance is highly sensitive to the distribution of the sampled data. The representativeness of the support data is insufficient to cover the entire class, and the support features may contain shared information that confuses the classifier, leading to biased classification. In response to this difficulty, we present a task-level feature rectification and separation (TFRS) method that effectively resolves the sample bias issue. Our main idea is to leverage prior information from base classes to rectify the support samples while removing the commonality of task-level features. This enhances the distinguishability and separability of features in space. Furthermore, TFRS offers a straightforward yet versatile solution that can be seamlessly integrated into various established FS-VAR frameworks. Our design yields significant performance enhancements across various existing works by implementing TFRS, resulting in competitive outcomes on datasets such as UCF101, Kinetics, SSv2, and HMDB51.

•We rectify support samples by utilizing information from similar base prototypes.
•We remove the projection onto the shared centroid.
•Our method can be applied to various established FS-VAR frameworks.
•We emphasize the importance of reducing the widely prevalent sample selection bias.


Bibliographic Details
Published in: Neural networks, 2024-08, Vol. 176, p. 106326, Article 106326
Main Authors: Qin, Yanfei; Liu, Baolin
Format: Article
Language: English
Subjects: Algorithms; Feature rectification; Feature separation; Few-shot video action recognition; Humans; Pattern Recognition, Automated - methods
DOI: 10.1016/j.neunet.2024.106326
ISSN: 0893-6080
EISSN: 1879-2782
PMID: 38688066
Publisher: Elsevier Ltd