AFRNet: adaptive feature refinement network

In the domain of computer vision, object detection is a fundamental task, aimed at accurately identifying and localizing objects of various sizes within images. While existing models such as You Only Look Once, Adaptive Training Sample Selection, and Task-aligned One-stage Object Detection have made...

Bibliographic Details
Published in:	Signal, image and video processing, 2024-11, Vol.18 (11), p.7779-7788
Main Authors:	Zhang, Jilong; Yang, Yanjiao; Liu, Jienan; Jiang, Jing; Ma, Mei
Format:	Article
Language:	English
Subjects:	Adaptive sampling; Computer Imaging; Computer Science; Computer vision; Data integration; Deformation effects; Effectiveness; Formability; Geometric transformation; Image enhancement; Image Processing and Computer Vision; Multimedia Information Systems; Object recognition; Original Paper; Pattern Recognition and Graphics; Signal,Image and Speech Processing; Target detection; Vision

cited_by
cites	cdi_FETCH-LOGICAL-c200t-a7b90fcabfa3730f8c22d1771cab3b72d9454deb375338a45a1a19283fa74c633
container_end_page	7788
container_issue	11
container_start_page	7779
container_title	Signal, image and video processing
container_volume	18
creator	Zhang, Jilong ; Yang, Yanjiao ; Liu, Jienan ; Jiang, Jing ; Ma, Mei
description	In the domain of computer vision, object detection is a fundamental task, aimed at accurately identifying and localizing objects of various sizes within images. While existing models such as You Only Look Once, Adaptive Training Sample Selection, and Task-aligned One-stage Object Detection have made breakthroughs in this field, they still exhibit deficiencies in information fusion within their neck structure. To overcome these limitations, we have designed an innovative model architecture known as Adaptive Feature Refinement Network (AFRNet). The model, on one hand, discards the conventional Feature Pyramid Network structure and designs a novel neck structure that incorporates the structures of the Scale Sequence Feature Fusion (SSFF) model and the Gather-and-Distribute (GD) mechanism. Through experimentation, it has been demonstrated that the SSFF method can further enhance the multi-scale feature fusion of the GD mechanism, thereby improving the performance of the target detection task. On the other hand, to address the constraints of existing models in simulating geometric transformations, we have designed an advanced variable convolution structure called Attentive Deformable ConvNet. This structure integrates an improved attention mechanism, which allows for more precise capture of key features in images. Extensive experiments conducted on the MS-COCO dataset have validated the effectiveness of our model. In single-model, single-scale testing, our model achieved an Average Precision (AP) of 51.8%, a result that underscores a significant enhancement in object detection performance and confirms the efficacy of our model.
doi_str_mv	10.1007/s11760-024-03427-3
format	article
fullrecord	<record><control><sourceid>proquest_cross</sourceid><recordid>TN_cdi_proquest_journals_3104475851</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><sourcerecordid>3104475851</sourcerecordid><originalsourceid>FETCH-LOGICAL-c200t-a7b90fcabfa3730f8c22d1771cab3b72d9454deb375338a45a1a19283fa74c633</originalsourceid><addsrcrecordid>eNp9kE9LAzEQxYMoWGq_gKcFjxKdyWQ3qbdS_AdFQfQcsruJtNrdmqSK397UFb05lxmG994MP8aOEc4QQJ1HRFUBByE5kBSK0x4boa6Io0Lc_52BDtkkxhXkIqF0pUfsdHb1cOfSRWFbu0nLd1d4Z9M2uCI4v-zc2nWp6Fz66MPLETvw9jW6yU8fs6ery8f5DV_cX9_OZwveCIDEraqn4Btbe0uKwOtGiBaVwryiWol2KkvZuppUSaStLC1anApN3irZVERjdjLkbkL_tnUxmVW_DV0-aQhBSlXqErNKDKom9DHmb80mLNc2fBoEs-NiBi4mczHfXMwumgZTzOLu2YW_6H9cXzUwY5I</addsrcrecordid><sourcetype>Aggregation Database</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype><pqid>3104475851</pqid></control><display><type>article</type><title>AFRNet: adaptive feature refinement network</title><source>Springer Nature</source><creator>Zhang, Jilong ; Yang, Yanjiao ; Liu, Jienan ; Jiang, Jing ; Ma, Mei</creator><creatorcontrib>Zhang, Jilong ; Yang, Yanjiao ; Liu, Jienan ; Jiang, Jing ; Ma, Mei</creatorcontrib><description>In the domain of computer vision, object detection is a fundamental task, aimed at accurately identifying and localizing objects of various sizes within images. While existing models such as You Only Look Once, Adaptive Training Sample Selection, and Task-aligned One-stage Object Detection have made breakthroughs in this field, they still exhibit deficiencies in information fusion within their neck structure. To overcome these limitations, we have designed an innovative model architecture known as Adaptive Feature Refinement Network (AFRNet). The model, on one hand, discards the conventional Feature Pyramid Network structure and designs a novel neck structure that incorporates the structures of Scale Sequence Feature Fusion (SSFF) model and the Gather-and-Distribute (GD) mechanism. 
Through experimentation, it has been demonstrated that the SSFF method can further enhance the multi-scale feature fusion of the GD mechanism, thereby improving the performance of the target detection task. On the other hand, to address the constraints of existing models in simulating geometric transformations, We have designed an advanced variable convolution structure called Attentive Deformable ConvNet. This structure integrates an improved attention mechanism, which allows for more precise capture of key features in images. Extensive experiments conducted on the MS-COCO dataset have validated the effectiveness of our model. In single-model, single-scale testing, our model achieved an Average Precision (AP) of 51.8%, a result that underscores a significant enhancement in object detection performance and confirms the efficacy of our model.</description><identifier>ISSN: 1863-1703</identifier><identifier>EISSN: 1863-1711</identifier><identifier>DOI: 10.1007/s11760-024-03427-3</identifier><language>eng</language><publisher>London: Springer London</publisher><subject>Adaptive sampling ; Computer Imaging ; Computer Science ; Computer vision ; Data integration ; Deformation effects ; Effectiveness ; Formability ; Geometric transformation ; Image enhancement ; Image Processing and Computer Vision ; Multimedia Information Systems ; Object recognition ; Original Paper ; Pattern Recognition and Graphics ; Signal,Image and Speech Processing ; Target detection ; Vision</subject><ispartof>Signal, image and video processing, 2024-11, Vol.18 (11), p.7779-7788</ispartof><rights>The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024. Springer Nature or its licensor (e.g. 
a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.</rights><lds50>peer_reviewed</lds50><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c200t-a7b90fcabfa3730f8c22d1771cab3b72d9454deb375338a45a1a19283fa74c633</cites></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><link.rule.ids>314,780,784,27924,27925</link.rule.ids></links><search><creatorcontrib>Zhang, Jilong</creatorcontrib><creatorcontrib>Yang, Yanjiao</creatorcontrib><creatorcontrib>Liu, Jienan</creatorcontrib><creatorcontrib>Jiang, Jing</creatorcontrib><creatorcontrib>Ma, Mei</creatorcontrib><title>AFRNet: adaptive feature refinement network</title><title>Signal, image and video processing</title><addtitle>SIViP</addtitle><description>In the domain of computer vision, object detection is a fundamental task, aimed at accurately identifying and localizing objects of various sizes within images. While existing models such as You Only Look Once, Adaptive Training Sample Selection, and Task-aligned One-stage Object Detection have made breakthroughs in this field, they still exhibit deficiencies in information fusion within their neck structure. To overcome these limitations, we have designed an innovative model architecture known as Adaptive Feature Refinement Network (AFRNet). The model, on one hand, discards the conventional Feature Pyramid Network structure and designs a novel neck structure that incorporates the structures of Scale Sequence Feature Fusion (SSFF) model and the Gather-and-Distribute (GD) mechanism. 
Through experimentation, it has been demonstrated that the SSFF method can further enhance the multi-scale feature fusion of the GD mechanism, thereby improving the performance of the target detection task. On the other hand, to address the constraints of existing models in simulating geometric transformations, We have designed an advanced variable convolution structure called Attentive Deformable ConvNet. This structure integrates an improved attention mechanism, which allows for more precise capture of key features in images. Extensive experiments conducted on the MS-COCO dataset have validated the effectiveness of our model. In single-model, single-scale testing, our model achieved an Average Precision (AP) of 51.8%, a result that underscores a significant enhancement in object detection performance and confirms the efficacy of our model.</description><subject>Adaptive sampling</subject><subject>Computer Imaging</subject><subject>Computer Science</subject><subject>Computer vision</subject><subject>Data integration</subject><subject>Deformation effects</subject><subject>Effectiveness</subject><subject>Formability</subject><subject>Geometric transformation</subject><subject>Image enhancement</subject><subject>Image Processing and Computer Vision</subject><subject>Multimedia Information Systems</subject><subject>Object recognition</subject><subject>Original Paper</subject><subject>Pattern Recognition and Graphics</subject><subject>Signal,Image and Speech Processing</subject><subject>Target 
detection</subject><subject>Vision</subject><issn>1863-1703</issn><issn>1863-1711</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><recordid>eNp9kE9LAzEQxYMoWGq_gKcFjxKdyWQ3qbdS_AdFQfQcsruJtNrdmqSK397UFb05lxmG994MP8aOEc4QQJ1HRFUBByE5kBSK0x4boa6Io0Lc_52BDtkkxhXkIqF0pUfsdHb1cOfSRWFbu0nLd1d4Z9M2uCI4v-zc2nWp6Fz66MPLETvw9jW6yU8fs6ery8f5DV_cX9_OZwveCIDEraqn4Btbe0uKwOtGiBaVwryiWol2KkvZuppUSaStLC1anApN3irZVERjdjLkbkL_tnUxmVW_DV0-aQhBSlXqErNKDKom9DHmb80mLNc2fBoEs-NiBi4mczHfXMwumgZTzOLu2YW_6H9cXzUwY5I</recordid><startdate>20241101</startdate><enddate>20241101</enddate><creator>Zhang, Jilong</creator><creator>Yang, Yanjiao</creator><creator>Liu, Jienan</creator><creator>Jiang, Jing</creator><creator>Ma, Mei</creator><general>Springer London</general><general>Springer Nature B.V</general><scope>AAYXX</scope><scope>CITATION</scope></search><sort><creationdate>20241101</creationdate><title>AFRNet: adaptive feature refinement network</title><author>Zhang, Jilong ; Yang, Yanjiao ; Liu, Jienan ; Jiang, Jing ; Ma, Mei</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c200t-a7b90fcabfa3730f8c22d1771cab3b72d9454deb375338a45a1a19283fa74c633</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>Adaptive sampling</topic><topic>Computer Imaging</topic><topic>Computer Science</topic><topic>Computer vision</topic><topic>Data integration</topic><topic>Deformation effects</topic><topic>Effectiveness</topic><topic>Formability</topic><topic>Geometric transformation</topic><topic>Image enhancement</topic><topic>Image Processing and Computer Vision</topic><topic>Multimedia Information Systems</topic><topic>Object recognition</topic><topic>Original Paper</topic><topic>Pattern Recognition and Graphics</topic><topic>Signal,Image and Speech Processing</topic><topic>Target 
detection</topic><topic>Vision</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Zhang, Jilong</creatorcontrib><creatorcontrib>Yang, Yanjiao</creatorcontrib><creatorcontrib>Liu, Jienan</creatorcontrib><creatorcontrib>Jiang, Jing</creatorcontrib><creatorcontrib>Ma, Mei</creatorcontrib><collection>CrossRef</collection><jtitle>Signal, image and video processing</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Zhang, Jilong</au><au>Yang, Yanjiao</au><au>Liu, Jienan</au><au>Jiang, Jing</au><au>Ma, Mei</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>AFRNet: adaptive feature refinement network</atitle><jtitle>Signal, image and video processing</jtitle><stitle>SIViP</stitle><date>2024-11-01</date><risdate>2024</risdate><volume>18</volume><issue>11</issue><spage>7779</spage><epage>7788</epage><pages>7779-7788</pages><issn>1863-1703</issn><eissn>1863-1711</eissn><abstract>In the domain of computer vision, object detection is a fundamental task, aimed at accurately identifying and localizing objects of various sizes within images. While existing models such as You Only Look Once, Adaptive Training Sample Selection, and Task-aligned One-stage Object Detection have made breakthroughs in this field, they still exhibit deficiencies in information fusion within their neck structure. To overcome these limitations, we have designed an innovative model architecture known as Adaptive Feature Refinement Network (AFRNet). The model, on one hand, discards the conventional Feature Pyramid Network structure and designs a novel neck structure that incorporates the structures of Scale Sequence Feature Fusion (SSFF) model and the Gather-and-Distribute (GD) mechanism. 
Through experimentation, it has been demonstrated that the SSFF method can further enhance the multi-scale feature fusion of the GD mechanism, thereby improving the performance of the target detection task. On the other hand, to address the constraints of existing models in simulating geometric transformations, We have designed an advanced variable convolution structure called Attentive Deformable ConvNet. This structure integrates an improved attention mechanism, which allows for more precise capture of key features in images. Extensive experiments conducted on the MS-COCO dataset have validated the effectiveness of our model. In single-model, single-scale testing, our model achieved an Average Precision (AP) of 51.8%, a result that underscores a significant enhancement in object detection performance and confirms the efficacy of our model.</abstract><cop>London</cop><pub>Springer London</pub><doi>10.1007/s11760-024-03427-3</doi><tpages>10</tpages></addata></record>
fulltext	fulltext
identifier	ISSN: 1863-1703
ispartof	Signal, image and video processing, 2024-11, Vol.18 (11), p.7779-7788
issn	1863-1703 1863-1711
language	eng
recordid	cdi_proquest_journals_3104475851
source	Springer Nature
subjects	Adaptive sampling ; Computer Imaging ; Computer Science ; Computer vision ; Data integration ; Deformation effects ; Effectiveness ; Formability ; Geometric transformation ; Image enhancement ; Image Processing and Computer Vision ; Multimedia Information Systems ; Object recognition ; Original Paper ; Pattern Recognition and Graphics ; Signal,Image and Speech Processing ; Target detection ; Vision
title	AFRNet: adaptive feature refinement network
url	http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-07T20%3A51%3A48IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-proquest_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=AFRNet:%20adaptive%20feature%20refinement%20network&rft.jtitle=Signal,%20image%20and%20video%20processing&rft.au=Zhang,%20Jilong&rft.date=2024-11-01&rft.volume=18&rft.issue=11&rft.spage=7779&rft.epage=7788&rft.pages=7779-7788&rft.issn=1863-1703&rft.eissn=1863-1711&rft_id=info:doi/10.1007/s11760-024-03427-3&rft_dat=%3Cproquest_cross%3E3104475851%3C/proquest_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c200t-a7b90fcabfa3730f8c22d1771cab3b72d9454deb375338a45a1a19283fa74c633%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_pqid=3104475851&rft_id=info:pmid/&rfr_iscdi=true