
RoI Fusion Strategy with Self-Attention Mechanism for Object Detection in Remote Sensing Images

Bibliographic Details
Published in: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023-01, Vol. 16, p. 1-17
Main Authors: Zhang, Yuxi; Wang, Yongcheng; Zhang, Ning; Li, Zheng; Zhao, Zhikang; Gao, Yunxiao; Chen, Chi; Feng, Hao
Format: Article
Language: English
Description
Summary: In remote sensing image (RSI) object detection, oriented box annotation accurately locates objects of arbitrary orientation and captures their orientation information. However, detection based on oriented bounding boxes (OBB) remains a challenging task, and detection performance needs to be improved. In RSI, objects are distributed extremely unevenly and often cluster together. Some researchers believe that dense object distribution is one reason object detection is difficult, but the impact of dense distribution on detection performance has not been studied in depth. To address this problem, this paper proposes a dense object determination method based on oriented box annotation, which identifies the dense objects in a dataset using two conditions built from inter-class distance, intra-class distance, minimum distance between objects, and minimum edge length of objects. Analysis of the experimental results for dense and non-dense object detection concludes that dense distribution in RSI does not readily make objects more difficult to detect. To make full use of object features, this paper also proposes a second-stage detection head named region of interest fusion network (RoIF-Net). It extracts a region of interest (RoI) from the input image and fuses it with the RoI extracted from the feature maps to add detail features, and it constructs a feature induction module based on a self-attention mechanism to perform position regression and category classification. This structure can be used in any two-stage detection network to enhance detection capability. On three challenging datasets, DOTA, DIOR-R, and UCAS-AOD, the method achieves 81.80%, 68.49%, and 90.25% mAP, respectively, reaching state-of-the-art (SOTA) results for OBB-based detection on these datasets and demonstrating the effectiveness of the method.
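The fusion-plus-attention idea in the summary can be illustrated with a minimal, dependency-free sketch. This is not the authors' RoIF-Net: the function names (`fuse_rois`, `self_attention`), the choice to fuse by concatenating tokens from the two RoI sources, and the use of identity query/key/value projections are all assumptions made for illustration. It only shows how plain scaled dot-product self-attention lets every token from an image-level RoI and a feature-map RoI attend to every other token.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    A trained module would learn separate projection matrices; they are
    omitted here (an assumption) to keep the sketch short.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    output = []
    for query in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        # Each output token is a weighted average of all input tokens.
        mixed = [sum(w * value[i] for w, value in zip(weights, tokens))
                 for i in range(dim)]
        output.append(mixed)
    return output

def fuse_rois(image_roi_tokens, feature_roi_tokens):
    """Hypothetical fusion: treat both RoI sources as one token sequence."""
    return list(image_roi_tokens) + list(feature_roi_tokens)

# Toy example: 2 tokens from the image-level RoI, 2 from the feature-map RoI.
image_roi = [[1.0, 0.0], [0.0, 1.0]]
feature_roi = [[0.5, 0.5], [1.0, 1.0]]
fused = fuse_rois(image_roi, feature_roi)
attended = self_attention(fused)
```

In this toy setup `attended` has one output vector per fused token, and each output is a convex combination of the inputs, so detail tokens from the image RoI can influence the representation of tokens from the feature-map RoI and vice versa. A detection head would then feed such attended features to regression and classification layers.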
ISSN: 1939-1404, 2151-1535
DOI: 10.1109/JSTARS.2023.3289585