BANet: Small and multi-object detection with a bidirectional attention network for traffic scenes
Published in: | Engineering applications of artificial intelligence 2023-01, Vol.117, p.105504, Article 105504 |
---|---|
Format: | Article |
Language: | English |
Summary: | Improving the accuracy and speed of small and multi-object detection is an active research problem in traffic environments. Despite substantial advances in object detection algorithms based on deep neural networks, the inaccuracy and low efficiency of small and multi-object detection remain challenging to address. In this paper, we propose a bidirectional attention network called BANet, which includes multichannel attention (MCA) blocks, an alpha-effective intersection-over-union (α-EIoU) loss, and a multiple attention fusion (MAF) module. Each MCA block combines low-layer, medium-layer, and high-layer features to provide rich base information for feature fusion in the neck module. We introduce MAF to alleviate the loss of spatial location information and the poor semantic performance that result from the continuous downsampling of the path aggregation feature pyramid network (PAFPNet). Finally, α-EIoU serves as our regression loss, which measures the difference between the predicted box and the ground truth (gt) box. Our study further demonstrates that these strategies yield significant performance improvements over some existing YOLO detectors. Compared with YOLOX, BANet achieves a 0.39%–0.52% mAP@0.5 improvement on the PASCAL VOC 2007 (VOC 07) dataset and a 0.55%–2.93% mAP@0.5 improvement on the PASCAL VOC 2012 (VOC 12) dataset. Additionally, a 0.3%–1.01% improvement in mAP@0.5 is achieved on the MS COCO 2017 (COCO 17) dataset, indicating that BANet has a significant effect on multi-object detection. Experiments conducted with approximately the same number of parameters as YOLOX show that our strategy not only improves speed by 7.5 frames per second (FPS) but also reduces the average forward time by 0.97 ms. |
---|---|
ISSN: | 0952-1976 1873-6769 |
DOI: | 10.1016/j.engappai.2022.105504 |
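
The summary describes α-EIoU only as a regression loss that measures the difference between the predicted box and the ground truth box; this record does not give the formula. The sketch below is therefore an assumption, not the authors' implementation: it takes the commonly cited EIoU terms (IoU, normalized center distance, normalized width and height differences) and raises each to a power α, in the style of the Alpha-IoU family of losses. The function name `alpha_eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the default `alpha=3.0` are all illustrative choices.

```python
def alpha_eiou_loss(pred, gt, alpha=3.0, eps=1e-9):
    """Illustrative alpha-EIoU loss for two boxes given as (x1, y1, x2, y2).

    Assumed form (not taken from the BANet paper):
        1 - IoU^a + (rho^2 / c^2)^a + (dw^2 / Cw^2)^a + (dh^2 / Ch^2)^a
    where rho is the distance between box centers, c is the diagonal of the
    smallest enclosing box, and Cw, Ch are that box's width and height.
    """
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground truth boxes.
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection area (zero when the boxes do not overlap).
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih

    # Intersection over union.
    union = pw * ph + gw * gh - inter + eps
    iou = inter / union

    # Smallest box enclosing both inputs.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Squared distance between the two box centers.
    rho2 = ((px1 + px2 - gx1 - gx2) ** 2 + (py1 + py2 - gy1 - gy2) ** 2) / 4.0

    # Normalized penalty terms for center offset, width, and height.
    center_term = rho2 / c2
    width_term = (pw - gw) ** 2 / (cw ** 2 + eps)
    height_term = (ph - gh) ** 2 / (ch ** 2 + eps)

    return (1.0 - iou ** alpha
            + center_term ** alpha
            + width_term ** alpha
            + height_term ** alpha)
```

With `alpha=1.0` this reduces to the plain EIoU form. Under this assumed formulation, identical boxes give a loss near zero and non-overlapping boxes give a loss above one, because the IoU term vanishes while the center-distance penalty remains.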