Loading…
Improving real-time apple fruit detection: Multi-modal data and depth fusion with non-targeted background removal
In automated fruit detection, RGB-Depth (RGB-D) images aid the detection model with additional depth information to enhance detection accuracy. However, outdoor depth images are usually of low quality, which limits the quality of depth data. In this study, an approach/technique for real-time apple f...
Saved in:
Published in: | Ecological informatics 2024-09, Vol.82, p.102691, Article 102691 |
---|---|
Main Authors: | , , , , , , |
Format: | Article |
Language: | English |
Subjects: | |
Citations: | Items that this one cites |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | In automated fruit detection, RGB-Depth (RGB-D) images aid the detection model with additional depth information to enhance detection accuracy. However, outdoor depth images are usually of low quality, which limits the quality of depth data. In this study, an approach/technique for real-time apple fruit detection in a high-density orchard environment by using multi-modal data is presented. Non-targeted background removal using the depth fusion (NBR-DF) method was developed to reduce the high noise condition of depth images. The noise occurred due to the uncontrolled lighting condition and holes with incomplete depth information in the depth images. NBR-DF technique follows three primary steps: pre-processing of depth images (point cloud generation), target object extraction, and background removal. The NBR-DF method serves as a pipeline to pre-process multi-modal data to enhance features of depth images by filling holes to eliminate noise generated by depth holes. Further, the NBR-DF implemented with the YOLOv5 enhances the detection accuracy in dense orchard conditions by using multi-modal information as input. An attention-based depth fusion module that adaptively fuses the multi-modal features was developed. The integration of the depth-attention matrix involved pooling operations and sigmoid normalization, both of which are efficient methods for summarizing and normalizing depth information. The fusion module improves the identification of multiscale objects and strengthens the network's resistance to noise. The network then detects the fruit position using multiscale information from the RGB-D images in highly complex orchard environments. The detection results were compared and validated with other methods using different input modals and fusion strategies. The results showed that the detection accuracy using the NBR-DF approach achieved an average precision rate of 0.964 in real time. The performance comparison with other state-of-the-art methods and the model generalization study also establish that the present advanced depth-fusion attention mechanism and effective preprocessing steps in NBR-DF-YOLOv5 significantly surpass those in performance. In conclusion, the developed NBR-DF technique showed the potential to improve real-time apple fruit detection using multi-modal information.
•The non-targeted background removal using depth fusion (NBR-DF) is developed to enhance apple fruit detection accuracy.•The NBR-DF used as pipeline with YOLOv5 detect |
---|---|
ISSN: | 1574-9541 |
DOI: | 10.1016/j.ecoinf.2024.102691 |