Self-Selection Salient Region-Based Scene Recognition Using Slight-Weight Convolutional Neural Network

Bibliographic Details
Published in:	Journal of intelligent & robotic systems 2021-07, Vol.102 (3), Article 58
Main Authors:	Li, Zhenyu; Zhou, Aiguo
Format:	Article
Language:	English
Subjects:	Artificial Intelligence; Artificial neural networks; Computer architecture; Control; Convolution; Electrical Engineering; Engineering; Feature extraction; Mechanical Engineering; Mechatronics; Neural networks; Object recognition; Rankings; Regular Paper; Representations; Robotics; Robots; Robustness; Similarity; Statistical methods; Weight

Description
Summary:	Visual scene recognition is an indispensable part of automatic localization and navigation. Within the same scene, appearance and viewpoint may change greatly, which is the largest challenge for advanced unmanned systems (e.g., robots, vehicles, and UAVs) in identifying scenes they have already visited. Traditional methods have long been bound to hand-crafted feature paradigms that rely mainly on the designer's prior knowledge and are not sufficiently robust to extreme scene changes. In this paper, we address scene recognition by automatically learning feature representations from large image samples. Firstly, we propose a novel approach for scene recognition that trains a slight-weight convolutional neural network (CNN) with a less complex, more efficient overall architecture that is trainable end-to-end. The proposed approach combines self-selected deep-learning features with a light CNN process to achieve high-level semantic understanding of visual scenes. Secondly, we propose a salient region-based technique that, based on a self-selection mechanism, extracts the local feature representation of a specific scene region directly from the convolution layer; each layer performs a linear operation in an end-to-end manner. Furthermore, we use probability statistics to calculate the total similarity of several regions in one scene to other regions, and finally rank the similarity scores to select the correct scene. We conducted extensive experiments comparing four methods (our proposed method and three other well-known, advanced methods). Experimental results show that the proposed method is more robust and accurate than the three other well-known methods in extremely harsh environments (e.g., weak light and strong blur).
ISSN:	0921-0296 1573-0409
DOI:	10.1007/s10846-021-01421-2