Loop Closure Detection Based on Multi-Scale Deep Feature Fusion

2019 ◽  
Vol 9 (6) ◽  
pp. 1120 ◽  
Author(s):  
Baifan Chen ◽  
Dian Yuan ◽  
Chunfa Liu ◽  
Qian Wu

Loop closure detection plays a very important role in the mobile robot navigation field. It is useful in achieving accurate navigation in complex environments and reducing the cumulative error of the robot’s pose estimation. The current mainstream methods are based on the visual bag-of-words model, but traditional image features are sensitive to illumination changes. This paper proposes a loop closure detection algorithm based on multi-scale deep feature fusion, which uses a Convolutional Neural Network (CNN) to extract more advanced and more abstract features. To handle input images of different sizes and enrich the receptive fields of the feature extractor, this paper uses multi-scale spatial pyramid pooling (SPP) to fuse the features. In addition, considering the different contributions of each feature to loop closure detection, the paper defines a distinguishability weight for each feature and uses it in the similarity measurement, which reduces the probability of false positives. The experimental results show that the loop closure detection algorithm based on multi-scale deep feature fusion achieves higher precision and recall rates and is more robust to illumination changes than the mainstream methods.
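The two ingredients of this abstract, multi-scale SPP and distinguishability-weighted similarity, can be sketched in a few lines of NumPy. This is a minimal illustration of the general technique, not the authors' implementation; the pooling levels and the weighted cosine similarity are assumed, illustrative choices.

```python
import numpy as np

def spatial_pyramid_pool(fmap, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map over grids of several scales and
    concatenate the results into one fixed-length vector, so inputs of
    any spatial size produce a descriptor of the same length."""
    c, h, w = fmap.shape
    pooled = []
    for n in levels:
        # split H and W into n roughly equal bins
        hs = np.linspace(0, h, n + 1).astype(int)
        ws = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = fmap[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                pooled.append(cell.max(axis=(1, 2)))
    return np.concatenate(pooled)  # length = C * sum(n * n for n in levels)

def weighted_similarity(a, b, weights):
    """Cosine similarity where each feature dimension is first scaled by
    a distinguishability weight, so discriminative dimensions dominate."""
    wa, wb = a * weights, b * weights
    return float(wa @ wb / (np.linalg.norm(wa) * np.linalg.norm(wb) + 1e-12))
```

A loop candidate would then be accepted when the weighted similarity between the query descriptor and a stored descriptor exceeds a threshold.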

Author(s):  
Yan Deli ◽  
Tuo Wenkun ◽  
Wang Weiming ◽  
Li Shaohua

Background: Loop closure detection is a crucial part of robot navigation and simultaneous localization and mapping (SLAM). Appearance-based loop closure detection still faces many challenges, such as illumination changes, perceptual aliasing and increasing computational complexity. Methods: In this paper, we propose a visual loop closure detection algorithm that combines the illumination-robust descriptor DIRD with odometry information. In this algorithm, a new distance function is built by fusing the Euclidean and Mahalanobis distance functions; it integrates the pose uncertainty of the robot body and can dynamically adjust the threshold for potential loop closure locations. Potential locations are then verified by calculating the similarity of their DIRD descriptors. Results: The proposed algorithm is evaluated on the KITTI and EuRoC datasets and compared with SeqSLAM, one of the state-of-the-art loop closure detection algorithms. The results show that the proposed algorithm effectively reduces computing time and achieves better precision-recall performance. Conclusion: The new loop closure detection method makes full use of odometry and image appearance information. The new distance function effectively reduces missed detections caused by odometry error accumulation. The algorithm requires neither image feature extraction nor a learning stage, and can run in real time on platforms with limited computational power.
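The fused distance idea can be illustrated with a small sketch. The 2D pose, the covariance matrix, and the blend weight `alpha` are illustrative assumptions, not values from the paper:

```python
import numpy as np

def fused_distance(p, q, cov, alpha=0.5):
    """Blend a plain Euclidean distance with a Mahalanobis distance that
    accounts for a (hypothetical) pose covariance `cov`.  As odometry
    drift grows, `cov` grows, the Mahalanobis term shrinks, and the
    effective acceptance region for loop candidates widens."""
    d = np.asarray(p, dtype=float) - np.asarray(q, dtype=float)
    euclid = np.linalg.norm(d)
    maha = float(np.sqrt(d @ np.linalg.inv(cov) @ d))
    return alpha * euclid + (1 - alpha) * maha
```

With an identity covariance the two terms coincide; with an inflated covariance the fused distance drops below a fixed threshold earlier, which is the dynamic-threshold effect the abstract describes.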


2021 ◽  
Vol 13 (2) ◽  
pp. 328
Author(s):  
Wenkai Liang ◽  
Yan Wu ◽  
Ming Li ◽  
Yice Cao ◽  
Xin Hu

The classification of high-resolution (HR) synthetic aperture radar (SAR) images is of great importance for SAR scene interpretation and application. However, the presence of intricate spatial structural patterns and a complex statistical nature makes SAR image classification a challenging task, especially when labeled SAR data are limited. This paper proposes a novel HR SAR image classification method using a multi-scale deep feature fusion network and covariance pooling manifold network (MFFN-CPMN). MFFN-CPMN combines the advantages of local spatial features and global statistical properties and considers multi-feature information fusion of SAR images in representation learning. First, we propose a Gabor-filtering-based multi-scale feature fusion network (MFFN) to capture the spatial patterns and obtain discriminative features of SAR images. The MFFN is a deep convolutional neural network (CNN). To make full use of the large amount of unlabeled data, the weights of each layer of the MFFN are optimized by an unsupervised denoising dual-sparse encoder. Moreover, the feature fusion strategy in the MFFN effectively exploits the complementary information between different levels and different scales. Second, we utilize a covariance pooling manifold network to further extract the global second-order statistics of SAR images from the fused feature maps. The obtained covariance descriptor is more discriminative across various land covers. Experimental results on four HR SAR images demonstrate the effectiveness of the proposed method, which achieves promising results compared with other related algorithms.
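The covariance pooling step can be sketched as follows. This is a generic second-order pooling sketch, not the paper's network: the ridge regularization and the log-Euclidean mapping are common, assumed choices for making covariance descriptors usable by ordinary classifiers.

```python
import numpy as np

def covariance_descriptor(fmaps, eps=1e-5):
    """Pool a C x H x W stack of feature maps into a C x C covariance
    matrix, a global second-order statistic of the channel responses.
    A small ridge keeps the matrix positive definite; a matrix
    logarithm then maps it off the manifold of covariance matrices
    into a Euclidean space."""
    c = fmaps.shape[0]
    x = fmaps.reshape(c, -1)              # C x (H*W) observations
    cov = np.cov(x) + eps * np.eye(c)     # regularized covariance
    # log-Euclidean mapping via eigendecomposition: V diag(log l) V^T
    vals, vecs = np.linalg.eigh(cov)
    return (vecs * np.log(vals)) @ vecs.T
```

The resulting C x C symmetric matrix (or its upper triangle, flattened) serves as the image descriptor fed to a classifier.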


2021 ◽  
Vol 2078 (1) ◽  
pp. 012008
Author(s):  
Hui Liu ◽  
Keyang Cheng

Abstract Aiming at the problem of false and missed detections of small and occluded targets in pedestrian detection, a pedestrian detection algorithm based on improved multi-scale feature fusion is proposed. First, since the YOLOv4 multi-scale feature fusion module PANet does not consider the interaction between scales, PANet is improved to reduce the semantic gap between scales, and an attention mechanism is introduced to learn the importance of different layers and strengthen feature fusion. Then, dilated convolution is introduced to reduce the information loss caused by downsampling. Finally, the K-means clustering algorithm is used to redesign the anchor boxes, and the loss function is modified for single-category detection. The experimental results show that, under different congestion conditions on the INRIA and WiderPerson datasets, the improved algorithm reaches an AP of 96.83% and 59.67%, respectively, improvements of 2.41% and 1.03% over the YOLOv4 model. False and missed detections of small and occluded targets are significantly reduced.
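The anchor-redesign step follows the standard YOLO recipe: cluster the (width, height) pairs of the training boxes with 1 - IoU as the distance. A minimal sketch, assuming co-centered boxes when computing IoU (the usual convention for this step); details such as the number of iterations are illustrative:

```python
import numpy as np

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (w, h) box sizes with 1 - IoU as the distance, the usual
    way YOLO-style anchor boxes are redesigned for a dataset."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)].astype(float)
    for _ in range(iters):
        # IoU between every box and every anchor, assuming shared centers
        inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
                 np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
        union = (boxes[:, None, 0] * boxes[:, None, 1] +
                 anchors[None, :, 0] * anchors[None, :, 1] - inter)
        assign = np.argmax(inter / union, axis=1)   # highest IoU wins
        new = np.array([boxes[assign == i].mean(axis=0) if np.any(assign == i)
                        else anchors[i] for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors[np.argsort(anchors.prod(axis=1))]  # sort by area
```

For a single-class pedestrian detector, the clusters tend to be tall and narrow, which is exactly why re-clustering beats the generic COCO anchors.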


2018 ◽  
Vol 55 (11) ◽  
pp. 111507
Author(s):  
Bao Zhenqiang ◽  
Li Aihua ◽  
Cui Zhigao ◽  
Su Yanzhao ◽  
Zheng Yong

Entropy ◽  
2019 ◽  
Vol 21 (6) ◽  
pp. 622 ◽  
Author(s):  
Xiaoyang Liu ◽  
Wei Jing ◽  
Mingxuan Zhou ◽  
Yuxing Li

Automatic coal-rock recognition is one of the critical technologies for intelligent coal mining and processing. Most existing coal-rock recognition methods have defects such as unsatisfactory performance and low robustness. To solve these problems, and taking the distinctive visual features of coal and rock into consideration, the multi-scale feature fusion coal-rock recognition (MFFCRR) model based on a multi-scale Completed Local Binary Pattern (CLBP) and a Convolutional Neural Network (CNN) is proposed in this paper. Firstly, multi-scale CLBP features are extracted from coal-rock image samples in the Texture Feature Extraction (TFE) sub-model, representing the texture information of the coal-rock image. Secondly, high-level deep features are extracted from coal-rock image samples in the Deep Feature Extraction (DFE) sub-model, representing the macroscopic information of the coal-rock image. The texture and macroscopic information are acquired based on information theory. Thirdly, the multi-scale feature vector is generated by fusing the multi-scale CLBP feature vector and the deep feature vector. Finally, multi-scale feature vectors are input to a nearest neighbor classifier with the chi-square distance to realize coal-rock recognition. Experimental results show that the coal-rock image recognition accuracy of the proposed MFFCRR model reaches 97.9167%, an increase of 2–3% over state-of-the-art coal-rock recognition methods.
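The final classification stage, nearest neighbor with the chi-square distance, is simple enough to show directly. A minimal sketch with assumed histogram-style feature vectors; the gallery and labels are illustrative placeholders:

```python
import numpy as np

def chi_square_distance(h1, h2, eps=1e-10):
    """Chi-square distance between two non-negative feature histograms,
    a common pairing with LBP-style texture descriptors."""
    return float(0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))

def nearest_neighbor_label(query, gallery, labels):
    """Return the label of the gallery vector closest to `query`
    under the chi-square distance."""
    dists = [chi_square_distance(query, g) for g in gallery]
    return labels[int(np.argmin(dists))]
```

In the MFFCRR setting, `query` and the gallery entries would be the fused CLBP-plus-deep feature vectors of test and training samples, respectively.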


Author(s):  
Xiaotian Chen ◽  
Xuejin Chen ◽  
Zheng-Jun Zha

Monocular depth estimation is an essential task for scene understanding. The underlying structure of objects and stuff in a complex scene is critical to recovering accurate and visually pleasing depth maps. Global structure conveys scene layout, while local structure reflects shape details. Recently developed approaches based on convolutional neural networks (CNNs) significantly improve the performance of depth estimation. However, few of them take into account multi-scale structures in complex scenes. In this paper, we propose a Structure-Aware Residual Pyramid Network (SARPN) to exploit multi-scale structures for accurate depth prediction. We propose a Residual Pyramid Decoder (RPD) which expresses global scene structure in upper levels to represent layout, and local structure in lower levels to represent shape details. At each level, we propose Residual Refinement Modules (RRM) that predict residual maps to progressively add finer structures onto the coarser structure predicted at the upper level. In order to fully exploit multi-scale image features, an Adaptive Dense Feature Fusion (ADFF) module, which adaptively fuses effective features from all scales for inferring structures of each scale, is introduced. Experimental results on the challenging NYU-Depth v2 dataset demonstrate that our proposed approach achieves state-of-the-art performance in both qualitative and quantitative evaluation. The code is available at https://github.com/Xt-Chen/SARPN.
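The coarse-to-fine accumulation rule behind the residual pyramid can be sketched in isolation. In the paper the residual map is predicted by a learned RRM; here it is simply an input, and the nearest-neighbour upsampling is an assumed, illustrative choice:

```python
import numpy as np

def upsample2x(depth):
    """Nearest-neighbour 2x upsampling of a coarse depth map."""
    return depth.repeat(2, axis=0).repeat(2, axis=1)

def refine(coarse_depth, residual):
    """One residual-refinement step: the finer level predicts only a
    residual map, added onto the upsampled coarser prediction, so each
    pyramid level adds shape detail without re-estimating the global
    layout already captured above it."""
    return upsample2x(coarse_depth) + residual
```

Iterating `refine` from the top of the pyramid down reproduces the decoder's coarse-to-fine flow: layout first, details layered on top.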


2021 ◽  
Vol 13 (17) ◽  
pp. 3520
Author(s):  
Zhian Yuan ◽  
Ke Xu ◽  
Xiaoyu Zhou ◽  
Bin Deng ◽  
Yanxin Ma

Loop closure detection is an important component of visual simultaneous localization and mapping (SLAM). However, most existing loop closure detection methods are vulnerable to complex environments and use limited information from images. As higher-level image information and multi-information fusion can improve the robustness of place recognition, a semantic–visual–geometric information-based loop closure detection algorithm (SVG-Loop) is proposed in this paper. In detail, to reduce the interference of dynamic features, a semantic bag-of-words model is first constructed by connecting visual features with semantic labels. Second, to improve detection robustness in different scenes, a semantic landmark vector model is designed by encoding the geometric relationships of the semantic graph. Finally, semantic, visual, and geometric information is integrated by fusing the results of the two modules. Experiments on the TUM RGB-D dataset, the KITTI odometry dataset, and a practical environment show that, compared with state-of-the-art methods, SVG-Loop has advantages in complex environments with varying light, changeable weather, and dynamic interference.
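The core of the semantic bag-of-words idea, dropping features that land on dynamic objects before voting, can be sketched as follows. The dynamic class names are illustrative placeholders, not the paper's label set:

```python
import numpy as np

def semantic_bow(word_ids, labels, vocab_size,
                 dynamic=frozenset(("person", "car"))):
    """Build a bag-of-words histogram over visual words, skipping any
    feature whose semantic label belongs to a dynamic class, so moving
    objects do not contribute to place recognition."""
    hist = np.zeros(vocab_size)
    for word, label in zip(word_ids, labels):
        if label not in dynamic:
            hist[word] += 1
    total = hist.sum()
    return hist / total if total else hist
```

Two images would then be compared by a histogram similarity on these filtered vectors, with the semantic landmark module providing the geometric verification.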


2019 ◽  
Vol 2019 ◽  
pp. 1-11 ◽  
Author(s):  
Junping Hu ◽  
Shitu Abubakar ◽  
Shengjun Liu ◽  
Xiaobiao Dai ◽  
Gen Yang ◽  
...  

Pedestrians, motorists, and cyclists remain victims of human drivers' poor vision and negligence, especially at night. Millions of people die or sustain physical injury yearly as a result of traffic accidents. Detection and recognition of road markings play a vital role in many applications such as traffic surveillance and autonomous driving. In this study, we have trained a nighttime road-marking detection model using NIR camera images. We modified the VGG-16 base network of the state-of-the-art Faster R-CNN algorithm using a multilayer feature fusion technique. We also demonstrate another promising feature fusion technique: concatenating all the convolutional layers within a stage to extract image features. The modification boosts the overall detection performance of the model by utilizing the advantages of both the shallow and the deep layers of the VGG-16 network. The training samples were augmented using random rotation and translation to enhance the heterogeneity of the detection algorithm. We achieved a mean average precision (mAP) of 89.48% and 92.83% for the baseline Faster R-CNN and our modified method, respectively.
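The within-stage concatenation fusion can be illustrated with a shape-level sketch: bring every convolutional map in a stage to a common spatial size, then stack them along the channel axis so shallow detail and deep semantics travel together. The nearest-neighbour resize is an assumed simplification of whatever resampling the network would actually use:

```python
import numpy as np

def concat_fuse(stage_maps, target_hw):
    """Resize (nearest-neighbour) every C x H x W feature map in a stage
    to a common spatial size and concatenate along channels, so later
    layers see shallow and deep features side by side."""
    th, tw = target_hw
    fused = []
    for m in stage_maps:
        c, h, w = m.shape
        rows = np.arange(th) * h // th   # nearest source row per target row
        cols = np.arange(tw) * w // tw   # nearest source column per target column
        fused.append(m[:, rows][:, :, cols])
    return np.concatenate(fused, axis=0)  # channel counts add up
```

In a VGG-16 stage, the maps share a spatial size already, so the resize is a no-op there and the operation reduces to pure channel concatenation.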

