saliency map
Recently Published Documents

TOTAL DOCUMENTS: 619 (last five years: 200)
H-INDEX: 29 (last five years: 6)

2022 · Vol 2022 · pp. 1-14
Author(s): Liming Li, Shubin Zheng, Chenxi Wang, Shuguang Zhao, Xiaodong Chai, ...

This work presents a new method for sleeper crack identification based on a cascaded convolutional neural network (CNN) to address the low efficiency and poor accuracy of traditional sleeper crack detection methods. The proposed algorithm comprises an improved You Only Look Once version 3 (YOLOv3) and a crack recognition network, where the latter includes two modules: the crack encoder-decoder network (CEDNet) and the crack residual refinement network (CRRNet). After the sleeper on the ballast bed is extracted with the gray projection method, the improved YOLOv3 network identifies, locates, and segments cracks on the sleeper. The sleeper image is fed into CEDNet for crack feature extraction to predict a coarse crack saliency map, which is then passed to CRRNet to refine its edge information and local regions. The accuracy of the crack identification model is improved by a mixed loss function combining binary cross-entropy (BCE), the structural similarity index measure (SSIM), and intersection over union (IoU). Results show that the method accurately detects cracks in sleeper images. In object detection, the proposed method is compared with YOLOv3 for directly locating sleeper cracks, achieving an accuracy of 96.3%, a recall of 91.2%, a mean average precision (mAP) of 91.5%, and 76.6 frames per second (FPS). In the crack extraction part, the weighted F-measure is 0.831, the mean absolute error (MAE) is 0.0157, and the area under the curve (AUC) is 0.9453. The proposed method offers better recognition, higher efficiency, and stronger robustness than the other network models.
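The mixed loss is the part of this pipeline most easily made concrete. Below is a minimal sketch assuming a PyTorch model whose output is a sigmoid probability map; the 11-pixel SSIM window and the equal weighting of the three terms are illustrative assumptions, not values taken from the paper.

```python
# Sketch of a hybrid BCE + SSIM + IoU loss for saliency-style crack masks.
import torch
import torch.nn.functional as F

def ssim_loss(pred, target, window=11, c1=0.01**2, c2=0.03**2):
    # Local mean/variance statistics via average pooling over a square window.
    pad = window // 2
    mu_p = F.avg_pool2d(pred, window, 1, pad)
    mu_t = F.avg_pool2d(target, window, 1, pad)
    var_p = F.avg_pool2d(pred * pred, window, 1, pad) - mu_p ** 2
    var_t = F.avg_pool2d(target * target, window, 1, pad) - mu_t ** 2
    cov = F.avg_pool2d(pred * target, window, 1, pad) - mu_p * mu_t
    ssim = ((2 * mu_p * mu_t + c1) * (2 * cov + c2)) / (
        (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2))
    return 1 - ssim.mean()

def iou_loss(pred, target, eps=1e-6):
    # Soft IoU over each (B, 1, H, W) mask pair.
    inter = (pred * target).sum(dim=(2, 3))
    union = (pred + target - pred * target).sum(dim=(2, 3))
    return 1 - ((inter + eps) / (union + eps)).mean()

def hybrid_loss(pred, target):
    # pred: sigmoid probabilities in [0, 1]; target: binary crack mask.
    # Equal weights are an assumption for illustration.
    bce = F.binary_cross_entropy(pred, target)
    return bce + ssim_loss(pred, target) + iou_loss(pred, target)
```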


2022 · Vol 14 (2) · pp. 283
Author(s): Biao Qi, Longxu Jin, Guoning Li, Yu Zhang, Qiang Li, ...

This study combines the co-occurrence analysis shearlet transform (CAST) with latent low-rank representation (LatLRR) and a regularization term counting zero crossings in differences to fuse heterogeneous images. First, the source images are decomposed by the CAST method into base-layer and detail-layer sub-images. Second, for the base-layer components, which carry larger-scale intensity variation, LatLRR is used to extract salient information from the source images and generate saliency maps that adaptively weight the fusion of the base layers. Meanwhile, a classic optimization-based regularization term on zero crossings in differences is designed to drive the fusion of the detail-layer images. In this way, the gradient information hidden in the source images is preserved as much as possible, so the fused image retains richer edge information. Quantitative and qualitative comparisons with state-of-the-art algorithms on publicly available datasets demonstrate that the proposed method performs better at enhancing contrast and producing faithful fusion results.
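As a rough illustration of the saliency-weighted base-layer step, the sketch below assumes the LatLRR saliency maps have already been computed for both sources; the per-pixel normalization rule is an assumption for illustration, not the paper's exact weighting.

```python
# Adaptive per-pixel weighted fusion of two base layers using saliency maps.
import numpy as np

def weighted_base_fusion(base_a, base_b, sal_a, sal_b, eps=1e-8):
    # Convert the two saliency maps into weights that sum to one per pixel,
    # then blend the base layers accordingly.
    w_a = sal_a / (sal_a + sal_b + eps)
    w_b = 1.0 - w_a
    return w_a * base_a + w_b * base_b
```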


2022 · Vol 16 (5)
Author(s): Shiguang Liu, Ziqi Liu

2022 · Vol 14 (1) · pp. 180
Author(s): Fang Zhou, Fengjie He, Changchun Gui, Zhangyu Dong, Mengdao Xing

A target detection method based on an improved single shot multibox detector (SSD) is proposed to address the shortage of training samples in synthetic aperture radar (SAR) target detection. We propose two strategies to improve the SSD: model structure optimization and small-sample augmentation. For model structure optimization, deep features of the target are first extracted with residual networks instead of VGGNet; the aspect ratios of the default boxes are then redesigned to match the sizes of the different targets. For small-sample augmentation, besides routine image processing operations such as rotation, translation, and mirroring, sufficient training samples are generated based on saliency map theory from machine vision. Lastly, a simulated SAR image dataset called Geometric Objects (GO) is constructed, containing dihedral angles, surface plates, and cylinders. Experimental results on the GO simulated-image dataset and the MSTAR real-image dataset demonstrate that the proposed method outperforms other detection methods in SAR target detection.
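The abstract does not spell out the saliency-based augmentation; one plausible reading is to sample extra training chips around the saliency peak. The sketch below is such a guess — the crop size, jitter range, and peak-centered sampling are all assumptions, not the authors' procedure.

```python
# Saliency-guided crop augmentation: extra chips sampled near the saliency peak.
import numpy as np

def saliency_crop(image, saliency, out_size=128, rng=None):
    # Assumes the image is at least out_size pixels in each dimension and
    # the saliency map has the same height/width as the image.
    rng = rng or np.random.default_rng()
    cy, cx = np.unravel_index(np.argmax(saliency), saliency.shape)
    jitter = rng.integers(-out_size // 4, out_size // 4 + 1, size=2)
    cy = int(np.clip(cy + jitter[0], out_size // 2, image.shape[0] - out_size // 2))
    cx = int(np.clip(cx + jitter[1], out_size // 2, image.shape[1] - out_size // 2))
    half = out_size // 2
    return image[cy - half:cy + half, cx - half:cx + half]
```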


Sensors · 2021 · Vol 22 (1) · pp. 40
Author(s): Chaowei Duan, Changda Xing, Yiliu Liu, Zhisheng Wang

As a powerful technique for merging the complementary information of source images, infrared (IR) and visible image fusion approaches are widely used in surveillance, target detection, tracking, and biological recognition. In this paper, an efficient IR and visible image fusion method is proposed to simultaneously enhance the significant targets/regions in all source images and preserve the rich background details of visible images. A multi-scale representation based on the fast global smoother is first used to decompose the source images into base and detail layers, aiming to extract the salient structure information and suppress halos around edges. Then, a target-enhanced parallel Gaussian fuzzy logic-based fusion rule is proposed to merge the base layers, which avoids brightness loss and highlights significant targets/regions. In addition, a visual saliency map-based fusion rule is designed to merge the detail layers and obtain rich details. Finally, the fused image is reconstructed. Extensive experiments are conducted on 21 image pairs and a Nato-camp sequence (32 image pairs) to verify the effectiveness and superiority of the proposed method. Compared with several state-of-the-art methods, experimental results demonstrate that the proposed method achieves competitive or superior performance in both visual results and objective evaluation.
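The overall base/detail pipeline can be sketched compactly. In the toy version below (OpenCV assumed), a plain Gaussian blur stands in for the fast global smoother, a brightness-driven weight stands in for the Gaussian fuzzy rule, and an absolute-magnitude comparison stands in for the visual saliency map — all three substitutions are simplifying assumptions.

```python
# Toy base/detail fusion of IR and visible grayscale images.
import cv2
import numpy as np

def decompose(img, sigma=5):
    # Base = smoothed image, detail = residual. The paper uses the fast
    # global smoother; a Gaussian blur is substituted here for illustration.
    base = cv2.GaussianBlur(img.astype(np.float32), (0, 0), sigma)
    return base, img.astype(np.float32) - base

def fuse(ir, vis):
    b_ir, d_ir = decompose(ir)
    b_vis, d_vis = decompose(vis)
    # Base layers: brightness-driven weight as a crude proxy for the
    # target-enhanced Gaussian fuzzy rule (warm targets are bright in IR).
    w = np.clip(b_ir / 255.0, 0.0, 1.0)
    base = w * b_ir + (1.0 - w) * b_vis
    # Detail layers: keep the locally stronger detail (saliency proxy).
    detail = np.where(np.abs(d_ir) > np.abs(d_vis), d_ir, d_vis)
    return np.clip(base + detail, 0, 255).astype(np.uint8)
```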


Sensors · 2021 · Vol 21 (24) · pp. 8444
Author(s): Jaehyeop Choi, Chaehyeon Lee, Donggyu Lee, Heechul Jung

Modern data augmentation strategies such as Cutout, Mixup, and CutMix have achieved good performance in image recognition tasks. In particular, approaches such as Mixup and CutMix, which mix two images to generate a mixed training image, generalize convolutional neural networks better than single-image approaches such as Cutout. We focus on the fact that a mixed image can improve generalization, and ask whether mixing can be applied effectively to a single image. Consequently, we propose a new data augmentation method, called SalfMix, that produces a self-mixed image based on a saliency map. Furthermore, we combine SalfMix with state-of-the-art two-image approaches, such as Mixup, SaliencyMix, and CutMix, to further increase performance, yielding HybridMix. The proposed SalfMix achieves better accuracy than Cutout, and HybridMix achieves state-of-the-art performance on three classification datasets: CIFAR-10, CIFAR-100, and TinyImageNet-200. Furthermore, HybridMix achieves the best mean average precision in object detection on the VOC dataset.
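The abstract only says the image is self-mixed using its own saliency map; the sketch below is one guess at that idea — copying the most salient patch over the least salient one — and should not be read as the published SalfMix recipe.

```python
# Toy single-image "self-mix": paste the most salient patch over the
# least salient location, guided by a precomputed saliency map.
import numpy as np

def self_mix(image, saliency, patch=32):
    # saliency: (H, W) map aligned with the image; image may be (H, W) or
    # (H, W, C). Assumes both dimensions exceed the patch size.
    h, w = saliency.shape
    ys = range(0, h - patch + 1, patch)
    xs = range(0, w - patch + 1, patch)
    # Mean saliency of each non-overlapping patch.
    scores = np.array([[saliency[y:y + patch, x:x + patch].mean()
                        for x in xs] for y in ys])
    iy, ix = np.unravel_index(np.argmax(scores), scores.shape)
    jy, jx = np.unravel_index(np.argmin(scores), scores.shape)
    sy, sx = list(ys)[iy], list(xs)[ix]   # source: most salient patch
    dy, dx = list(ys)[jy], list(xs)[jx]   # destination: least salient patch
    out = image.copy()
    out[dy:dy + patch, dx:dx + patch] = image[sy:sy + patch, sx:sx + patch]
    return out
```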


2021
Author(s): Fan Jiang, Phill Norlund

Abstract: One of the major challenges in seismic imaging is accurately delineating subsurface salt. Since a salt boundary has a strong impedance contrast with the surrounding sediments, we build a saliency map from intensity and orientation to create a pixel-level model for salt interpretation. We use the saliency map as an additional attribute, combined with the original seismic data, to predict salt bodies. We also use the saliency map to classify multiple geological facies with a multi-channel convolutional neural network with a residual-net architecture, to help build subsurface velocity models. Two examples demonstrate that the saliency-map-plus-seismic model improves the accuracy of salt prediction and reduces artifacts.
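Feeding the saliency attribute alongside the seismic amplitude amounts to stacking channels before the network. A minimal sketch, assuming both attributes are co-registered 2D arrays and that per-attribute standardization is acceptable:

```python
# Stack seismic amplitude and its saliency attribute into a multi-channel
# input array for a segmentation/classification network.
import numpy as np

def build_input(seismic, saliency):
    # Standardize each attribute independently, then stack along a leading
    # channel axis, giving a (2, H, W) input.
    def norm(x):
        return (x - x.mean()) / (x.std() + 1e-8)
    assert seismic.shape == saliency.shape
    return np.stack([norm(seismic), norm(saliency)], axis=0)
```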


Author(s): Huixin Yang, Xiang Li, Wei Zhang

Abstract: Despite the rapid development of deep learning-based intelligent fault diagnosis methods for rotating machinery, data-driven approaches generally remain a "black box" to researchers, and their internal mechanisms have not been sufficiently understood. This weak interpretability significantly impedes further development and application of otherwise effective deep neural network-based methods. This paper contributes to understanding how deep learning processes mechanical signals in fault diagnosis problems. The diagnostic knowledge learned by the deep neural network is visualized using neuron activation maximization and saliency map methods, and the discriminative features of different machine health conditions can be observed intuitively. Experimental investigations on two datasets confirm the relationship between the data-driven methods and well-established conventional fault diagnosis knowledge. The results of this study can help researchers understand complex neural networks and increase the reliability of data-driven fault diagnosis models in real engineering cases.
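The saliency map ingredient here is the standard gradient-based formulation: differentiate the class score with respect to the input and inspect the gradient magnitude. A minimal PyTorch sketch, assuming a classifier mapping a batched vibration signal to class logits (not necessarily the authors' exact recipe):

```python
# Gradient-based saliency for a signal classifier: large |d score / d input|
# marks the parts of the signal that drive the diagnosis decision.
import torch

def saliency_map(model, signal, target_class):
    # signal: (1, ...) batched input tensor; returns saliency of same shape
    # minus the batch dimension.
    model.eval()
    x = signal.clone().requires_grad_(True)
    score = model(x)[0, target_class]
    score.backward()
    return x.grad.abs().squeeze(0)
```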


2021 · Vol 11 (24) · pp. 11697
Author(s): Hamideh Kerdegari, Nhat Tran Huy Phung, Angela McBride, Luigi Pisani, Hao Van Nguyen, ...

The presence of B-line artefacts, the main artefact reflecting lung abnormalities in dengue patients, is often assessed using lung ultrasound (LUS) imaging. Inspired by human visual attention, which lets us process videos efficiently by attending to where and when attention is required, we propose a spatiotemporal attention mechanism for B-line detection in LUS videos. The spatial attention allows the model to focus on the most task-relevant parts of the image by learning a saliency map; the temporal attention generates an attention score for each attended frame to identify the most relevant frames in an input video. Our model not only identifies videos in which B-lines appear, but also localizes B-line-related features within those videos, both spatially and temporally, despite being trained in a weakly supervised manner. We evaluate our approach on a LUS video dataset collected from severe dengue patients in a resource-limited hospital, assessing the B-line detection rate and the model's ability to localize discriminative B-line regions spatially and B-line frames temporally. Experimental results demonstrate the efficacy of our approach, classifying B-line videos with an F1 score of up to 83.2% and localizing the most salient B-line regions and frames with a correlation coefficient of 0.67 and an IoU of 69.7%, respectively.
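A learned spatial saliency map of this kind is commonly a softmax over per-location scores. The module below is a minimal PyTorch sketch of that pattern, not the authors' exact architecture; the 1x1-conv scorer and the attention-weighted pooling are assumptions.

```python
# Minimal spatial attention: score each location, softmax over space to get
# a saliency map, then pool the frame's features with those weights.
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):                      # feats: (B, C, H, W)
        b, _, h, w = feats.shape
        attn = self.score(feats).view(b, 1, h * w)
        attn = torch.softmax(attn, dim=-1).view(b, 1, h, w)
        pooled = (feats * attn).sum(dim=(2, 3))    # (B, C) frame descriptor
        return pooled, attn.squeeze(1)             # features + saliency map
```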

