Road crack detection network under noise based on feature pyramid structure with feature enhancement (road crack detection under noise)

2021 ◽  
Author(s):  
Mingsi Sun ◽  
Hongwei Zhao ◽  
Jiao Li
2020 ◽  
Vol 10 (7) ◽  
pp. 2528 ◽  
Author(s):  
Lu Deng ◽  
Hong-Hu Chu ◽  
Peng Shi ◽  
Wei Wang ◽  
Xuan Kong

Cracks are often the most intuitive indicators for assessing the condition of in-service structures. Intelligent detection methods based on regular convolutional neural networks (CNNs) have been widely applied to crack detection in recent years; however, these methods perform unsatisfactorily on out-of-plane cracks. To overcome this drawback, a new type of region-based CNN (R-CNN) crack detector with deformable modules is proposed in the present study. The core idea of the method is to replace the traditional regular convolution and pooling operations with a deformable convolution operation and a deformable pooling operation. The idea is implemented on three different regular detectors, namely the Faster R-CNN, the region-based fully convolutional network (R-FCN), and the feature pyramid network (FPN)-based Faster R-CNN. To examine the advantages of the proposed method, the results obtained from the proposed detectors and the corresponding regular detectors are compared. The results show that the addition of deformable modules improves the mean average precisions (mAPs) achieved by the Faster R-CNN, R-FCN, and FPN-based Faster R-CNN for crack detection. More importantly, adding deformable modules enables these detectors to detect out-of-plane cracks that are difficult for regular detectors to detect.
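As a rough illustration of the deformable convolution idea mentioned in this abstract, the sketch below evaluates a single 3 × 3 deformable convolution response on a one-channel map in plain Python. Each kernel tap is shifted by a (dy, dx) offset and read with bilinear interpolation. The function names, the single-channel setting, and the zero-padding choice are assumptions made for the example; in a real detector the offsets are predicted by an additional convolutional layer, which is not shown here.

```python
import math


def bilinear_sample(feature, y, x):
    """Read a 2-D feature map at a fractional (y, x); positions outside the map count as 0."""
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                # Bilinear weight shrinks linearly with distance to the grid point.
                weight = (1 - abs(y - yi)) * (1 - abs(x - xi))
                value += weight * feature[yi][xi]
    return value


def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """3x3 deformable convolution response at output location (cy, cx).

    kernel is a flat list of 9 weights (row-major); offsets[k] = (dy, dx) is the
    shift of the k-th tap. All-zero offsets reduce this to a regular 3x3 convolution.
    """
    taps = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
    total = 0.0
    for k, (i, j) in enumerate(taps):
        dy, dx = offsets[k]
        total += kernel[k] * bilinear_sample(feature, cy + i + dy, cx + j + dx)
    return total
```

With all offsets set to zero the sampling grid is the usual square one; non-zero offsets let the grid bend toward irregular structures such as thin cracks.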


Author(s):  
Binglin Niu ◽  
Mengxia Tang ◽  
Xuelin Chen

Perceiving the three-dimensional structure of the surrounding environment and analyzing it for autonomous movement is indispensable for robots operating in real scenes. Recovering depth information and the three-dimensional spatial structure from monocular images is a basic task of computer vision. Any given image, however, may be the projection of many different scenes. This paper proposes a supervised end-to-end network that performs depth estimation without relying on any subsequent processing operations, such as probabilistic graphical models or other additional refinement steps. The network uses an encoder-decoder structure with a feature pyramid to predict dense depth maps. The encoder adopts the ResNeXt-50 network to extract the main features from the original image. The feature pyramid structure merges high-level and low-level information so that feature information is not lost. The decoder connects transposed convolutional layers and convolutional layers into an up-sampling structure that expands the resolution of the output. The structure is applied to the indoor dataset NYU Depth v2 and obtains better prediction results than other methods. The experimental results show that on the NYU Depth v2 dataset, our method achieves the best results on 5 indicators and the second-best result on 1 indicator.
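The merging of high-level and low-level information in a feature pyramid can be sketched as a top-down pass: the coarsest map is upsampled by a factor of two and added element-wise to the next finer map, and so on. The plain-Python example below is a minimal single-channel sketch under that assumption; the function names are hypothetical, and the 1 × 1 lateral convolutions and 3 × 3 smoothing convolutions used in practical feature pyramid networks are omitted.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D map given as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def top_down_merge(laterals):
    """Merge pyramid levels from coarse to fine.

    laterals: maps ordered fine -> coarse, each level half the size of the previous one.
    Returns the merged maps in the same fine -> coarse order.
    """
    merged = [laterals[-1]]  # the coarsest level is passed through unchanged
    for lateral in reversed(laterals[:-1]):
        up = upsample2x(merged[0])
        fused = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lateral, up)]
        merged.insert(0, fused)
    return merged
```

Each finer level therefore carries its own detail plus the semantics propagated down from every coarser level.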


Author(s):  
Yongtao Yu ◽  
Haiyan Guan ◽  
Dilong Li ◽  
Yongjun Zhang ◽  
Shenghua Jin ◽  
...  

2019 ◽  
Vol 9 (2) ◽  
pp. 315 ◽  
Author(s):  
Junhwan Ryu ◽  
Sungho Kim

This paper proposes a deep learning-based Chinese character detection network, which is important for character recognition and translation. Detecting the correct character area is an important part of recognition and translation. Previous studies have focused on projection-based methods that rely on image pre-processing, segmentation-based recognition methods, and methods using hand-crafted features. Unfortunately, the results of these methods are vulnerable to noise. Recently, recognition and translation systems based on deep learning have treated detection through translation as a single step, but they have failed to consider the inaccurate localization problem that arises in detectors. This paper proposes a Chinese character boxes (CCB) network, called CCB-SSD, which detects the character area more accurately using the single-shot multibox detector (SSD) as the baseline. The proposed CCB-SSD network has a single prediction layer structure in which unnecessary layers are removed from the feature-pyramid structure. An augmentation method for training is introduced, and the problem caused by the use of default boxes is solved by the proposed non-maximum suppression (NMS). The experimental results revealed a 96.1% detection rate and 0.89 false positives per character (FPPC), the false-positive index proposed for the character data-set and caoshu data-set used in this paper. This method performed better than the conventional SSD, which achieved a 69.4% detection rate and 6.57 FPPC.
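The abstract does not describe how the proposed NMS differs from the conventional one, so the sketch below shows only the conventional greedy NMS that SSD-style detectors typically apply. It is not the paper's variant. Boxes are assumed to be (x1, y1, x2, y2) tuples, and the 0.5 threshold is an illustrative default.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept by conventional greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any box already kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep
```

Densely packed characters are a case where a fixed threshold can suppress correct neighbouring detections, which may be one reason a modified NMS is useful for this task.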


2021 ◽  
pp. 1-11
Author(s):  
Weiming He ◽  
You Wu ◽  
Jing Xiao ◽  
Yang Cao

Feature pyramids are commonly applied to solve the scale variation problem in object detection. One of the most representative feature pyramid designs is the Feature Pyramid Network (FPN), which is simple and efficient. However, the full power of multi-scale features might not be completely exploited in FPN due to its design defects. In this paper, we first analyze the structural problems of FPN that prevent the multi-scale features from being fully exploited, then propose a new feature pyramid structure named Mixed Group FPN (MGFPN) to mitigate these design defects. Concretely, MGFPN strengthens feature utilization with two modules named Mixed Group Convolution (MGConv) and Contextual Attention (CA). MGConv reduces the spatial information loss of FPN in the feature generation stage. CA narrows the semantic gaps between features of different receptive fields before lateral summation. By replacing FPN with MGFPN in FCOS, our method improves the performance of detectors with many major backbones by 0.7 to 1.2 Average Precision (AP) on the MS-COCO benchmark without adding too many parameters, and it can easily be extended to other FPN-based models. The proposed MGFPN can serve as a simple and strong alternative for many other FPN-based models.


2021 ◽  
Vol 8 ◽  
Author(s):  
Yunzhu Wu ◽  
Ruoxin Zhang ◽  
Lei Zhu ◽  
Weiming Wang ◽  
Shengwen Wang ◽  
...  

Automatic and accurate segmentation of breast lesion regions from ultrasonography is an essential step for ultrasound-guided diagnosis and treatment. However, developing a desirable segmentation method is very difficult due to strong imaging artifacts, e.g., speckle noise, low contrast, and intensity inhomogeneity, in breast ultrasound images. To solve this problem, this paper proposes a novel boundary-guided multiscale network (BGM-Net), based on the feature pyramid network (FPN), to boost the performance of breast lesion segmentation from ultrasound images. First, we develop a boundary-guided feature enhancement (BGFE) module that enhances the feature map of each FPN layer by learning a boundary map of breast lesion regions. The BGFE module improves the boundary detection capability of the FPN framework so that weak boundaries in ambiguous regions can be correctly identified. Second, we design a multiscale scheme that leverages information from different image scales in order to tackle ultrasound artifacts. Specifically, we downsample each testing image into a coarse counterpart, and both the testing image and its coarse counterpart are input into BGM-Net to predict a fine and a coarse segmentation map, respectively. The segmentation result is then produced by fusing the fine and the coarse segmentation maps so that breast lesion regions are accurately segmented from ultrasound images and false detections are effectively removed, owing to boundary feature enhancement and multiscale image information. We validate the performance of the proposed approach on two challenging breast ultrasound datasets, and experimental results demonstrate that our approach outperforms state-of-the-art methods.


2021 ◽  
Vol 11 (24) ◽  
pp. 11630
Author(s):  
Yan Zhou ◽  
Sijie Wen ◽  
Dongli Wang ◽  
Jinzhen Mu ◽  
Irampaye Richard

Object detection is one of the key algorithms in automatic driving systems. To address the false detection and missed detection of small and occluded objects in automatic driving scenarios, an improved Faster-RCNN object detection algorithm is proposed. First, deformable convolution and a spatial attention mechanism are used to improve the ResNet-50 backbone network and enhance the feature extraction of small objects; then, an improved feature pyramid structure is introduced to reduce the loss of features in the fusion process. Three cascade detectors are introduced to solve the problem of IoU (Intersection-over-Union) threshold mismatch, and side-aware boundary localization is applied for bounding box regression. Finally, Soft-NMS (Soft Non-Maximum Suppression) is used to remove redundant bounding boxes and obtain the best results. The experimental results show that the improved Faster-RCNN can better detect small objects and occluded objects, and its accuracy is 7.7% and 4.1% higher than that of the baseline on the eight categories selected from the COCO2017 and BDD100k data sets, respectively.
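Soft-NMS, mentioned in this abstract, lowers the scores of overlapping boxes instead of discarding them outright, which helps when objects occlude one another. The sketch below uses the Gaussian decay form, where a neighbour's score is multiplied by exp(-IoU² / sigma). The box format (x1, y1, x2, y2), the function names, and the default values of `sigma` and `score_threshold` are assumptions for the example, not settings taken from the paper.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS. Returns (index, rescored value) pairs in selection order."""
    scores = list(scores)  # work on a copy so the caller's list is untouched
    remaining = list(range(len(boxes)))
    kept = []
    while remaining:
        best = max(remaining, key=lambda i: scores[i])
        remaining.remove(best)
        if scores[best] < score_threshold:
            break  # every box still remaining scores even lower
        kept.append((best, scores[best]))
        for i in remaining:
            # Decay neighbours smoothly: the larger the overlap, the stronger the decay.
            scores[i] *= math.exp(-iou(boxes[best], boxes[i]) ** 2 / sigma)
    return kept
```

A heavily overlapping box is kept with a reduced score rather than removed, so a genuinely separate but occluded object can still survive a later confidence cut.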


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Xu Han ◽  
Lining Zhao ◽  
Yue Ning ◽  
Jingfeng Hu

The application of ship detection to intelligent ship navigation assistance places stringent requirements on the model’s detection speed and accuracy. In response to this problem, this study uses an improved YOLO-V4 detection model (ShipYOLO) to detect ships. Compared to YOLO-V4, the model has three main improvements. Firstly, the backbone network (CSPDarknet) of YOLO-V4 is optimized. During training, parallel 3 × 3 convolution, 1 × 1 convolution, and identity branches replace the original feature extraction component (ResUnit) so that more features are extracted. During inference, the branch parameters are combined to form a new backbone network named RCSPDarknet, which improves the inference speed of the model while improving the accuracy. Secondly, in order to solve the problem of missed detection of small-scale ships, we designed a new amplified receptive field module named DSPP with dilated convolution and Max-Pooling, which improves the model’s acquisition of small-scale ship spatial information and its robustness to spatial displacement of ship targets. Finally, we use the attention mechanism and ResNet’s shortcut idea to improve the feature pyramid structure (PAFPN) of YOLO-V4 and obtain a new feature pyramid structure named AtFPN. The structure effectively improves the model’s feature extraction for ships of different scales and reduces the number of model parameters, further improving the model’s inference speed and detection accuracy. In addition, we have created a single-category ship dataset with a total of 2238 images. The experimental results show that ShipYOLO is faster and more accurate across different input sizes. With an input size of 320 × 320 on a PC equipped with an NVIDIA 1080Ti GPU, the FPS and mAP@0.5:0.95 (mAP90) of ShipYOLO are 23.7% and 13.6% (10.6%) higher, respectively, than those of YOLO-V4.
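The combination of branch parameters at inference time can be illustrated with a small identity: because convolution is linear, a 3 × 3 branch, a 1 × 1 branch, and an identity branch that are summed during training can be folded into one 3 × 3 kernel. The sketch below assumes a single channel and no batch normalization (practical re-parameterized backbones also fold normalization statistics into the kernel); the function names are hypothetical and this is not the paper's implementation.

```python
def conv3x3(image, kernel):
    """'Same'-size 3x3 cross-correlation with zero padding on a 2-D list of rows."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(3):
                for j in range(3):
                    yy, xx = y + i - 1, x + j - 1
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def three_branch_forward(image, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar weight k1) + identity branch."""
    y3 = conv3x3(image, k3)
    return [
        [y3[r][c] + k1 * image[r][c] + image[r][c] for c in range(len(image[0]))]
        for r in range(len(image))
    ]


def merge_branches(k3, k1):
    """Inference-time form: fold the 1x1 weight and the identity into the kernel centre."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1 + 1.0
    return merged
```

Evaluating `conv3x3(image, merge_branches(k3, k1))` gives the same output as `three_branch_forward(image, k3, k1)` while running one convolution instead of three branches, which is where the inference speed-up comes from.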


2019 ◽  
Vol 9 (18) ◽  
pp. 3781 ◽  
Author(s):  
Yadan Li ◽  
Zhenqi Han ◽  
Haoyu Xu ◽  
Lizhuang Liu ◽  
Xiaoqiang Li ◽  
...  

Due to the high proportion of aircraft faults caused by cracks in aircraft structures, crack inspection in aircraft structures has long played an important role in the aviation industry. The existing approaches, however, are time-consuming or have poor accuracy, given the complex background of aircraft structure images. To solve these problems, we propose the YOLOv3-Lite method, which combines depthwise separable convolution, feature pyramids, and YOLOv3. Depthwise separable convolution is employed to design the backbone network in order to reduce parameters and extract crack features effectively. Then, the feature pyramid combines low-resolution, semantically strong features with high-resolution features to obtain rich semantics. Finally, YOLOv3 is used for the bounding box regression. YOLOv3-Lite is a fast and accurate crack detection method that can be used on aircraft structures such as the fuselage or engine blades. The results show that, with almost no loss of detection accuracy, YOLOv3-Lite is 50% faster than YOLOv3. It can be concluded that YOLOv3-Lite can reach state-of-the-art performance.
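The parameter reduction from depthwise separable convolution follows from simple counting: a standard k × k convolution needs k·k·C_in·C_out weights, whereas the depthwise step needs k·k·C_in and the pointwise 1 × 1 step needs C_in·C_out. The short sketch below computes both counts, ignoring biases. The layer sizes in the usage note are illustrative and are not taken from the paper.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a pointwise 1 x 1 convolution."""
    depthwise = k * k * c_in  # one k x k filter per input channel
    pointwise = c_in * c_out  # 1 x 1 convolution that mixes channels
    return depthwise + pointwise
```

For a 3 × 3 layer with 128 input and 256 output channels, `standard_conv_params(3, 128, 256)` gives 294912 weights and `depthwise_separable_params(3, 128, 256)` gives 33920, a ratio of 1/C_out + 1/k², or about 11.5%.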


2020 ◽  
Vol 21 (4) ◽  
pp. 1525-1535 ◽  
Author(s):  
Fan Yang ◽  
Lei Zhang ◽  
Sijia Yu ◽  
Danil Prokhorov ◽  
Xue Mei ◽  
...  
