An Efficient Pedestrian Detection Method Based on YOLOv2

2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Zhongmin Liu ◽  
Zhicai Chen ◽  
Zhanming Li ◽  
Wenjin Hu

In recent years, techniques based on deep detection models have achieved major improvements in detection accuracy, making them well suited to applications such as pedestrian detection. However, speed and accuracy are conflicting goals, and the tension between them has long challenged researchers. Achieving a good trade-off between the two is a problem that must be considered when designing detectors. To this end, we apply the general-purpose detector YOLOv2, a state-of-the-art method for general detection tasks, to pedestrian detection. We then modify the network parameters and structure according to the characteristics of pedestrians, making the method more suitable for detecting them. Experimental results on the INRIA pedestrian detection dataset show that the method has a fairly high detection speed with a small precision gap compared with state-of-the-art pedestrian detection methods. Furthermore, we add weak semantic segmentation networks after the shared convolution layers to highlight pedestrians, and we employ a scale-aware structure to handle the wide range of pedestrian sizes in the Caltech pedestrian detection dataset; both changes yield considerable further progress on top of the original improvement.

2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Tao Xiang ◽  
Tao Li ◽  
Mao Ye ◽  
Zijian Liu

Pedestrian detection with large intraclass variations is still a challenging task in computer vision. In this paper, we propose a novel pedestrian detection method based on Random Forest. First, we generate a few local templates with different sizes and locations in positive exemplars. Then, we build a Random Forest whose splitting functions are optimized by maximizing the class purity obtained when matching the local templates to the training samples. To improve classification accuracy, we adopt a boosting-like algorithm to update the weights of the training samples in a layer-wise fashion. During detection, the trained Random Forest votes on the category of each input sliding window. Our contributions are the splitting functions based on local template matching with adaptive size and location, and the iterative weight-updating method. We evaluate the proposed method on two well-known challenging datasets: TUD pedestrians and INRIA pedestrians. The experimental results demonstrate that our method achieves state-of-the-art or competitive performance.


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3096
Author(s):  
Zhen Zhang ◽  
Shihao Xia ◽  
Yuxing Cai ◽  
Cuimei Yang ◽  
Shaoning Zeng

Occlusion of pedestrians causes inaccurate people counting, and people’s heads are easily occluded by one another in crowded scenes. To reduce missed detections as much as possible and improve the capability of the detection model, this paper proposes a new people counting method, named Soft-YoloV4, which attenuates the scores of adjacent detection frames (bounding boxes) to prevent missed detections. The proposed Soft-YoloV4 improves the accuracy of people counting and reduces the incorrect elimination of detection frames when heads occlude one another. Compared with the state-of-the-art YoloV4, the AP value of the proposed head detection method increases from 88.52% to 90.54%. The Soft-YoloV4 model is much more robust and has a lower missed detection rate for head detection, and therefore it dramatically improves the accuracy of people counting.
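
The score attenuation described above can be illustrated with a Gaussian Soft-NMS sketch. The abstract does not state which decay function Soft-YoloV4 uses, so the Gaussian form, the function names, and the `sigma` and `score_thresh` values below are assumptions chosen for illustration only.

```python
import math

def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes instead of deleting them.

    sigma and score_thresh are illustrative values, not taken from the paper.
    Returns (box, score) pairs in the order they were selected.
    """
    candidates = list(zip(boxes, scores))
    kept = []
    while candidates:
        # Select the highest-scoring remaining box.
        best_idx = max(range(len(candidates)), key=lambda i: candidates[i][1])
        best_box, best_score = candidates.pop(best_idx)
        kept.append((best_box, best_score))
        # Attenuate the remaining scores according to their overlap with it.
        decayed = []
        for box, score in candidates:
            score *= math.exp(-(iou(best_box, box) ** 2) / sigma)
            if score >= score_thresh:
                decayed.append((box, score))
        candidates = decayed
    return kept

heads = [(0, 0, 10, 10), (1, 0, 11, 10), (50, 50, 60, 60)]
kept = soft_nms(heads, [0.9, 0.8, 0.7])
```

In this toy input the second box overlaps the first heavily. Hard NMS with a typical IoU threshold would delete it, whereas here it survives with a reduced score, which is the behavior that helps when two heads partially occlude each other.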


2021 ◽  
Vol 233 ◽  
pp. 02012
Author(s):  
Shousheng Liu ◽  
Zhigang Gai ◽  
Xu Chai ◽  
Fengxiang Guo ◽  
Mei Zhang ◽  
...  

Detecting and counting bacterial colonies is tedious and time-consuming work. Fortunately, CNN (convolutional neural network) detection methods are effective for target detection. Bacterial colonies are small targets, which remain a difficult problem in target detection. This paper proposes a small-target enhancement detection method based on two CNNs, which improves detection accuracy while maintaining a detection speed similar to that of a general detection model. The first CNN uses the SSD_MOBILENET_V1 network, which performs both target localization and target recognition. Candidate targets are screened with a low confidence threshold, which ensures that small targets are not missed. The second CNN takes the candidate target regions from the first round of detection, crops the image sub-blocks one by one, and uses the MOBILENET_V1 network to filter the targets with a higher confidence threshold, which ensures good detection of small targets. With the two-round enhancement detection method ported to the embedded platform NVIDIA Jetson AGX Xavier, the detection accuracy of small targets is significantly improved, and the target error detection rate and missed detection rate are reduced to less than 1%.
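
The two-round scheme can be sketched as a small cascade. This is only an outline of the control flow under stated assumptions: `detector`, `classifier`, and `crop` are hypothetical callables standing in for the SSD_MOBILENET_V1 stage, the MOBILENET_V1 stage, and the sub-block cropping step, and the two threshold values are illustrative, since the abstract does not give them.

```python
def two_round_detect(image, detector, classifier, crop,
                     low_thresh=0.1, high_thresh=0.8):
    """Two-round cascade: permissive first pass, strict second pass.

    detector(image)   -> iterable of (box, score) pairs   (hypothetical first CNN)
    crop(image, box)  -> image sub-block for that box     (hypothetical helper)
    classifier(patch) -> confidence that patch is a target (hypothetical second CNN)
    """
    # Round 1: keep every candidate above a low threshold so small targets are not missed.
    candidates = [(box, score) for box, score in detector(image) if score >= low_thresh]

    # Round 2: re-score each cropped sub-block and keep only confident targets.
    confirmed = []
    for box, _ in candidates:
        score = classifier(crop(image, box))
        if score >= high_thresh:
            confirmed.append((box, score))
    return confirmed
```

The low first threshold trades precision for recall, and the second stage restores precision by examining each candidate region on its own.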


2020 ◽  
Vol 16 (10) ◽  
pp. 155014772096133
Author(s):  
Jianhua Wang ◽  
Bang Ji ◽  
Feng Lin ◽  
Shilei Lu ◽  
Yubin Lan ◽  
...  

Quickly detecting the related primitive events for multiple complex events in a massive event stream is a great challenge, because existing complex event detection methods handle only a single pattern. To solve this problem, this article proposes a multiple-pattern complex event detection scheme based on decomposition sharing and merge sharing. The contribution of this article is the use of decomposition and merge sharing to detect multiple complex events in massive event streams with high efficiency. Specifically, we first use decomposition sharing to decompose pattern expressions into multiple subexpressions, which provides many sharing opportunities among subexpressions. We then use merge sharing to construct a multiple-pattern complex event by merging every identical prefix, suffix, or subpattern in the decomposition results into one. As a result, the proposed detection method can effectively solve the above problem. The experimental results show that the proposed method outperforms some general detection methods, in both detection model and detection algorithm, for multiple-pattern complex event detection as a whole.
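
As a rough illustration of merge sharing, the sketch below merges identical prefixes of several sequence patterns into one tree, so that a shared prefix is matched once rather than once per pattern. It covers prefix sharing only (not suffixes or inner subpatterns), and the data layout, function names, and example patterns are assumptions rather than the article's actual structures.

```python
def build_shared_prefix_tree(patterns):
    """Merge sequence patterns that share a prefix into a single tree.

    patterns maps a pattern name to its ordered list of primitive event types.
    """
    root = {"children": {}, "patterns": []}
    for name, events in patterns.items():
        node = root
        for event in events:
            # Reuse the existing child when another pattern already has this prefix.
            node = node["children"].setdefault(event, {"children": {}, "patterns": []})
        node["patterns"].append(name)  # this pattern completes at this node
    return root

def count_nodes(node):
    """Number of event nodes in the tree, excluding the root."""
    return sum(1 + count_nodes(child) for child in node["children"].values())

# Hypothetical patterns over primitive event types A..E.
patterns = {"P1": ["A", "B", "C"], "P2": ["A", "B", "D"], "P3": ["A", "E"]}
tree = build_shared_prefix_tree(patterns)
shared = count_nodes(tree)
unshared = sum(len(events) for events in patterns.values())
```

For these three patterns the merged tree holds 5 event nodes, compared with 8 when each pattern is evaluated separately.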


2019 ◽  
Vol 11 (18) ◽  
pp. 2173 ◽  
Author(s):  
Jinlei Ma ◽  
Zhiqiang Zhou ◽  
Bo Wang ◽  
Hua Zong ◽  
Fei Wu

To accurately detect ships of arbitrary orientation in optical remote sensing images, we propose a two-stage CNN-based ship-detection method based on ship center and orientation prediction. A center region prediction network and a ship orientation classification network are constructed to generate rotated region proposals, from which we predict rotated bounding boxes to locate arbitrarily oriented ships more accurately. The two networks share the same deconvolutional layers and perform semantic segmentation to predict the center regions and orientations of ships, respectively. They provide the potential center points of the ships, which help determine more confident locations for the region proposals, as well as ship orientation information, which makes the predetermination of rotated region proposals more reliable. Classification and regression are then performed for the final ship localization. Compared with other typical object detection methods for natural images and with other ship-detection methods, our method detects multiple ships in high-resolution remote sensing images more accurately, irrespective of ship orientation and even when ships are docked very close together. Experiments demonstrate a promising improvement in ship-detection performance.


Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 2145 ◽  
Author(s):  
Guoxu Liu ◽  
Joseph Christian Nouaze ◽  
Philippe Lyonel Touko Mbouembe ◽  
Jae Ho Kim

Automatic fruit detection is a very important capability for harvesting robots. However, complicated environmental conditions, such as illumination variation, branch and leaf occlusion, and tomato overlap, make fruit detection very challenging. In this study, an improved tomato detection model called YOLO-Tomato, based on YOLOv3, is proposed to deal with these problems. A dense architecture is incorporated into YOLOv3 to facilitate the reuse of features and help learn a more compact and accurate model. Moreover, the model replaces the traditional rectangular bounding box (R-Bbox) with a circular bounding box (C-Bbox) for tomato localization. The new bounding boxes match the tomatoes more precisely and thus improve the Intersection-over-Union (IoU) calculation for Non-Maximum Suppression (NMS). They also reduce the number of predicted coordinates. An ablation study demonstrated the efficacy of these modifications. YOLO-Tomato was compared with several state-of-the-art detection methods and had the best detection performance.
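
To make the C-Bbox idea concrete, the following sketch computes the IoU of two circles from the standard circle-intersection (lens) area. The `(cx, cy, r)` parameterization and the function name are assumptions for illustration; the abstract does not give the paper's exact formulation. It does show why a circle needs only three predicted values where a rectangle needs four.

```python
import math

def circle_iou(c1, c2):
    """IoU of two circles given as (cx, cy, r), with r > 0."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    d = math.hypot(x2 - x1, y2 - y1)          # distance between centers
    a1, a2 = math.pi * r1 ** 2, math.pi * r2 ** 2

    if d >= r1 + r2:                          # disjoint (or touching) circles
        inter = 0.0
    elif d <= abs(r1 - r2):                   # one circle lies inside the other
        inter = min(a1, a2)
    else:                                     # partial overlap: sum of two circular segments
        alpha = math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
        beta = math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
        inter = (r1 * r1 * (alpha - math.sin(2 * alpha) / 2)
                 + r2 * r2 * (beta - math.sin(2 * beta) / 2))
    return inter / (a1 + a2 - inter)
```

A value such as `circle_iou((0, 0, 1), (1, 0, 1))` can then be compared against an NMS threshold in the same way as a rectangular IoU.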


Author(s):  
Bo Chen ◽  
Hua Zhang ◽  
Yonglong Li ◽  
Shuang Wang ◽  
Huaifang Zhou ◽  
...  

An increasing number of detection methods based on computer vision are applied to detect cracks in water conservancy infrastructure. However, most studies extract crack information directly with existing feature extraction networks, which were designed for open-source datasets. Because the distribution and pixel features of cracks differ from those data, the extracted crack information is incomplete. In this paper, a deep learning-based network for dam surface crack detection is proposed, which mainly addresses the semantic segmentation of cracks on the dam surface. In particular, we design a shallow encoding network to extract features of crack images based on a statistical analysis of cracks. Further, to enhance the relevance of contextual information, we introduce an attention module into the decoding network. During training, we use the sum of Cross-Entropy and Dice Loss as the loss function to overcome data imbalance. The quantitative information of cracks is extracted by the imaging principle after morphological algorithms are used to extract the morphological features of the predicted result. We built a manually annotated dataset containing 1577 images to verify the effectiveness of the proposed method. This method achieves state-of-the-art performance on our dataset. Specifically, the precision, recall, IoU, F1_measure, and accuracy reach 90.81%, 81.54%, 75.23%, 85.93%, and 99.76%, respectively, and the quantization error of cracks is less than 4%.
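
The combined loss can be sketched for the binary (crack versus background) case as follows. This is a minimal pure-Python version under stated assumptions: the equal weighting of the two terms follows the abstract's "sum", while the `smooth` and `eps` constants and the flat-list input format are illustrative choices, not details from the paper.

```python
import math

def bce_dice_loss(probs, targets, eps=1e-7, smooth=1.0):
    """Sum of mean binary cross-entropy and Dice loss over flat pixel lists.

    probs   : predicted crack probabilities in [0, 1], one per pixel
    targets : ground-truth labels (1 = crack, 0 = background)
    """
    n = len(probs)
    ce = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)       # clip to avoid log(0)
        ce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    ce /= n

    # Dice term: depends on the overlap with the crack pixels, so the
    # large number of background pixels does not dominate it.
    inter = sum(p * t for p, t in zip(probs, targets))
    dice = 1.0 - (2.0 * inter + smooth) / (sum(probs) + sum(targets) + smooth)
    return ce + dice
```

Because cracks occupy few pixels, cross-entropy alone is dominated by the background class; the Dice term measures overlap with the crack region directly, which is the usual motivation for combining the two.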


2018 ◽  
Vol 232 ◽  
pp. 04036
Author(s):  
Jun Yin ◽  
Huadong Pan ◽  
Hui Su ◽  
Zhonggeng Liu ◽  
Zhirong Peng

We propose an object detection method that predicts oriented bounding boxes (OBB) to estimate object locations, scales, and orientations. It is based on YOLO (You Only Look Once), one of the top detection algorithms, which performs well in both accuracy and speed. Existing object detection methods use horizontal bounding boxes (HBB), which are not robust to orientation variance. The proposed orientation-invariant YOLO (OIYOLO) detector can effectively deal with bird’s-eye-view images in which the orientation angles of the objects are arbitrary. To estimate the rotation angle of objects, we design a new angle loss function. Training OIYOLO therefore forces the network to learn the annotated orientation angles of objects, making OIYOLO orientation invariant. The proposed approach of predicting OBB can be applied in other detection frameworks. In addition, to evaluate the proposed OIYOLO detector, we create the UAV-DAHUA dataset, which is accurately annotated with object locations, scales, and orientation angles. Extensive experiments on the UAV-DAHUA and DOTA datasets demonstrate that OIYOLO achieves state-of-the-art detection performance with high efficiency compared with the baseline YOLO algorithms.
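
For readers unfamiliar with OBB, the sketch below converts one common parameterization, center, width, height, and rotation angle, into the four corner points of the box. The `(cx, cy, w, h, theta)` layout and the function name are assumptions for illustration; the abstract does not specify how OIYOLO encodes its boxes or its angle loss.

```python
import math

def obb_corners(cx, cy, w, h, theta):
    """Corner points of an oriented box rotated by theta radians about its center.

    With theta = 0 the result is the ordinary horizontal bounding box.
    """
    c, s = math.cos(theta), math.sin(theta)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset by theta, then translate to the box center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in half_extents]
```

An HBB is the special case `theta = 0`, which is why a detector that predicts only HBB cannot fit tightly around objects seen at arbitrary angles from above.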


2020 ◽  
Vol 10 (19) ◽  
pp. 6799
Author(s):  
Zhuoran Ma ◽  
Liang Gao ◽  
Yanglong Zhong ◽  
Shuai Ma ◽  
Bolun An

During the long-term service of slab track, various external factors (such as complicated temperature conditions) can cause a range of slab damage. Among these, slab arching changes the structural mechanical properties, deteriorates the track geometry, and can even threaten train operation. Therefore, slab arching must be detected accurately to achieve effective maintenance. However, current damage detection methods cannot offer high accuracy and low cost simultaneously, which makes large-scale, efficient arching detection difficult. To this end, this paper proposes a vision-based arching detection method using track geometry data. The main work includes: (1) nonlinear deviation correction of the data and analysis of arching characteristics; (2) data conversion and augmentation; (3) design of and experiments with a convolutional neural network-based detection model. The results show that the proposed method can detect arching damage effectively, with an F1-score of 98.4%. Balancing the sample size of each pattern further improves performance. Moreover, the method outperforms the plain deep learning network. In practice, the proposed method can be employed to detect slab arching and help make maintenance plans. The method can also be applied to the data-based detection of other structural damage and has broad prospects.


Entropy ◽  
2021 ◽  
Vol 23 (12) ◽  
pp. 1587
Author(s):  
Mingfeng Zha ◽  
Wenbin Qian ◽  
Wenlong Yi ◽  
Jing Hua

Traditional pest detection methods are challenging to use in complex forestry environments due to their low accuracy and speed. To address this issue, this paper proposes the YOLOv4_MF model. The YOLOv4_MF model uses MobileNetv2 as the feature extraction block and replaces traditional convolution with depthwise separable convolution to reduce the number of model parameters. In addition, a coordinate attention mechanism is embedded in MobileNetv2 to enhance feature information. A symmetric structure consisting of three-layer spatial pyramid pooling is presented, and an improved feature fusion structure is designed to fuse the target information. For the loss function, focal loss is used instead of cross-entropy loss to enhance the network’s learning of small targets. The experimental results show that the YOLOv4_MF model has 4.24% higher mAP, 4.37% higher precision, and 6.68% higher recall than the YOLOv4 model. The size of the proposed model is reduced to 1/6 of that of YOLOv4. Moreover, the proposed algorithm achieved 38.62% mAP with respect to some state-of-the-art algorithms on the COCO dataset.
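
Two of the ingredients named above can be checked with a few lines of arithmetic. The sketch below counts the weights of a standard convolution against a depthwise separable one (bias terms ignored) and evaluates the binary focal loss for a single prediction. The layer sizes and the `gamma` and `alpha` defaults are commonly used illustrative values, not settings reported for YOLOv4_MF.

```python
import math

def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    return k * k * c_in + c_in * c_out

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction p = P(y = 1) and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, 1e-12), 1.0)           # avoid log(0)
    # The (1 - p_t) ** gamma factor shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

standard = conv_params(3, 256, 256)             # 589824 weights
separable = separable_conv_params(3, 256, 256)  # 67840 weights
```

For a 3 x 3 layer with 256 input and output channels, the separable form needs roughly one ninth of the weights. In the focal loss, an easy positive (`p = 0.9`) contributes far less than a hard one (`p = 0.1`), which shifts training effort toward hard examples.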

