A Vehicle and Pedestrian Detection Method Based on Improved YOLOv4-Tiny

2022 ◽  
Vol 11 (01) ◽  
pp. 22-26
Author(s):  
Hui Xiang ◽  
Junyan Han ◽  
Hanqing Wang ◽  
Hao Li ◽  
Shangqing Li ◽  
...  

To address the low detection accuracy and poor recognition of small-scale targets in traditional vehicle and pedestrian detection methods, a vehicle and pedestrian detection method based on an improved YOLOv4-Tiny is proposed. Building on YOLOv4-Tiny, an 8-fold downsampling feature layer is added for feature fusion, the PANet structure is used to fuse the deep and shallow features output by the backbone network bidirectionally, and a detection head for small targets is added. The results show that the improved method reaches a mean average precision of 85.93%, with detection performance similar to that of YOLOv4. Compared with YOLOv4-Tiny, the mean average precision increases by 24.45%, and the detection speed reaches 67.83 FPS, so detection is significantly improved while still meeting real-time requirements.
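
As a rough illustration of why an 8-fold downsampling layer helps with small targets, the sketch below computes the grid resolution of each detection head. The 416x416 input size and the function name `head_grid_sizes` are assumptions made for illustration; the abstract does not state the input resolution. Strides 16 and 32 correspond to the two heads of the standard YOLOv4-Tiny, and stride 8 to the added small-target head.

```python
def head_grid_sizes(input_size, strides):
    """Return {stride: cells per side} for each detection head."""
    sizes = {}
    for stride in strides:
        if input_size % stride != 0:
            raise ValueError(
                f"input size {input_size} is not divisible by stride {stride}"
            )
        sizes[stride] = input_size // stride
    return sizes


# Assumed 416x416 input (hypothetical; not given in the abstract).
grids = head_grid_sizes(416, (8, 16, 32))
for stride, side in grids.items():
    print(f"stride {stride:2d}: {side}x{side} grid, "
          f"each cell covers {stride}x{stride} pixels")
```

Under this assumption the stride-8 head works on a 52x52 grid, four times as many cells as the 26x26 grid at stride 16, so each cell is responsible for a smaller image patch.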

2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Bao-Yuan Chen ◽  
Yu-Kun Shen ◽  
Kun Sun

At present, object detectors based on convolutional neural networks generally rely on the last layer of features extracted by the feature extraction network. During the repeated convolution and pooling that produce deep features, position information cannot be passed on completely. This paper proposes a multiscale feature reuse detection model, which comprises the basic feature extraction network DenseNet, a feature fusion network, a multiscale anchor region proposal network, and a classification and regression network. Fusing high-dimensional and low-dimensional features strengthens both the model's sensitivity to objects of different sizes and the transmission of information, so that the feature map carries rich deep semantic information and shallow location information at the same time, which significantly improves the robustness and detection accuracy of the model. The algorithm is trained and tested on the Pascal VOC2007 dataset. The experimental results show that the mean average precision over the objects in the dataset is 73.87%. Compared with the mainstream Faster R-CNN and SSD detection models, the mean average precision of the DenseNet-based object detection algorithm is improved by 5.63% and 3.86%, respectively.
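
The general idea of fusing deep and shallow features can be sketched as follows: a low-resolution deep map is upsampled to the size of a shallow map and the two are concatenated along the channel axis. This is a minimal pure-Python sketch, not the paper's fusion network; the function names, the 2x nearest-neighbour upsampling, and the toy feature maps are all assumptions.

```python
def upsample2x(channel):
    """Nearest-neighbour 2x upsampling of one channel (a list of rows)."""
    out = []
    for row in channel:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # duplicate the row to double the height
    return out


def fuse(shallow_channels, deep_channels):
    """Concatenate shallow channels with 2x-upsampled deep channels."""
    height = len(shallow_channels[0])
    width = len(shallow_channels[0][0])
    upsampled = [upsample2x(channel) for channel in deep_channels]
    for channel in upsampled:
        if len(channel) != height or len(channel[0]) != width:
            raise ValueError("spatial sizes do not match after upsampling")
    return shallow_channels + upsampled


# Toy data: one 4x4 shallow channel and one 2x2 deep channel (hypothetical).
shallow = [[[1, 2, 3, 4] for _ in range(4)]]
deep = [[[9, 8], [7, 6]]]
fused = fuse(shallow, deep)
print(len(fused), "channels of size", len(fused[0]), "x", len(fused[0][0]))
```

The fused tensor keeps the shallow map's spatial resolution while gaining the deep map's channels, which is the property the abstract attributes to combining location and semantic information.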


Author(s):  
Zhenying Xu ◽  
Ziqian Wu ◽  
Wei Fan

Defect detection of electromagnetic luminescence (EL) cells is a core step in the production of solar cell modules, ensuring conversion efficiency and a long service life. However, because it lacks the capability to extract features of small defects, the traditional single shot multibox detector (SSD) algorithm does not achieve high accuracy in EL defect detection. Consequently, an improved SSD algorithm with modified feature fusion is proposed to improve the recognition rate of multi-class EL defects. An EL dataset containing images of four types of defects is established using rotation, denoising, and binarization. Drawing on the idea of feature pyramid networks, the proposed algorithm greatly improves the detection accuracy for small-scale defects. An experimental study on EL defect detection shows the effectiveness of the proposed algorithm. Moreover, a comparison study shows that the proposed method outperforms other traditional detection methods, such as SIFT, Faster R-CNN, and YOLOv3, in detecting EL defects.


Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 197
Author(s):  
Meng-ting Fang ◽  
Zhong-ju Chen ◽  
Krzysztof Przystupa ◽  
Tao Li ◽  
Michal Majka ◽  
...  

Examinations are a way to select talent, and a sound invigilation strategy can improve their fairness. To automatically detect abnormal behavior in the examination room, a method based on an improved YOLOv3 (the third version of the You Only Look Once algorithm) is proposed. YOLOv3 is improved using the K-Means algorithm, GIoU loss, focal loss, and Darknet32. In addition, a frame-alternate dual-thread method is used to optimize the detection process. The results show that the improved YOLOv3 algorithm improves both detection accuracy and detection speed, and that the frame-alternate dual-thread method greatly increases detection speed. The mean average precision (mAP) of the improved YOLOv3 algorithm on the test set reached 88.53%, and the detection speed reached 42 frames per second (FPS) with the frame-alternate dual-thread detection method. The results provide a reference for automated invigilation.
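
One of the listed components, GIoU loss, can be sketched for a single pair of axis-aligned boxes. GIoU subtracts from the IoU the fraction of the smallest enclosing box that is not covered by the union, and the loss is taken as 1 - GIoU. The `(x1, y1, x2, y2)` box format and the function names are assumptions; the abstract does not describe how the loss is wired into training.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2).

    Assumes both boxes have positive area.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Loss in [0, 2]: 0 for identical boxes, larger as boxes move apart."""
    return 1.0 - giou(box_a, box_b)


print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # disjoint boxes
```

Unlike plain IoU, which is 0 for every pair of disjoint boxes, GIoU still changes as disjoint boxes move closer or further apart, so the loss keeps providing a training signal.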


2021 ◽  
Vol 2035 (1) ◽  
pp. 012023
Author(s):  
Yuhao You ◽  
Houjin Chen ◽  
Yanfeng Li ◽  
Minjun Wang ◽  
Jinlei Zhu

Sensors ◽  
2020 ◽  
Vol 21 (1) ◽  
pp. 112
Author(s):  
Yuhang Liu ◽  
Jianxiao Ma ◽  
Yuchen Wang ◽  
Chenhong Zong

Pedestrian detection is widely used in cooperative vehicle infrastructure systems. Traditional pedestrian detection methods perform sufficiently well in sunny scenarios and yield trustworthy traffic data. However, detection performance decreases drastically in rainy scenarios. This study proposes a pedestrian detection algorithm with a de-raining module that improves detection accuracy under various rainy scenarios. Specifically, the algorithm determines the density information of the rain and effectively removes rain streaks through the de-raining module. The pedestrian detection module then detects each pedestrian as a pair of keypoints to solve the problem of occlusion. Furthermore, a new pedestrian dataset containing rain density labels is established and used to train the algorithm. For light, medium, and heavy rain, extensive experiments on synthetic datasets demonstrate that the proposed algorithm increases the AP (average precision) of pedestrian detection by 21.1%, 48.1%, and 60.9%, respectively. Moreover, the proposed algorithm performs well on real datasets and improves on state-of-the-art methods, which shows that it can significantly improve the accuracy of pedestrian detection in rainy scenarios.


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Liming Zhou ◽  
Haoxin Yan ◽  
Chang Zheng ◽  
Xiaohan Rao ◽  
Yahui Li ◽  
...  

Aircraft, as an indispensable means of transport, play an important role in military activities, so locating aircraft in remote sensing images is a significant task. However, current object detection methods suffer from low detection accuracy and a high missed-detection rate when applied to aircraft detection in remote sensing images. To address these problems, an object detection method for remote sensing images based on bidirectional and dense feature fusion is proposed to detect aircraft targets in complex environments. Building on the YOLOv3 detection framework, the method adds a feature fusion module that enriches the details of the feature map by mixing shallow features with deep features. Experimental results on the RSOD-DataSet and NWPU-DataSet indicate that the proposed method mitigates both problems. Meanwhile, the AP for aircraft increases by 1.57% compared with YOLOv3.


2018 ◽  
Vol 10 (1) ◽  
pp. 57-64 ◽  
Author(s):  
Rizqa Raaiqa Bintana ◽  
Chastine Fatichah ◽  
Diana Purwitasari

Community-based question answering (CQA) helps people find the information they need through a community. When people cannot find that information, they post a new question, which can cause the CQA archive to grow with duplicated questions. It is therefore important to find questions in the CQA archive that are semantically similar to a new question. In this study, we use a convolutional neural network for semantic modeling of sentences to obtain words that represent the content of the documents and of the new question. When retrieving questions semantically similar to a new question (query) from the archive of question-answer documents, the convolutional neural network method obtained a mean average precision of 0.422, whereas the vector space model, used as a comparison, obtained a mean average precision of 0.282. Index Terms: community-based question answering, convolutional neural network, question retrieval
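
For readers unfamiliar with the retrieval metric reported here, the following sketch computes average precision for one ranked result list and the mean over several queries. It assumes binary relevance labels and that every relevant document appears somewhere in the ranking; the function names and the example rankings are hypothetical and are not taken from the study.

```python
def average_precision(relevance):
    """Average precision of one ranked list of 0/1 relevance labels.

    Precision is sampled at the rank of each relevant item and averaged.
    Returns 0.0 when the list contains no relevant item.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-query average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Two hypothetical queries; 1 marks a semantically similar archived question.
rankings = [
    [1, 0, 1, 0],  # relevant results at ranks 1 and 3
    [0, 1],        # relevant result at rank 2
]
print(round(mean_average_precision(rankings), 4))
```

A ranking that places similar questions nearer the top scores higher, which is why the metric is suited to comparing the two retrieval models.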


2021 ◽  
pp. 1-11
Author(s):  
Tingting Zhao ◽  
Xiaoli Yi ◽  
Zhiyong Zeng ◽  
Tao Feng

YTNR (Yunnan Tongbiguan Nature Reserve) is located in the westernmost part of China's tropical regions and is the only area in China with the tropical biota of the Irrawaddy River system. The reserve has abundant tropical flora and fauna. To detect wild animals in this area in real time, this paper proposes an improved YOLO (You Only Look Once) network. The original YOLO model achieves high detection accuracy, but because of its complex structure it cannot reach a fast detection speed on a CPU platform. Therefore, the lightweight network MobileNet is introduced to replace the backbone feature extraction network in YOLO, which enables real-time detection on the CPU platform. Because wild animal image data are difficult to collect, the research team deployed 50 high-definition cameras in the study area and observed continuously for more than 1,000 hours. In the end, 1410 wildlife images collected in the field and 1577 wildlife images from the internet, manually annotated by domain experts, were used to construct the research dataset. Transfer learning is also introduced to address the insufficient training data and the resulting difficulty of fitting the network. The experimental results show that the model, trained on a training set containing 2419 animal images, achieves a mean average precision of 93.6% and 3.8 FPS (frames per second) on the CPU. Compared with YOLO, the mean average precision is increased by 7.7%, and the FPS value is increased by 3.
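
The abstract does not say which MobileNet variant or layer sizes were used. As a hedged illustration of why a MobileNet-style backbone is lighter, the sketch below compares the weight count of a standard convolution with that of the depthwise separable convolution on which MobileNet is built. The layer sizes (3x3 kernel, 256 input and output channels) and function names are assumptions, and biases are ignored.

```python
def standard_conv_weights(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out


def separable_conv_weights(kernel, c_in, c_out):
    """Weights in a depthwise separable convolution.

    Depthwise step: one kernel x kernel filter per input channel.
    Pointwise step: a 1x1 convolution mapping c_in to c_out channels.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 256 -> 256 channels.
standard = standard_conv_weights(3, 256, 256)
separable = separable_conv_weights(3, 256, 256)
print(standard, separable, round(standard / separable, 2))
```

For this assumed layer the separable form needs roughly one ninth of the weights, which indicates where the speed-up on a CPU comes from, although the actual gain depends on the full architecture.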


2012 ◽  
Vol 572 ◽  
pp. 338-342 ◽  
Author(s):  
Zhi Guo Liang ◽  
Quan Yang ◽  
Ke Xu ◽  
Fei He ◽  
Xiao Chen Wang ◽  
...  

Structured light 3D measurement technology has been widely used thanks to its simple structure, non-contact operation, fast measurement speed, and other advantages. Steel plate surface quality detection is no longer confined to two-dimensional gray-level features, and local topography measurement is becoming increasingly important for it. In this paper, a steel plate surface 3D detection method based on structured light is analyzed, together with the factors affecting its measurement accuracy. Several effective methods of improving 3D detection accuracy are put forward. Compared with traditional structured light 3D detection methods, the detection accuracy of the new methods is remarkably improved, giving them better application value.


Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1507
Author(s):  
Feiyu Zhang ◽  
Luyang Zhang ◽  
Hongxiang Chen ◽  
Jiangjian Xie

Deep convolutional neural networks (DCNNs) have achieved breakthrough performance on bird species identification using spectrograms of bird vocalizations. To address the imbalance of the bird vocalization dataset, a single feature identification model (SFIM) with residual blocks and a modified, weighted cross-entropy function was proposed. To further improve identification accuracy, two multi-channel fusion methods were built from three SFIMs. One fused the outputs of the feature extraction parts of the three SFIMs (feature fusion mode); the other fused the outputs of the classifiers of the three SFIMs (result fusion mode). The SFIMs were trained with three different kinds of spectrograms, calculated through the short-time Fourier transform, the mel-frequency cepstrum transform, and the chirplet transform, respectively. To cope with the large number of trainable model parameters, transfer learning was used in the multi-channel models. Using our own vocalization dataset as the sample set, the result fusion mode model outperforms the other proposed models, with the best mean average precision (MAP) reaching 0.914. Comparing three spectrogram durations, 100 ms, 300 ms, and 500 ms, shows that 300 ms is the best for our dataset. We suggest choosing the duration based on the duration distribution of bird syllables. When trained on the BirdCLEF2019 dataset, the highest classification mean average precision (cmAP) reached 0.135, which indicates that the proposed model has some generalization ability.
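
The result fusion mode can be pictured as combining the class scores of the three classifiers. The abstract does not state the fusion rule, so the sketch below assumes the simplest option, averaging the softmax probabilities of each model; the function names and the logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def result_fusion(per_model_logits):
    """Average the class probabilities predicted by several models.

    Averaging is an assumed fusion rule, not necessarily the paper's.
    """
    probs = [softmax(logits) for logits in per_model_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[c] for p in probs) / n_models for c in range(n_classes)]


# Hypothetical logits from three models (one per spectrogram type), 3 classes.
logits = [
    [2.0, 0.0, 0.0],  # STFT-based model favours class 0
    [0.0, 2.0, 0.0],  # mel-based model favours class 1
    [2.0, 0.0, 0.0],  # chirplet-based model favours class 0
]
fused = result_fusion(logits)
print(fused.index(max(fused)), [round(p, 3) for p in fused])
```

With this rule a class supported by two of the three models wins even when the third disagrees, which is one plausible reason for fusing at the classifier output.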

