Integration Between Cascade Region-Based Convolutional Neural Network and Bi-Directional Feature Pyramid Network for Live Object Tracking and Detection

2021 ◽  
Vol 38 (4) ◽  
pp. 1253-1257
Author(s):  
Lehai Zhong ◽  
Jiao Li ◽  
Feifan Zhou ◽  
Xiaoan Bao ◽  
Weiyin Xing ◽  
...  

Current target tracking and detection algorithms often produce false and missed detections when the target is occluded or small. To overcome these defects, this paper integrates a bi-directional feature pyramid network (BiFPN) into the cascade region-based convolutional neural network (Cascade R-CNN) for live object tracking and detection. Specifically, the BiFPN structure connects feature maps across scales and fuses weighted features more efficiently, which enhances the network’s feature extraction ability and improves detection of occluded and small targets. The proposed method, Cascade R-CNN fused with BiFPN, was compared with target detection algorithms such as Cascade R-CNN and the single shot detector (SSD) on a video frame dataset of wild animals. Our method achieved a mean average precision (mAP) of 91%, higher than that of SSD and Cascade R-CNN. In addition, our method took only 0.42 s to detect each image, realizing real-time detection. Experimental results show that the proposed live object tracking and detection model adapts well to complex detection environments and achieves an excellent detection effect.
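The abstract does not state the exact fusion rule. As a minimal sketch, the code below assumes the "fast normalized fusion" commonly associated with BiFPN, in which each input feature map receives a learnable non-negative weight and the output is their normalized weighted sum. The function name, the `EPS` value, and the toy inputs are illustrative assumptions, not taken from the paper.

```python
from typing import List, Sequence

EPS = 1e-4  # small constant to avoid division by zero (assumed value)


def fast_normalized_fusion(
    features: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Fuse same-sized (flattened) feature maps with normalized weights."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    # Clip at zero (ReLU) so every weight is non-negative before normalizing.
    clipped = [max(0.0, w) for w in weights]
    total = sum(clipped) + EPS
    length = len(features[0])
    fused = [0.0] * length
    for feat, w in zip(features, clipped):
        if len(feat) != length:
            raise ValueError("feature maps must share one shape")
        for i, value in enumerate(feat):
            fused[i] += (w / total) * value
    return fused


# Two flattened feature maps already resized to the same resolution (toy values).
top_down = [1.0, 2.0, 3.0]
lateral = [3.0, 2.0, 1.0]
fused = fast_normalized_fusion([top_down, lateral], weights=[1.0, 1.0])
```

With equal weights the result is close to the element-wise mean of the two inputs; in a trained network the weights would be learned so that more informative scales contribute more.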

2020 ◽  
Vol 2020 ◽  
pp. 1-22
Author(s):  
Xiaoran Feng ◽  
Liyang Xiao ◽  
Wei Li ◽  
Lili Pei ◽  
Zhaoyun Sun ◽  
...  

Pavement damage is the main factor affecting road performance. Pavement cracking, a common type of road damage, is a key challenge in road maintenance. To achieve accurate crack classification, segmentation, and geometric parameter calculation, this paper proposes a pavement crack identification method based on a deep convolutional neural network fusion model, which combines the advantages of the multitarget single-shot multibox detector (SSD) convolutional neural network model and the U-Net model. First, the crack classification and detection model is applied to classify the cracks and obtain the detection confidence. Next, the crack segmentation network is applied to accurately segment the pavement cracks. Improving the feature extraction structure and optimizing the hyperparameters of the model raised both classification and segmentation accuracy. Finally, the length and width (for linear cracks) and the area (for alligator cracks) are calculated from the segmentation results. Test results show that the recognition accuracy of the method for transverse, longitudinal, and alligator cracks is 86.8%, 87.6%, and 85.5%, respectively. The proposed method thus provides the category of each pavement crack together with accurate position and geometric parameter information, which can be used directly to evaluate the pavement condition.
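The abstract does not say how length, width, and area are measured from the segmentation output. The sketch below is one crude possibility, assuming a binary mask and a known ground resolution (the hypothetical `mm_per_pixel`): area is the pixel count scaled by pixel size, length is approximated by the longer side of the bounding box, and mean width is area divided by length. This approximation is only reasonable for roughly straight, axis-aligned cracks.

```python
from typing import Dict, List


def crack_geometry(mask: List[List[int]], mm_per_pixel: float) -> Dict[str, float]:
    """Estimate area, length, and mean width of a crack from a binary mask."""
    # Collect row and column indices of every crack pixel (non-zero entries).
    rows = [r for r, row in enumerate(mask) for v in row if v]
    cols = [c for row in mask for c, v in enumerate(row) if v]
    if not rows:
        return {"area_mm2": 0.0, "length_mm": 0.0, "mean_width_mm": 0.0}
    pixel_count = len(rows)
    box_height = max(rows) - min(rows) + 1
    box_width = max(cols) - min(cols) + 1
    # Assumption: the longer bounding-box side approximates the crack length.
    length_mm = max(box_height, box_width) * mm_per_pixel
    area_mm2 = pixel_count * mm_per_pixel ** 2
    return {
        "area_mm2": area_mm2,
        "length_mm": length_mm,
        "mean_width_mm": area_mm2 / length_mm,
    }


# Toy mask: a horizontal crack, two pixels thick and five pixels long.
toy_mask = [
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
]
geometry = crack_geometry(toy_mask, mm_per_pixel=2.0)
```

For alligator cracks only `area_mm2` would be meaningful; for linear cracks a skeleton-based length would be more accurate than the bounding-box shortcut used here.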


2021 ◽  
Vol 12 ◽  
Author(s):  
Zhenhui Zheng ◽  
Juntao Xiong ◽  
Huan Lin ◽  
Yonglin Han ◽  
Baoxia Sun ◽  
...  

The accurate detection of green citrus in natural environments is a key step in realizing the intelligent harvesting of citrus through robotics. At present, visual detection algorithms for green citrus in natural environments still have poor accuracy and robustness because the fruits and the background are similar in color. This study proposed a multi-scale convolutional neural network (CNN) named YOLO BP to detect green citrus in natural environments. First, the backbone network, CSPDarknet53, was trimmed to extract high-quality features and improve the real-time performance of the network. Then, by removing the redundant nodes of the Path Aggregation Network (PANet) and adding extra connections, a bi-directional feature pyramid network (Bi-PANet) was proposed to fuse the multilayer features efficiently. Finally, three groups of green citrus detection experiments were designed to evaluate the network performance. The results showed that the accuracy, recall, mean average precision (mAP), and detection speed of YOLO BP were 86%, 91%, 91.55%, and 18 frames per second (FPS), respectively, which were 2%, 7%, 4.3%, and 1 FPS higher than those of YOLO v4. The proposed detection algorithm showed strong robustness and high accuracy in the complex orchard environment, providing technical support for green fruit detection in natural environments.
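The abstract reports accuracy and recall without defining them. As a minimal sketch, the code below assumes the standard detection definitions of precision and recall computed from true-positive, false-positive, and false-negative counts. The counts in the example are invented for illustration and are not results from the paper.

```python
from typing import Tuple


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Return (precision, recall) from detection counts.

    tp: detections matched to a ground-truth fruit
    fp: detections with no matching ground truth
    fn: ground-truth fruits that were not detected
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy counts (illustrative only): 8 correct detections, 2 false alarms, 2 misses.
precision, recall = precision_recall(tp=8, fp=2, fn=2)
```

Mean average precision additionally averages precision over recall levels and classes, which requires the ranked detection list rather than aggregate counts.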


2021 ◽  
Vol 13 (10) ◽  
pp. 1953
Author(s):  
Seyed Majid Azimi ◽  
Maximilian Kraus ◽  
Reza Bahmanyar ◽  
Peter Reinartz

In this paper, we address various challenges in multi-pedestrian and vehicle tracking in high-resolution aerial imagery through an intensive evaluation of a number of traditional and Deep Learning based Single- and Multi-Object Tracking methods. We also describe our proposed Deep Learning based Multi-Object Tracking method, AerialMPTNet, which fuses appearance, temporal, and graphical information using a Siamese Neural Network, a Long Short-Term Memory, and a Graph Convolutional Neural Network module for more accurate and stable tracking. Moreover, we investigate the influence of Squeeze-and-Excitation layers and Online Hard Example Mining on the performance of AerialMPTNet. To the best of our knowledge, we are the first to use these two for regression-based Multi-Object Tracking. Additionally, we study and compare the L1 and Huber loss functions. In our experiments, we extensively evaluate AerialMPTNet on three aerial Multi-Object Tracking datasets, namely the AerialMPT and the KIT AIS pedestrian and vehicle datasets. Qualitative and quantitative results show that AerialMPTNet outperforms all previous methods on the pedestrian datasets and achieves competitive results on the vehicle dataset. In addition, the Long Short-Term Memory and Graph Convolutional Neural Network modules enhance the tracking performance. Squeeze-and-Excitation and Online Hard Example Mining help significantly in some cases but degrade the results in others. According to the results, the L1 loss yields better results than the Huber loss in most scenarios. The presented results provide deep insight into the challenges and opportunities of the aerial Multi-Object Tracking domain, paving the way for future research.
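For readers unfamiliar with the two regression losses compared above, the sketch below gives their standard textbook definitions. The `delta` threshold of 1.0 and the toy values are assumptions for illustration; the abstract does not state the settings used for AerialMPTNet.

```python
from typing import Sequence


def l1_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error: grows linearly with the size of each error."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def huber_loss(
    pred: Sequence[float], target: Sequence[float], delta: float = 1.0
) -> float:
    """Quadratic for errors up to delta, linear beyond it."""
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err ** 2
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


# One small error (0.5) and one large error (3.0) against a zero target.
pred = [0.5, 3.0]
target = [0.0, 0.0]
l1_value = l1_loss(pred, target)        # (0.5 + 3.0) / 2
huber_value = huber_loss(pred, target)  # (0.125 + 2.5) / 2
```

The Huber loss down-weights small errors relative to L1 (0.125 versus 0.5 for the first element here), while both grow linearly for large errors.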


Author(s):  
Fei Rong ◽  
Li Shasha ◽  
Xu Qingzheng ◽  
Liu Kun

A station logo is a way for a TV station to claim copyright. Identifying the station logo supports the analysis and understanding of the video and helps ensure that the broadcast TV signal is not illegally interfered with. In this paper, we design a station logo detection method based on a Convolutional Neural Network that exploits the characteristics of station logos, such as small changes in scale-to-height ratio and a relatively fixed position. First, to preprocess the station logo data and extract features, video samples are collected, filtered, split into frames, labeled, and processed. Then, the sample data are divided proportionally into training and test sets to train the station logo detection model. Finally, the test samples are used to evaluate how the trained model performs in practice. Simulation experiments confirm the validity of the method.
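The proportional division into training and test data could look like the sketch below. The 80/20 ratio, the fixed seed, and the frame file names are assumptions for illustration; the abstract does not give the proportion actually used.

```python
import random
from typing import List, Sequence, Tuple


def split_samples(
    samples: Sequence[str], train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[str], List[str]]:
    """Shuffle labeled frames and split them into training and test lists."""
    shuffled = list(samples)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical frame names extracted from the collected video samples.
frames = [f"frame_{i:03d}.png" for i in range(10)]
train_frames, test_frames = split_samples(frames)
```

When frames come from the same video, splitting by video rather than by frame would avoid near-duplicate images appearing in both sets.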


2020 ◽  
pp. 808-817
Author(s):  
Vinh Pham ◽  
Eunil Seo ◽  
Tai-Myoung Chung

Identifying threats contained within encrypted network traffic poses a great challenge to Intrusion Detection Systems (IDS). Because traditional approaches such as deep packet inspection cannot operate on encrypted network traffic, machine learning-based IDS is a promising solution. However, machine learning-based IDS requires enormous amounts of statistical data derived from network traffic flows as input, demands high computing power for processing, and is slow to detect intrusions. We propose a lightweight IDS that transforms raw network traffic into image representations. We begin by inspecting the characteristics of malicious network traffic in the CSE-CIC-IDS2018 dataset. We then adapt methods for effectively representing those characteristics as image data. A Convolutional Neural Network (CNN) based detection model is used to identify malicious traffic hidden within the image data. To demonstrate the feasibility of the proposed lightweight IDS, we conduct three simulations on two datasets that contain encrypted traffic with current network attack scenarios. The experimental results show that the proposed IDS achieves 95% accuracy with a reasonable detection time while requiring a relatively small amount of training data.
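The abstract does not describe how traffic is encoded as an image. One simple possibility, shown here purely as an assumed sketch, is to treat each raw byte as a grayscale pixel (0 to 255), truncate or zero-pad the byte stream to a fixed length, and reshape it into a square grid that a CNN can consume. The function name and the tiny `side` value are illustrative.

```python
from typing import List


def bytes_to_image(payload: bytes, side: int = 4) -> List[List[int]]:
    """Map raw bytes to a side x side grid of grayscale pixel values."""
    size = side * side
    # Truncate long payloads and zero-pad short ones to a fixed length.
    padded = payload[:size].ljust(size, b"\x00")
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


# Three bytes of (possibly encrypted) traffic mapped onto a 2 x 2 image.
image = bytes_to_image(b"\x01\x02\x03", side=2)
```

A fixed image size keeps the CNN input shape constant regardless of packet or flow length, at the cost of discarding bytes beyond the first `side * side`.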

