IEPet: A Lightweight Multiscale Infrared Environmental Perception Network

2021 ◽  
Vol 2078 (1) ◽  
pp. 012063
Author(s):  
Xinhao Jiang ◽  
Wei Cai ◽  
Zhiyong Yang ◽  
Peiwei Xu ◽  
Qingjiang Dong

Abstract In recent years, the development of unmanned driving technology has demanded continuous progress in environment perception. Targeting infrared environment perception, a key research direction in unmanned driving, this paper proposes IEPet, a lightweight real-time detection network model. The model backbone adds a BottleneckCSP module and the proposed DCAP attention module, which significantly improve detection ability and spatial position perception while keeping the model lightweight. The model further improves detection accuracy with a 3-scale detection head. Comparative experiments on an unmanned driving data set show that, compared with the lightweight model YOLOv4-tiny, the proposed model raises the F1 score by 1.48% and the average detection accuracy by 6.37%, reaching 84.31%, while remaining lighter. These results show that the proposed IEPet model can better meet the performance required for infrared environment perception.

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 544
Author(s):  
Qiwei Xu ◽  
Hong Huang ◽  
Chuan Zhou ◽  
Xuefeng Zhang

Currently, infrared fault diagnosis relies mainly on manual inspection, which yields low detection efficiency. This paper proposes an improved YOLOv3 network for detecting the working state of substation high-voltage lead connectors. First, dilated convolution is introduced into the YOLOv3 backbone network to process low-resolution feature layers, enhancing the network’s extraction of image features, promoting feature propagation and reuse, and improving the recognition of small targets. Then, a fault detection model for infrared images of high-voltage lead connectors is created, and an optimal infrared image test data set is obtained through multi-scale training. Finally, the performance of the improved network model is evaluated on this data set. The test results show that the improved YOLOv3 model achieves an average detection accuracy of 84.26% for infrared image faults of high-voltage lead connectors, 4.58% higher than the original YOLOv3 network model, with an average detection time of 0.308 s, making it suitable for real-time detection in substations.
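The dilated-convolution idea the abstract relies on can be illustrated with a minimal 1-D sketch (plain NumPy; an illustrative toy, not the paper's actual network): inserting gaps between kernel taps enlarges the receptive field without adding parameters.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation=1):
    # Valid-mode 1-D convolution with a dilation factor: a dilation of d
    # inserts d-1 gaps between kernel taps, enlarging the receptive field
    # without adding any weights.
    k = len(kernel)
    span = (k - 1) * dilation + 1          # effective receptive field
    out_len = len(x) - span + 1
    return np.array([
        sum(kernel[j] * x[i + j * dilation] for j in range(k))
        for i in range(out_len)
    ])

x = np.arange(10, dtype=float)
k3 = np.array([1.0, 1.0, 1.0])
y1 = dilated_conv1d(x, k3, dilation=1)     # receptive field 3
y2 = dilated_conv1d(x, k3, dilation=2)     # receptive field 5, same 3 weights
```

The same three weights now span five samples, which is the effect exploited on low-resolution feature layers to better capture small targets.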


2021 ◽  
pp. 1-11
Author(s):  
Tingting Zhao ◽  
Xiaoli Yi ◽  
Zhiyong Zeng ◽  
Tao Feng

YTNR (Yunnan Tongbiguan Nature Reserve) is located in the westernmost part of China’s tropical regions and is the only area in China with the tropical biota of the Irrawaddy River system. The reserve has abundant tropical flora and fauna resources. To realize real-time detection of wild animals in this area, this paper proposes an improved YOLO (You Only Look Once) network. The original YOLO model achieves high detection accuracy, but its complex structure prevents fast detection on a CPU platform. Therefore, the lightweight network MobileNet is introduced to replace the backbone feature extraction network in YOLO, enabling real-time detection on the CPU. Because wild animal image data are difficult to collect, the research team deployed 50 high-definition cameras in the study area and conducted continuous observations for more than 1,000 hours. In the end, this research combines 1410 wildlife images collected in the field with 1577 wildlife images from the internet to construct a research data set, annotated manually by domain experts. Transfer learning is also introduced to address the problems of insufficient training data and difficulty in fitting the network. The experimental results show that our model, trained on a set of 2419 animal images, achieves a mean average precision of 93.6% and an FPS (frames per second) of 3.8 on the CPU. Compared with YOLO, the mean average precision is increased by 7.7% and the FPS by 3.


2019 ◽  
Vol 8 (3) ◽  
pp. 6069-6076

Many computer vision applications need to detect moving objects in input video sequences. The main applications include traffic monitoring, visual surveillance, people tracking, and security. Among these, traffic monitoring is one of the most difficult tasks in real-time video processing. Many algorithms have been introduced to monitor traffic accurately, but in most cases the detection accuracy is low and the detection time is high, making the algorithms unsuitable for real-time applications. In this paper, a new technique to detect moving vehicles efficiently using a Modified Gaussian Mixture Model and Modified Blob Detection is proposed. The modified Gaussian mixture model generates the background from the overall probability of the complete data set and by calculating the required step size from the frame differences. The modified blob analysis is then used to classify proper moving objects. The simulation results show that the method accurately detects the target.
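The background-modeling idea can be sketched minimally with a single running Gaussian per pixel (a simplified stand-in for the paper's modified mixture model, so purely illustrative): each pixel keeps a mean and variance, and pixels that deviate strongly are flagged as foreground.

```python
import numpy as np

def update_background(mean, var, frame, lr=0.05):
    # Running single-Gaussian background model per pixel -- a simplified
    # stand-in for the paper's modified Gaussian mixture model.
    mean = (1 - lr) * mean + lr * frame
    var = (1 - lr) * var + lr * (frame - mean) ** 2
    return mean, var

def foreground_mask(mean, var, frame, k=2.5):
    # A pixel is foreground if it deviates more than k standard deviations.
    return np.abs(frame - mean) > k * np.sqrt(var + 1e-6)

# Learn a static background, then present a frame with a bright "vehicle".
bg = np.full((8, 8), 10.0)
mean, var = bg.copy(), np.ones_like(bg)
for _ in range(20):
    mean, var = update_background(mean, var, bg)
frame = bg.copy()
frame[2:4, 2:4] = 200.0                      # moving object enters the scene
mask = foreground_mask(mean, var, frame)
```

A blob-analysis stage would then group the connected foreground pixels of `mask` and filter them by size and shape to keep only plausible vehicles.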


2018 ◽  
Vol 189 ◽  
pp. 10023
Author(s):  
Wenhui Zhang ◽  
Wentong Wang ◽  
Shuang Zhao ◽  
Bin Sun

Compared with traditional statistical models such as the active shape model and the active appearance model, facial feature point localization methods based on deep learning have improved in accuracy and speed, but some problems remain. First, when a traditional deep neural network model targets a data set containing different face poses, it only performs preprocessing through initialized face alignment and does not consider, during feature extraction, the regularity of the feature point distribution corresponding to the face pose. Second, the traditional deep neural network model does not account for the feature space differences caused by the different position distributions of external contour points and internal organ points (such as the eyes, nose, and mouth), resulting in inconsistent detection accuracy and difficulty across feature points. To solve these problems, this paper proposes a convolutional neural network (CNN) based on a gray-edge-HOG (GEH) fusion feature.
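The HOG component of the proposed gray-edge-HOG fusion feature can be sketched for a single cell as follows (NumPy toy; the paper's exact feature pipeline is not specified here): gradients are binned by unsigned orientation and weighted by magnitude.

```python
import numpy as np

def hog_cell(patch, n_bins=9):
    # Orientation histogram for one HOG cell: gradients are binned by
    # (unsigned) angle, weighted by magnitude, then L2-normalized.
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    bins = (ang / (180.0 / n_bins)).astype(int) % n_bins
    hist = np.zeros(n_bins)
    for b, m in zip(bins.ravel(), mag.ravel()):
        hist[b] += m
    return hist / (np.linalg.norm(hist) + 1e-6)

# A patch whose intensity rises left-to-right has a pure horizontal
# gradient, so all the energy falls into the 0-degree bin.
patch = np.tile(np.arange(8.0), (8, 1))
h = hog_cell(patch)
```

In a full GEH pipeline, such per-cell histograms would be concatenated with grayscale and edge features before being fed to the CNN.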


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 8160
Author(s):  
Meijing Gao ◽  
Yang Bai ◽  
Zhilong Li ◽  
Shiyu Li ◽  
Bozhi Zhang ◽  
...  

In recent years, jellyfish outbreaks have frequently occurred in offshore areas worldwide, posing a significant threat to marine fisheries, tourism, coastal industry, and personal safety. Effective monitoring of jellyfish is a vital way to address these problems; however, optical detection of jellyfish is still at a primary stage. This paper therefore studies a jellyfish detection method based on convolutional neural network theory and digital image processing technology. Because the quality of underwater images directly affects the detection results, an underwater image preprocessing algorithm is studied first. The results show that image quality is better after applying three algorithms, namely prior-based defogging, adaptive histogram equalization, and multi-scale retinex enhancement, which is more conducive to detection. We establish a data set containing seven species of jellyfish as well as fish, with 2141 images in total. The YOLOv3 algorithm is used to detect jellyfish, and its feature extraction network Darknet53 is optimized to ensure real-time detection. In addition, label smoothing and a cosine annealing learning rate schedule are introduced during training. The experimental results show that the improved algorithms increase jellyfish detection accuracy while preserving detection speed. This paper lays a foundation for the construction of a real-time underwater jellyfish optical imaging monitoring system.
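The two training tricks mentioned, label smoothing and a cosine annealing learning rate, are standard and can be sketched in a few lines (illustrative hyperparameter values; the paper's settings are not given here):

```python
import math

def smooth_labels(one_hot, eps=0.1):
    # Label smoothing: move eps of the probability mass from the true
    # class to a uniform distribution over all n classes.
    n = len(one_hot)
    return [(1 - eps) * p + eps / n for p in one_hot]

def cosine_lr(step, total_steps, lr_max=1e-3, lr_min=1e-5):
    # Cosine annealing: smoothly decay lr_max to lr_min over training.
    cos = 0.5 * (1 + math.cos(math.pi * step / total_steps))
    return lr_min + (lr_max - lr_min) * cos

targets = smooth_labels([1.0, 0.0, 0.0, 0.0])   # ≈ [0.925, 0.025, 0.025, 0.025]
lr_start, lr_end = cosine_lr(0, 100), cosine_lr(100, 100)
```

Smoothing discourages over-confident logits on a small data set like this one, while the cosine schedule avoids abrupt learning-rate drops late in training.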


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4357 ◽  
Author(s):  
Babak Shahian Jahromi ◽  
Theja Tulabandhula ◽  
Sabri Cetin

Many sensor fusion frameworks have been proposed in the literature, using different combinations and configurations of sensors and fusion methods. Most of this work focuses on improving accuracy; the feasibility of implementing these frameworks in an autonomous vehicle is less explored. Some fusion architectures perform very well in lab conditions using powerful computational resources; however, in real-world applications they cannot be implemented in an embedded edge computer because of their high cost and computational demands. We propose a new hybrid multi-sensor fusion pipeline configuration that performs environment perception for autonomous vehicles, including road segmentation, obstacle detection, and tracking. The framework uses a proposed encoder-decoder-based Fully Convolutional Neural Network (FCNx) and a traditional Extended Kalman Filter (EKF) nonlinear state estimator, together with a configuration of camera, LiDAR, and radar sensors best suited to each fusion method. The goal of this hybrid framework is a cost-effective, lightweight, modular fusion system that is robust to sensor failure. FCNx improves road detection accuracy compared with benchmark models while maintaining the real-time efficiency required for an autonomous vehicle’s embedded computer. Tested on over 3K road scenes, our fusion algorithm outperforms baseline benchmark networks in various environment scenarios. Moreover, the algorithm was implemented in a vehicle and tested on actual sensor data collected from it, performing real-time environment perception.
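The EKF half of the hybrid pipeline follows the standard predict/update cycle; a minimal sketch (NumPy, with a toy linear constant-velocity model standing in for the paper's actual motion and measurement models):

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    # One predict/update cycle of an Extended Kalman Filter. f/h are the
    # (possibly nonlinear) motion and measurement models, F/H their
    # Jacobians evaluated at the current estimate.
    x_pred = f(x)
    P_pred = F @ P @ F.T + Q
    y = z - h(x_pred)                        # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_new = x_pred + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Toy usage: 1-D constant-velocity tracking (linear, so F and H are exact).
F = np.array([[1.0, 1.0], [0.0, 1.0]])       # pos += vel each step
H = np.array([[1.0, 0.0]])                   # we observe position only
Q, R = 0.01 * np.eye(2), np.array([[0.1]])
x, P = np.zeros(2), 10.0 * np.eye(2)
for t in range(1, 21):                       # true track: pos = t, vel = 1
    z = np.array([float(t)])
    x, P = ekf_step(x, P, z, lambda s: F @ s, F, lambda s: H @ s, H, Q, R)
```

For genuinely nonlinear radar or LiDAR measurement models, `h` and `H` would be replaced by the nonlinear function and its Jacobian; the cycle itself is unchanged.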


2021 ◽  
pp. 1-10
Author(s):  
Zhixiong Chen ◽  
Shengwei Tian ◽  
Long Yu ◽  
Liqiang Zhang ◽  
Xinyu Zhang

In recent years, research on object detection has intensified, and many object detection results are applied in daily life, greatly facilitating work and living. In this paper, we propose a more effective object detection neural network model, ENHANCE_YOLOV4. We studied the effects of several attention mechanisms on YOLOv4 and concluded that the spatial attention mechanism had the best effect. Therefore, building on previous studies, this paper introduces dilated convolution and 1×1 convolution into the spatial attention mechanism to expand the receptive field and combine channel information. Compared with CBAM and BAM, which are composed of spatial attention and channel attention, this improved spatial attention module reduces model parameters and improves detection capability. We built a new network model by embedding the improved spatial attention module at appropriate places in YOLOv4. Experiments show that the detection accuracy of this network structure increases by 0.8% on the VOC data set and by 7% on the COCO data set, with only a small increase in computational cost.
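One plausible reading of the improved spatial attention module, CBAM-style channel pooling followed by a dilated kernel and a sigmoid gate, can be sketched as follows (NumPy toy with assumed shapes; not the authors' exact module):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spatial_attention(feat, w, dilation=2):
    # CBAM-style spatial attention with a dilated kernel.
    # feat: (C, H, W) feature map; w: (2, k, k) kernel applied to the
    # channel-pooled (mean, max) maps; the result gates every channel.
    pooled = np.stack([feat.mean(0), feat.max(0)])   # (2, H, W)
    k = w.shape[-1]
    pad = (k - 1) * dilation // 2
    p = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    H, W = feat.shape[1:]
    att = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            for c in range(2):
                for u in range(k):
                    for v in range(k):
                        att[i, j] += w[c, u, v] * p[c, i + u * dilation, j + v * dilation]
    mask = sigmoid(att)                              # (H, W) gate in (0, 1)
    return feat * mask                               # broadcast over channels

feat = np.ones((4, 5, 5))                            # toy (C, H, W) feature map
w = np.zeros((2, 3, 3))                              # untrained kernel -> mask = 0.5
out = spatial_attention(feat, w)
```

The dilation widens the spatial context seen by each gate value without extra weights, while summing over the two pooled channels plays the channel-combining role the abstract attributes to the 1×1 convolution.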


Fractals ◽  
2020 ◽  
Vol 28 (08) ◽  
pp. 2040021
Author(s):  
GAOYUAN CUI ◽  
BIN ZHANG ◽  
RODRIGUES MARLENE

This paper focuses on the design of badminton robots, designing high-precision binocular stereo vision synchronous acquisition hardware and multithreaded acquisition programs to ensure that the left and right camera exposures are synchronized and data are read in time. For specific weak moving targets, a fractal Brownian motion model with dynamic threshold adjustment based on singular value decomposition is proposed, and a discriminative threshold is set according to the similarity between background and foreground to improve detection accuracy. The three-dimensional trajectory points are extended by a Kalman filter, the kinematics equation of the shuttlecock is established, and the parameters of the kinematics equation are solved by the method of least squares. Based on the fractal Brownian motion algorithm, a real-time pose estimation algorithm is proposed to realize accurate real-time pose estimation of the robot. A PID control model relating the omnidirectional wheel speeds to the robot’s translation and rotation is established to achieve precise movement of the badminton robot. All the algorithms meet the system’s real-time requirements and enable the robot to perform simple hits; future research directions are also discussed.
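The least-squares fit of the kinematics equation can be sketched with a simplified drag-free flight model (an assumed form; the paper's actual shuttlecock model very likely includes air resistance):

```python
import numpy as np

# Drag-free vertical flight: z(t) = z0 + v0*t - 0.5*g*t**2.
# Recover the coefficients from sampled trajectory points by least squares.
g_true, v0_true, z0_true = 9.8, 6.0, 1.5
t = np.linspace(0.0, 1.0, 25)
z = z0_true + v0_true * t - 0.5 * g_true * t ** 2

A = np.stack([np.ones_like(t), t, t ** 2], axis=1)   # columns: 1, t, t^2
coef, *_ = np.linalg.lstsq(A, z, rcond=None)
z0_est, v0_est, g_est = coef[0], coef[1], -2.0 * coef[2]
```

In the robot, the same fit would run on Kalman-filtered 3-D trajectory points, and the recovered equation would be extrapolated to predict the interception point.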


2021 ◽  
Vol 12 ◽  
Author(s):  
Wenli Zhang ◽  
Yuxin Liu ◽  
Kaizhen Chen ◽  
Huibin Li ◽  
Yulin Duan ◽  
...  

In recent years, deep-learning-based fruit-detection technology has exhibited excellent performance in modern horticulture research. However, deploying deep learning algorithms in real-time field applications remains challenging, owing to the relatively low image processing capability of edge devices. Such limitations are becoming a new bottleneck and hindering the utilization of AI algorithms in modern horticulture. In this paper, we propose a lightweight fruit-detection algorithm specifically designed for edge devices. The algorithm is based on Light-CSPNet as the backbone network, an improved feature-extraction module, a down-sampling method, and a feature-fusion module, and it ensures real-time detection on edge devices while maintaining fruit-detection accuracy. The proposed algorithm was tested on three edge devices: NVIDIA Jetson Xavier NX, NVIDIA Jetson TX2, and NVIDIA Jetson NANO. The experimental results show that the average detection precisions of the proposed algorithm on the orange, tomato, and apple datasets are 0.93, 0.847, and 0.850, respectively. With the algorithm deployed, the NVIDIA Jetson Xavier NX reaches detection speeds of 21.3, 24.8, and 22.2 FPS on the three datasets, while the NVIDIA Jetson TX2 reaches 13.9, 14.1, and 14.5 FPS and the NVIDIA Jetson NANO reaches 6.3, 5.0, and 8.5 FPS. Additionally, the proposed algorithm provides a component add/remove function for flexibly adjusting the model structure, considering the trade-off between detection accuracy and speed in practical use.


2020 ◽  
Vol 10 (9) ◽  
pp. 3079 ◽  
Author(s):  
Yi-Qi Huang ◽  
Jia-Chun Zheng ◽  
Shi-Dan Sun ◽  
Cheng-Fu Yang ◽  
Jing Liu

In intelligent traffic systems, real-time and accurate detection of vehicles in images and video data is important and challenging work. Especially in scenes with complex backgrounds, varied vehicle models, and high density, it is difficult to accurately locate and classify vehicles in traffic flows. We therefore propose a single-stage deep neural network, YOLOv3-DL, based on the TensorFlow framework. The network structure is optimized by introducing the idea of spatial pyramid pooling, the loss function is redefined, and a weight regularization method is introduced, so that real-time detection and statistics of traffic flows can be implemented effectively. We use the DL-CAR data set for end-to-end network training and experiment with data sets under different scenarios and weather conditions. The experimental analyses show that the optimized algorithm improves vehicle detection accuracy on the test set by 3.86%. Experiments on test sets in different environments improve the detection accuracy rate by 4.53%, indicating that the algorithm is highly robust. At the same time, the detection accuracy and speed of the investigated algorithm exceed those of other algorithms, indicating higher detection performance.
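Spatial pyramid pooling, the structural idea borrowed here, pools the feature map on a pyramid of grids so that any input size yields a fixed-length output; a minimal sketch (NumPy; YOLO variants often approximate this with parallel max-pool layers of different kernel sizes instead):

```python
import numpy as np

def spp(feat, levels=(1, 2, 4)):
    # Spatial pyramid pooling: max-pool the (C, H, W) feature map on
    # 1x1, 2x2, and 4x4 grids, then concatenate the per-channel maxima.
    # The output length C * (1 + 4 + 16) is independent of H and W.
    C, H, W = feat.shape
    out = []
    for n in levels:
        hs = np.linspace(0, H, n + 1).astype(int)
        ws = np.linspace(0, W, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = feat[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                out.append(cell.max(axis=(1, 2)))
    return np.concatenate(out)

rng = np.random.default_rng(0)
a = spp(rng.standard_normal((3, 13, 9)))     # different spatial sizes...
b = spp(rng.standard_normal((3, 7, 7)))      # ...same output length
```

Mixing pooling scales this way lets the detector combine local and near-global context, which is what helps with the dense, multi-scale vehicles described above.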

