A Real-Time Object Detector for Autonomous Vehicles Based on YOLOv4

Object detection is an important part of autonomous driving technology. To ensure that vehicles run safely at high speed, all objects on the road must be detected accurately and in real time. How to balance detection speed and accuracy has been a hot research topic in recent years. This paper puts forward a one-stage object detection algorithm based on YOLOv4 that improves detection accuracy while supporting real-time operation. In the backbone, the number of times the last residual block of CSPDarkNet53 is stacked is doubled. In the neck, the SPP is replaced with the RFB structure and the PAN structure of the feature fusion module is improved. The CBAM and CA attention mechanisms are added to the backbone and neck, and finally the overall width of the network is reduced to 3/4 of the original to reduce the model parameters and improve the inference speed. Compared with YOLOv4, the proposed algorithm improves the average accuracy by 2.06% on the KITTI dataset and by 2.95% on the BDD dataset. With detection accuracy almost unchanged, the inference speed of the algorithm is increased by 9.14%, and it can detect in real time at a speed of more than 58.47 FPS.
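
The width reduction described in this abstract can be illustrated with a minimal sketch. The per-stage channel counts below are the commonly cited CSPDarkNet53 stage widths, and the rounding to a multiple of 8 is a frequent convention; both are assumptions here, since the abstract states only the 3/4 factor.

```python
def scale_channels(channels, width_mult=0.75, divisor=8):
    """Scale each channel count by width_mult, rounded to a multiple of divisor.

    The divisor-based rounding is an assumed convention, not taken from the abstract.
    """
    scaled = []
    for c in channels:
        rounded = int(c * width_mult + divisor / 2) // divisor * divisor
        scaled.append(max(divisor, rounded))
    return scaled


# Assumed stage widths of a CSPDarkNet53-style backbone (hypothetical input).
stage_channels = [64, 128, 256, 512, 1024]
scaled_channels = scale_channels(stage_channels)

# A convolution's weight count is proportional to c_in * c_out, so scaling
# both by 3/4 leaves (3/4)^2 of the weights in that layer.
conv_weight_ratio = 0.75 ** 2

print(scaled_channels)
print(conv_weight_ratio)
```

Because both the input and output widths of most layers shrink, the parameter count falls roughly with the square of the width factor rather than linearly.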


Rapid Foreign Object Detection System on Seaweed Using VNIR Hyperspectral Imaging

Sensors ◽

10.3390/s21165279 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5279

Author(s):

Dong-Hoon Kwak ◽

Guk-Jin Son ◽

Mi-Kyung Park ◽

Young-Duk Kim

Keyword(s):

Object Detection ◽

Real Time ◽

Hyperspectral Imaging ◽

High Speed ◽

Detection System ◽

Imaging Techniques ◽

Foreign Object ◽

Subtraction Method ◽

Foreign Objects ◽

Time Operation

The consumption of seaweed is increasing year by year worldwide, so foreign object inspection of seaweed is becoming increasingly important. Seaweed is mixed with various materials such as laver and Sargassum fusiforme, so its color varies even within the same seaweed. In addition, the surface is uneven and greasy, which frequently causes diffuse reflections. For these reasons, it is difficult to detect foreign objects in seaweed, and the accuracy of conventional foreign object detectors used in real manufacturing sites is less than 80%. Real-time inspection must also be supported: since seaweed is mass-produced, rapid inspection is essential. However, hyperspectral imaging techniques are generally not suitable for high-speed inspection. In this study, we overcome this limitation by using dimensionality reduction and simplified operations. To improve accuracy, the proposed algorithm is carried out in two stages. First, the subtraction method is used to clearly distinguish seaweed from the conveyor belt and to detect some foreign objects that are relatively easy to detect. Second, a standardization inspection is performed based on the result of the subtraction method. During this process, the proposed scheme adopts simple, low-cost calculations such as subtraction, division, and one-by-one matching, which achieves both accuracy and low latency. In the performance evaluation experiment, 60 normal seaweeds and 60 seaweeds containing foreign objects were used, and the accuracy of the proposed algorithm was 95%. Finally, by implementing the proposed algorithm as a foreign object detection platform, we confirmed that real-time operation in rapid inspection is possible and that deployment in real manufacturing sites is feasible.
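
A two-stage inspection of this general shape can be sketched as follows. The abstract does not give the actual bands, thresholds, or standardization rule, so everything concrete here is hypothetical: stage 1 thresholds the difference of two band values, and stage 2 computes a z-score on a third band and flags outliers.

```python
from statistics import mean, pstdev


def subtraction_mask(pixels, band_a, band_b, threshold):
    """Stage 1 (assumed form): foreground where |band_a - band_b| exceeds threshold."""
    return [abs(p[band_a] - p[band_b]) > threshold for p in pixels]


def standardization_flags(pixels, mask, band, z_limit):
    """Stage 2 (assumed form): z-score foreground pixels on one band, flag outliers."""
    values = [p[band] for p, keep in zip(pixels, mask) if keep]
    if not values:
        return [False] * len(pixels)
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [False] * len(pixels)
    return [keep and abs(p[band] - mu) / sigma > z_limit
            for p, keep in zip(pixels, mask)]


# Hypothetical 3-band pixels: one conveyor-belt pixel, five seaweed pixels,
# and one foreign object that resembles seaweed on bands 0 and 1 only.
pixels = [
    (0.50, 0.52, 0.50),
    (0.20, 0.60, 0.30),
    (0.22, 0.61, 0.31),
    (0.19, 0.58, 0.29),
    (0.21, 0.60, 0.30),
    (0.20, 0.59, 0.30),
    (0.25, 0.62, 0.90),
]
mask = subtraction_mask(pixels, band_a=0, band_b=1, threshold=0.2)
flags = standardization_flags(pixels, mask, band=2, z_limit=2.0)
print(mask)
print(flags)
```

Both stages use only subtraction, division, and comparisons per pixel, which is the kind of low-cost arithmetic the abstract credits for its low latency.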


A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

Applied Sciences ◽

10.3390/app11136016 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6016

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Detection Algorithm ◽

Detection Accuracy ◽

Cloud Data ◽

Detection Techniques ◽

Bounding Boxes ◽

Partially Occluded ◽

RGB Image

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate the detection accuracy in the case of integration of PCD BEV representations is superior to when only an RGB camera is used. In addition, robustness is improved by significantly enhancing detection accuracy even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.
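
The merging step in this abstract relies on non-maximum suppression (NMS). Below is a minimal sketch of standard greedy NMS applied to the pooled outputs of two detectors. The box format `(x1, y1, x2, y2, score)`, the 0.5 IoU threshold, and the sample detections are assumptions, and the boxes from both branches are assumed to be already expressed in a common image frame.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping it, repeat."""
    remaining = sorted(detections, key=lambda d: d[4], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [d for d in remaining if iou(best, d) <= iou_threshold]
    return kept


# Hypothetical detections from the RGB branch and the BEV height-map branch.
rgb_dets = [(10, 10, 50, 50, 0.90), (100, 100, 150, 150, 0.60)]
bev_dets = [(12, 12, 52, 52, 0.80), (200, 200, 240, 240, 0.70)]
merged = nms(rgb_dets + bev_dets)
print(merged)
```

Here the two overlapping boxes near (10, 10) collapse into the higher-scoring one, while the box seen only by the second branch survives, which is how a parallel detector can recover objects the RGB branch misses.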


Research on Lightweight Infrared Pedestrian Detection Model Algorithm for Embedded Platform

Security and Communication Networks ◽

10.1155/2021/1549772 ◽

2021 ◽

Vol 2021 ◽

pp. 1-7

Author(s):

Zhaoli Wu ◽

Xin Wang ◽

Chao Chen

Keyword(s):

Real Time ◽

Target Detection ◽

Pedestrian Detection ◽

Infrared Image ◽

Far Infrared ◽

Detection Algorithm ◽

Model Parameters ◽

Detection Accuracy ◽

Detection Model ◽

Embedded Platform

Due to limits on energy and power consumption, embedded platforms cannot meet the real-time requirements of far-infrared pedestrian detection algorithms. To solve this problem, this paper proposes a new real-time infrared pedestrian detection algorithm (RepVGG-YOLOv4, Rep-YOLO). It uses RepVGG to reconstruct the YOLOv4 backbone network, which reduces the number of model parameters and calculations and improves the speed of target detection; it uses spatial pyramid pooling (SPP) to obtain information from different receptive fields, which improves detection accuracy; and it uses channel pruning to reduce redundant parameters, model size, and computational complexity. The experimental results show that, compared with the YOLOv4 target detection algorithm, the Rep-YOLO algorithm reduces the model volume by 90% and the floating-point calculation by 93.4%, increases the inference speed by 4 times, and reaches a detection accuracy of 93.25% after compression.
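
Channel pruning can be sketched in a few lines. The abstract does not state the pruning criterion, so this example assumes a common one: channels are ranked by the magnitude of a per-channel scaling factor (for example, a batch-normalization scale), and the smallest fraction is removed. The scale values and the 50% ratio are hypothetical.

```python
def select_channels(scales, prune_ratio):
    """Return indices of channels kept after dropping the smallest |scale| values.

    Ranking by scale magnitude is an assumed criterion, not the paper's stated one.
    """
    n_prune = int(len(scales) * prune_ratio)
    order = sorted(range(len(scales)), key=lambda i: abs(scales[i]))
    pruned = set(order[:n_prune])
    return [i for i in range(len(scales)) if i not in pruned]


# Hypothetical per-channel scaling factors for one layer.
scales = [0.90, 0.02, 0.45, 0.01, 0.70, 0.03, 0.60, 0.05]
kept = select_channels(scales, prune_ratio=0.5)
print(kept)
```

In a real network the kept indices would then be used to slice the layer's filters and the matching input channels of the next layer.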


SSD7-FFAM: A Real-Time Object Detection Network Friendly to Embedded Devices from Scratch

Applied Sciences ◽

10.3390/app11031096 ◽

2021 ◽

Vol 11 (3) ◽

pp. 1096

Author(s):

Qing Li ◽

Yingcheng Lin ◽

Wei He

Keyword(s):

Object Detection ◽

Real Time ◽

Large Scale ◽

Feature Fusion ◽

Contextual Information ◽

Attention Mechanism ◽

Detection Accuracy ◽

Single Shot ◽

Feature Maps ◽

Embedded Devices

The high requirements for computing and memory are the biggest challenges in deploying existing object detection networks to embedded devices. Existing lightweight object detectors directly use lightweight neural network architectures such as MobileNet or ShuffleNet pre-trained on large-scale classification datasets, which results in poor network structure flexibility and makes them unsuitable for some specific scenarios. In this paper, we propose a lightweight object detection network, Single-Shot MultiBox Detector (SSD)7-Feature Fusion and Attention Mechanism (FFAM), which saves storage space and reduces the amount of calculation by reducing the number of convolutional layers. We offer a novel Feature Fusion and Attention Mechanism (FFAM) method to improve detection accuracy. First, the FFAM method fuses high-level feature maps rich in semantic information with low-level feature maps to improve detection accuracy for small objects. Second, a lightweight attention mechanism, formed by cascading channel and spatial attention modules, is employed to enhance the target’s contextual information and guide the network to focus on its easy-to-recognize features. The SSD7-FFAM achieves 83.7% mean Average Precision (mAP), 1.66 MB parameters, and 0.033 s average running time on the NWPU VHR-10 dataset. The results indicate that the proposed SSD7-FFAM is more suitable for deployment to embedded devices for real-time object detection.


Visible-Thermal Image Object Detection via the Combination of Illumination Conditions and Temperature Information

Remote Sensing ◽

10.3390/rs13183656 ◽

2021 ◽

Vol 13 (18) ◽

pp. 3656

Author(s):

Hang Zhou ◽

Min Sun ◽

Xiang Ren ◽

Xiuyuan Wang

Keyword(s):

Object Detection ◽

Feature Fusion ◽

A Priori ◽

Autonomous Driving ◽

Data Sources ◽

A Priori Knowledge ◽

Detection Accuracy ◽

Thermal Images ◽

Deep Learning Network ◽

Priori Knowledge

Object detection plays an important role in autonomous driving, disaster rescue, robot navigation, intelligent video surveillance, and many other fields. Nonetheless, visible images are poor under weak illumination conditions, and thermal infrared images are noisy and have low resolution. Consequently, neither of these two data sources yields satisfactory results when used alone. While some scholars have combined visible and thermal images for object detection, most did not consider the illumination conditions and the different contributions of diverse data sources to the results. In addition, few studies have made use of the temperature characteristics of thermal images. Therefore, in the present study, visible and thermal images are utilized as the dataset, and RetinaNet is used as the baseline to fuse features from different data sources for object detection. Moreover, a dynamic weight fusion method, which is based on channel attention according to different illumination conditions, is used in the fusion component, and the channel attention and a priori temperature mask (CAPTM) module is proposed; the CAPTM can be applied to a deep learning network as a priori knowledge and maximizes the advantage of temperature information from thermal images. The main innovations of the present research include the following: (1) the consideration of different illumination conditions and the use of different fusion parameters for different conditions in the feature fusion of visible and thermal images; (2) the dynamic fusion of different data sources in the feature fusion of visible and thermal images; (3) the use of temperature information as a priori knowledge (CAPTM) in feature extraction. To a certain extent, the proposed methods improve the accuracy of object detection at night or under other weak illumination conditions and with a single data source. Compared with the state-of-the-art (SOTA) method, the proposed method is found to achieve superior detection accuracy with an overall mean average precision (mAP) improvement of 0.69%, including an AP improvement of 2.55% for the detection of the Person category. The results demonstrate the effectiveness of the research methods for object detection, especially temperature information-rich object detection.


A Lightweight Object Detection Framework for Remote Sensing Images

Remote Sensing ◽

10.3390/rs13040683 ◽

2021 ◽

Vol 13 (4) ◽

pp. 683

Author(s):

Lang Huyan ◽

Yunpeng Bai ◽

Ying Li ◽

Dongmei Jiang ◽

Yanning Zhang ◽

...

Keyword(s):

Remote Sensing ◽

Object Detection ◽

Real Time ◽

Large Scale ◽

Feature Fusion ◽

Computational Cost ◽

Feature Representation ◽

Detection Accuracy ◽

Remote Sensing Images ◽

Low Level

Onboard real-time object detection in remote sensing images is a crucial but challenging task in this computation-constrained scenario. The task requires the algorithm not only to yield excellent performance but also to have limited time and space complexity. However, previous convolutional neural network (CNN) based object detectors for remote sensing images suffer from heavy computational cost, which hinders them from being deployed on satellites. Moreover, an onboard detector should detect objects at vastly different scales. To address these issues, we propose a lightweight one-stage multi-scale feature fusion detector called MSF-SNET for onboard real-time object detection in remote sensing images. Using the lightweight SNET as the backbone network reduces the number of parameters and the computational complexity. To strengthen the detection performance for small objects, three low-level features are extracted from the three stages of SNET, respectively. In the detection part, another three convolutional layers are designed to further extract deep features with rich semantic information for large-scale object detection. To improve detection accuracy, the deep features and low-level features are fused to enhance the feature representation. Extensive experiments and comprehensive evaluations are conducted on the openly available NWPU VHR-10 and DIOR datasets to evaluate the proposed method. Compared with other state-of-the-art detectors, the proposed detection framework has fewer parameters and calculations while maintaining consistent accuracy.


Lightweight Fruit-Detection Algorithm for Edge Computing Applications

Frontiers in Plant Science ◽

10.3389/fpls.2021.740936 ◽

2021 ◽

Vol 12 ◽

Author(s):

Wenli Zhang ◽

Yuxin Liu ◽

Kaizhen Chen ◽

Huibin Li ◽

Yulin Duan ◽

...

Keyword(s):

Deep Learning ◽

Real Time ◽

Sampling Method ◽

Feature Fusion ◽

Detection Algorithm ◽

Detection Accuracy ◽

Model Structure ◽

Excellent Performance ◽

Backbone Network ◽

Detection Technology

In recent years, deep-learning-based fruit-detection technology has exhibited excellent performance in modern horticulture research. However, deploying deep learning algorithms in real-time field applications is still challenging, owing to the relatively low image processing capability of edge devices. Such limitations are becoming a new bottleneck and hindering the utilization of AI algorithms in modern horticulture. In this paper, we propose a lightweight fruit-detection algorithm specifically designed for edge devices. The algorithm uses Light-CSPNet as the backbone network, together with an improved feature-extraction module, a down-sampling method, and a feature-fusion module, and it ensures real-time detection on edge devices while maintaining the fruit-detection accuracy. The proposed algorithm was tested on three edge devices: NVIDIA Jetson Xavier NX, NVIDIA Jetson TX2, and NVIDIA Jetson NANO. The experimental results show that the average detection precision of the proposed algorithm on the orange, tomato, and apple datasets is 0.93, 0.847, and 0.850, respectively. With the algorithm deployed, the detection speed on NVIDIA Jetson Xavier NX reaches 21.3, 24.8, and 22.2 FPS, while that on NVIDIA Jetson TX2 reaches 13.9, 14.1, and 14.5 FPS and that on NVIDIA Jetson NANO reaches 6.3, 5.0, and 8.5 FPS for the three datasets. Additionally, the proposed algorithm provides a component add/remove function to flexibly adjust the model structure, considering the trade-off between the detection accuracy and speed in practical usage.
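
The reported frame rates can be read as per-frame latency with a one-line conversion. The sketch below assumes the FPS values are listed in the same order as the datasets (orange, tomato, apple); the abstract says only "for the three datasets", so that mapping is an assumption.

```python
# FPS figures from the abstract; the dataset order is assumed to be
# orange, tomato, apple, matching the order of the precision figures.
fps_by_device = {
    "NVIDIA Jetson Xavier NX": {"orange": 21.3, "tomato": 24.8, "apple": 22.2},
    "NVIDIA Jetson TX2": {"orange": 13.9, "tomato": 14.1, "apple": 14.5},
    "NVIDIA Jetson NANO": {"orange": 6.3, "tomato": 5.0, "apple": 8.5},
}


def latency_ms(fps):
    """Average time per frame in milliseconds for a given frames-per-second rate."""
    return 1000.0 / fps


latency_by_device = {
    device: {name: round(latency_ms(fps), 1) for name, fps in per_dataset.items()}
    for device, per_dataset in fps_by_device.items()
}

for device, per_dataset in latency_by_device.items():
    print(device, per_dataset)
```

Viewed this way, the gap between devices is easier to compare against a fixed per-frame time budget for a given field application.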


Lightweight Underwater Object Detection Based on YOLO v4 and Multi-Scale Attentional Feature Fusion

Remote Sensing ◽

10.3390/rs13224706 ◽

2021 ◽

Vol 13 (22) ◽

pp. 4706

Author(s):

Minghua Zhang ◽

Shubo Xu ◽

Wei Song ◽

Qi He ◽

Quanmiao Wei

Keyword(s):

Object Detection ◽

Feature Fusion ◽

Past Research ◽

Model Parameters ◽

Detection Accuracy ◽

Marine Environments ◽

Detection Techniques ◽

Model Size ◽

Underwater Object ◽

Small Targets

Underwater object detection is a challenging and attractive task in computer vision. Although object detection techniques have achieved good performance on general datasets, low visibility and color bias in the complex underwater environment lead to generally poor image quality; in addition, small targets and target aggregation reduce the extractable information, which makes it difficult to achieve satisfactory results. Past research on underwater object detection based on deep learning has mainly focused on improving detection accuracy by using large networks, and lightweight underwater object detection in marine environments has rarely received attention. This has resulted in large model sizes and slow detection speeds, whereas applying object detection technologies in marine environments requires better real-time and lightweight performance. In view of this, a lightweight underwater object detection method based on MobileNet v2, the You Only Look Once (YOLO) v4 algorithm, and attentional feature fusion is proposed to balance accuracy and speed for target detection in marine environments. In our work, a combination of MobileNet v2 and depth-wise separable convolution is proposed to reduce the number of model parameters and the size of the model. The Modified Attentional Feature Fusion (AFFM) module aims to better fuse semantic and scale-inconsistent features and to improve accuracy. Experiments indicate that the proposed method obtained a mean average precision (mAP) of 81.67% and 92.65% on the PASCAL VOC dataset and the brackish dataset, respectively, and reached a processing speed of 44.22 frames per second (FPS) on the brackish dataset. Moreover, the number of model parameters and the model size were compressed to 16.76% and 19.53% of those of YOLO v4, respectively, achieving a good tradeoff between time and accuracy for underwater object detection.
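
The parameter savings from depth-wise separable convolution follow from a simple count. The sketch below compares weight counts for one layer, ignoring biases; the layer shape (256 input channels, 512 output channels, 3x3 kernel) is a hypothetical example, not a layer taken from the paper.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depth-wise convolution plus a 1 x 1 point-wise convolution."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer shape, chosen only for illustration.
c_in, c_out, k = 256, 512, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = separable / standard  # equals 1/c_out + 1/k^2

print(standard, separable, round(ratio, 4))
```

For a 3x3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why replacing standard convolutions this way shrinks a model substantially.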


Research on Object Detection Algorithm Based on Multilayer Information Fusion

Mathematical Problems in Engineering ◽

10.1155/2020/9076857 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13

Author(s):

Bao-Yuan Chen ◽

Yu-Kun Shen ◽

Kun Sun

Keyword(s):

Feature Extraction ◽

Object Detection ◽

Feature Fusion ◽

Basic Feature ◽

Detection Algorithm ◽

Mean Average Precision ◽

Detection Accuracy ◽

Average Precision ◽

Position Information ◽

The Mean

At present, object detectors based on convolutional neural networks generally rely on the last layer of features extracted by the feature extraction network. During the repeated convolution and pooling of deep features, position information cannot be completely transferred to later layers. This paper proposes a multiscale feature reuse detection model, which includes the basic feature extraction network DenseNet, a feature fusion network, a multiscale anchor region proposal network, and a classification and regression network. The fusion of high-dimensional and low-dimensional features not only strengthens the model's sensitivity to objects of different sizes but also strengthens the transmission of information, so that the feature map has rich deep semantic information and shallow location information at the same time, which significantly improves the robustness and detection accuracy of the model. The algorithm is trained and tested on the Pascal VOC2007 dataset. The experimental results show that the mean average precision over the objects in the dataset is 73.87%. Compared with the mainstream Faster R-CNN and SSD detection models, the mean average precision of the DenseNet-based object detection algorithm is improved by 5.63% and 3.86%, respectively.
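
Mean average precision on Pascal VOC2007 is conventionally computed with 11-point interpolation. The abstract does not say which variant was used, so the sketch below assumes that convention. It also assumes each detection has already been matched to ground truth, so the input is a list of `(score, is_true_positive)` pairs; the sample detections and class names are hypothetical.

```python
def average_precision_11pt(detections, n_positives):
    """11-point interpolated AP from (score, is_true_positive) pairs."""
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        if is_hit:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_positives)
    total = 0.0
    for i in range(11):
        threshold = i / 10
        # Interpolated precision: best precision at any recall >= threshold.
        total += max((p for p, r in zip(precisions, recalls) if r >= threshold),
                     default=0.0)
    return total / 11


# Hypothetical matched detections for two classes, two ground-truth objects each.
per_class = {
    "car": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "person": ([(0.9, True), (0.6, True)], 2),
}
ap = {name: average_precision_11pt(dets, n) for name, (dets, n) in per_class.items()}
mean_ap = sum(ap.values()) / len(ap)
print(ap, mean_ap)
```

The mAP is simply the unweighted mean of the per-class AP values, so a gain of a few points can come from many classes improving slightly or a few improving a lot.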


IBMDA: Information based misbehavior detection algorithm for VANET

Journal of High Speed Networks ◽

10.3233/jhs-200638 ◽

2020 ◽

Vol 26 (3) ◽

pp. 185-207

Author(s):

Dinesh Singh ◽

Ranvijay ◽

Rama Shankar Yadav

Keyword(s):

Information Sharing ◽

High Speed ◽

Cluster Head ◽

Detection Algorithm ◽

Detection Accuracy ◽

Misbehavior Detection ◽

Incident Delay ◽

Event Information ◽

Safety Event ◽

On The Road

Sharing safety event information among vehicles in motion is the primary design goal of the vehicular ad hoc network (VANET). The shared safety event information helps vehicles avoid road accidents and driving inconvenience. The advantages of safety event information sharing in VANET have been blunted by the misbehavior of vehicles. Misbehavior such as the dissemination of false information or the reply of bogus messages can create traffic hazards on the road and may result in the loss of property and human lives. In VANET, detecting such misbehaving vehicles while keeping the time delay in flooding safety event information to others (i.e., the incident delay) to a minimum is challenging due to the high speed of vehicles. Forming a stable VANET topology is one feasible way, among many, to improve the performance of misbehavior detection and reduce the incident delay even at high vehicle speeds. In this paper, we propose an information based misbehavior detection algorithm (IBMDA) that works effectively in a stable cluster based VANET. The proposed IBMDA runs on the selected cluster head vehicles and verifies the content of received safety event messages. The identification of vehicles as malicious or non-malicious depends on the result of verification at the cluster heads. An illustrative example is given to explain the proposed algorithm. A highway scenario is considered to test the performance of the proposed IBMDA. The simulation is performed with a detailed comparative analysis using the ns-3 simulator. Under the considered scenario, the proposed algorithm improves the misbehavior detection accuracy by up to 6.46% and reduces the average incident delay by up to approximately 14.78% compared with existing algorithms.
