A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

2021 ◽  
Vol 11 (13) ◽  
pp. 6016
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded, employing RGB images and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” (YOLO) networks using an RGB image and a height map converted from the BEV representation of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate that the detection accuracy achieved by integrating PCD BEV representations is superior to that obtained when only an RGB camera is used. In addition, robustness is improved: detection accuracy is significantly enhanced even when the target objects are partially occluded from the frontal view, demonstrating that the proposed algorithm outperforms the conventional RGB-based model.
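A minimal sketch of the parallel-detector merge described above, assuming both YOLO streams emit boxes in a common frame as [x1, y1, x2, y2, score] arrays; the function names and threshold are illustrative, not the authors' implementation:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one [x1, y1, x2, y2] box against an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def merge_detections(rgb_dets, bev_dets, iou_thresh=0.5):
    """Pool boxes from both modalities, then suppress overlaps greedily.

    rgb_dets / bev_dets: (N, 5) arrays of [x1, y1, x2, y2, score], with
    BEV-stream boxes already mapped into the shared frame.
    """
    dets = np.vstack([rgb_dets, bev_dets])
    order = dets[:, 4].argsort()[::-1]           # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        overlaps = iou(dets[i, :4], dets[order[1:], :4])
        order = order[1:][overlaps < iou_thresh]  # drop adjacent duplicates
    return dets[keep]
```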

Author(s):  
Liang Peng ◽  
Hong Wang ◽  
Jun Li

The safety of the intended functionality (SOTIF) has become one of the hottest topics in the field of autonomous driving. However, no testing and evaluation system for SOTIF performance has been proposed yet. Therefore, this paper proposes a framework based on the advanced You Only Look Once (YOLO) algorithm and the mean Average Precision (mAP) method to evaluate the object detection performance of the camera under SOTIF-related scenarios. First, a dataset is established, which contains road images with extreme weather and adverse lighting conditions. Second, the Monte Carlo dropout (MCD) method is used to analyze the uncertainty of the algorithm and draw the uncertainty region of the predicted bounding box. Then, the confidence of the algorithm is calibrated based on the uncertainty results so that the average confidence after calibration better reflects the real accuracy. The uncertainty results and the calibrated confidence are proposed for use in online risk identification. Finally, the confusion matrix is extended according to the several possible mistakes that an object detection algorithm may make, and the mAP is calculated as an index for offline evaluation and comparison. This paper offers suggestions for applying the MCD method to complex object detection algorithms and for finding the relationship between the uncertainty and the confidence of the algorithm. The experimental results, verified in specific SOTIF scenarios, prove the feasibility and effectiveness of the proposed uncertainty acquisition approach for object detection algorithms, which provides a practical path toward addressing perception-related SOTIF risk for autonomous vehicles.
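As an illustration of the MCD step, the sketch below keeps dropout layers stochastic at inference time and aggregates repeated forward passes into a mean prediction and an uncertainty band; the single-box model interface is an assumption for brevity, not the paper's actual network:

```python
import torch

def mc_dropout_boxes(model, image, n_samples=20):
    """Estimate per-coordinate uncertainty of a predicted box via Monte
    Carlo dropout: re-enable only the dropout layers at inference and
    aggregate repeated stochastic forward passes.

    Assumes `model(image)` returns a (4,) tensor [x1, y1, x2, y2] for the
    top detection; multi-box outputs would first need detections matched
    across passes.
    """
    model.eval()
    for m in model.modules():                  # keep BatchNorm frozen,
        if isinstance(m, torch.nn.Dropout):    # make dropout stochastic
            m.train()
    with torch.no_grad():
        samples = torch.stack([model(image) for _ in range(n_samples)])
    mean_box = samples.mean(dim=0)             # point estimate
    std_box = samples.std(dim=0)               # uncertainty-region half-width
    return mean_box, std_box
```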


2021 ◽  
Vol 12 (1) ◽  
pp. 281
Author(s):  
Jaesung Jang ◽  
Hyeongyu Lee ◽  
Jong-Chan Kim

For safe autonomous driving, deep neural network (DNN)-based perception systems play essential roles, and a vast amount of driving images must be manually collected and labeled with ground truth (GT) for training and validation purposes. Observing the high cost and unavoidable human errors of manual GT generation, this study presents an open-source automatic GT generation tool, CarFree, based on the Carla autonomous driving simulator. In doing so, we aim to democratize the daunting task of object detection dataset generation in particular, which has been feasible only for big companies or institutes due to its high cost. CarFree comprises (i) a data extraction client that automatically collects relevant information from the Carla simulator’s server and (ii) post-processing software that produces precise 2D bounding boxes of vehicles and pedestrians on the gathered driving images. Our evaluation results show that CarFree can generate a considerable amount of realistic driving images along with their GTs in a reasonable time. Moreover, using synthesized training images with artificially created unusual weather and lighting conditions, which are difficult to obtain in real-world driving scenarios, CarFree significantly improves object detection accuracy in the real world, particularly in harsh environments. With CarFree, we expect its users to generate a variety of object detection datasets in hassle-free ways.
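The core of such post-processing, turning a simulator-provided 3D box into a 2D image box, can be sketched as follows; the matrix names and clipping policy are illustrative assumptions, and CarFree's actual occlusion handling is more involved:

```python
import numpy as np

def project_bbox_2d(vertices_world, world_to_cam, K, img_w, img_h):
    """Project the 8 corners of a 3D bounding box into the image and take
    the axis-aligned extent.

    vertices_world: (8, 3) box corners in world coordinates.
    world_to_cam:   (4, 4) extrinsic matrix (simulated camera pose).
    K:              (3, 3) camera intrinsic matrix.
    """
    pts = np.hstack([vertices_world, np.ones((8, 1))])   # homogeneous coords
    cam = (world_to_cam @ pts.T)[:3]                     # camera frame (3, 8)
    if (cam[2] <= 0).all():
        return None                                      # behind the camera
    pix = K @ cam
    pix = pix[:2] / pix[2]                               # perspective divide
    x1, y1 = pix.min(axis=1)
    x2, y2 = pix.max(axis=1)
    # Clip to the image; a robust version would also clip corners to the
    # viewing frustum and filter occluded boxes via depth/semantic maps.
    x1, x2 = np.clip([x1, x2], 0, img_w - 1)
    y1, y2 = np.clip([y1, y2], 0, img_h - 1)
    return (x1, y1, x2, y2) if x2 > x1 and y2 > y1 else None
```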


2021 ◽  
Vol 11 (24) ◽  
pp. 11630
Author(s):  
Yan Zhou ◽  
Sijie Wen ◽  
Dongli Wang ◽  
Jinzhen Mu ◽  
Irampaye Richard

Object detection is one of the key algorithms in automatic driving systems. Aiming to address the false detection and missed detection of both small and occluded objects in automatic driving scenarios, an improved Faster-RCNN object detection algorithm is proposed. First, deformable convolution and a spatial attention mechanism are used to improve the ResNet-50 backbone network and enhance the feature extraction of small objects; then, an improved feature pyramid structure is introduced to reduce the loss of features in the fusion process. Three cascade detectors are introduced to solve the problem of IOU (Intersection-Over-Union) threshold mismatch, and side-aware boundary localization is applied for bounding box regression. Finally, Soft-NMS (Soft Non-Maximum Suppression) is used to remove redundant bounding boxes and obtain the best results. The experimental results show that the improved Faster-RCNN better detects small and occluded objects, with accuracy 7.7% and 4.1% higher than that of the baseline on the eight categories selected from the COCO2017 and BDD100k datasets, respectively.
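A compact sketch of Gaussian Soft-NMS, the final step above: rather than discarding overlapping boxes outright, their scores are decayed, which helps retain partially occluded objects that classic NMS would suppress. Hyperparameter values are illustrative:

```python
import numpy as np

def iou(box, boxes):
    """IoU of one [x1, y1, x2, y2] box against an (N, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area = (box[2] - box[0]) * (box[3] - box[1])
    return inter / (area + areas - inter + 1e-9)

def soft_nms(dets, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores by exp(-iou^2 / sigma).

    dets: (N, 5) array of [x1, y1, x2, y2, score]; returns surviving boxes.
    """
    dets = dets.copy()
    keep = []
    while dets.shape[0] > 0:
        best = dets[:, 4].argmax()
        box = dets[best]
        keep.append(box)
        dets = np.delete(dets, best, axis=0)
        if dets.shape[0] == 0:
            break
        overlaps = iou(box[:4], dets[:, :4])
        dets[:, 4] *= np.exp(-(overlaps ** 2) / sigma)  # Gaussian decay
        dets = dets[dets[:, 4] > score_thresh]          # prune weak boxes
    return np.array(keep)
```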


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Rui Wang ◽  
Ziyue Wang ◽  
Zhengwei Xu ◽  
Chi Wang ◽  
Qiang Li ◽  
...  

Object detection is an important part of autonomous driving technology. To ensure the safe running of vehicles at high speed, real-time and accurate detection of all objects on the road is required. How to balance the speed and accuracy of detection has been a hot research topic in recent years. This paper puts forward a one-stage object detection algorithm based on YOLOv4, which improves detection accuracy and supports real-time operation. The backbone of the algorithm doubles the stacking times of the last residual block of CSPDarkNet53. The neck of the algorithm replaces the SPP with the RFB structure and improves the PAN structure of the feature fusion module; the attention mechanisms CBAM and CA are added to the backbone and neck; and finally the overall width of the network is reduced to 3/4 of the original, so as to reduce the model parameters and improve inference speed. Compared with YOLOv4, the algorithm in this paper improves the average accuracy on the KITTI dataset by 2.06% and on the BDD dataset by 2.95%. With detection accuracy almost unchanged, the inference speed of this algorithm is increased by 9.14%, enabling real-time detection at more than 58.47 FPS.
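For reference, a generic PyTorch sketch of the CBAM attention block named above (channel gating followed by spatial gating); the paper's exact placement, widths, and the companion CA block are not reproduced here:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by
    spatial attention, as commonly inserted into detector backbones/necks."""

    def __init__(self, channels, reduction=16, kernel_size=7):
        super().__init__()
        # Channel attention: shared MLP over avg- and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        # Spatial attention: 7x7 conv over channel-wise avg and max maps
        self.spatial = nn.Conv2d(2, 1, kernel_size,
                                 padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)                   # channel gating
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))         # spatial gating
```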


Author(s):  
Michael Person ◽  
Mathew Jensen ◽  
Anthony O. Smith ◽  
Hector Gutierrez

In order for autonomous vehicles to safely navigate roadways, accurate object detection must take place before safe path planning can occur. Currently, general-purpose object detection convolutional neural network (CNN) models have the highest detection accuracies of any method. However, there is a gap between the proposed detection frameworks: those that provide the high detection accuracy necessary for deployment do not perform inference in real time, and those that perform inference in real time suffer from low detection accuracy. We propose the multimodal fusion detection system (MFDS), a sensor fusion system that combines the speed of a fast image detection CNN model with the accuracy of light detection and ranging (LiDAR) point cloud data through a decision tree approach. The primary objective is to bridge the tradeoff between performance and accuracy. The motivation for MFDS is to reduce the computational complexity associated with using a CNN model to extract features from an image. To improve efficiency, MFDS extracts complementary features from the LiDAR point cloud in order to obtain comparable detection accuracy. MFDS is novel in not only using the image detections to aid three-dimensional (3D) LiDAR detection but also using the LiDAR data to jointly bolster the image detections and provide 3D detections. MFDS achieves 3.7% higher accuracy than the base CNN detection model and is able to operate at 10 Hz. Additionally, the memory requirement for MFDS is small enough to fit on the NVIDIA TX1 when deployed on an embedded device.
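A toy stand-in for the camera-LiDAR association at the heart of such a fusion scheme: project each LiDAR cluster centroid into the image and adopt the label of the 2D detection that contains it. The dictionary layout and matrix names are assumptions; MFDS's decision tree logic is richer than this containment test:

```python
import numpy as np

def fuse_camera_lidar(dets_2d, clusters, K, T_cam_lidar):
    """Associate LiDAR clusters with 2D CNN detections.

    dets_2d:  list of dicts {'box': [x1, y1, x2, y2], 'label': str,
              'score': float} from the image detector.
    clusters: list of (N_i, 3) LiDAR point arrays in the LiDAR frame.
    Returns 3D detections labeled by the image detector.
    """
    fused = []
    for pts in clusters:
        centroid = np.hstack([pts.mean(axis=0), 1.0])   # homogeneous
        cam = (T_cam_lidar @ centroid)[:3]
        if cam[2] <= 0:
            continue                       # cluster behind the camera
        u, v, _ = (K @ cam) / cam[2]       # pixel coordinates
        for det in dets_2d:
            x1, y1, x2, y2 = det['box']
            if x1 <= u <= x2 and y1 <= v <= y2:
                fused.append({'label': det['label'],
                              'score': det['score'],
                              'position_3d': cam})   # class from image,
                break                                # geometry from LiDAR
    return fused
```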


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2140
Author(s):  
De Jong Yeong ◽  
Gustavo Velasco-Hernandez ◽  
John Barry ◽  
Joseph Walsh

With the significant advancement of sensor and communication technology and the reliable application of obstacle detection techniques and algorithms, automated driving is becoming a pivotal technology that can revolutionize the future of transportation and mobility. Sensors are fundamental to the perception of vehicle surroundings in an automated driving system, and the use and performance of multiple integrated sensors can directly determine the safety and feasibility of automated driving vehicles. Sensor calibration is the foundation block of any autonomous system and its constituent sensors and must be performed correctly before sensor fusion and obstacle detection processes may be implemented. This paper evaluates the capabilities and the technical performance of sensors which are commonly employed in autonomous vehicles, primarily focusing on a large selection of vision cameras, LiDAR sensors, and radar sensors and the various conditions in which such sensors may operate in practice. We present an overview of the three primary categories of sensor calibration and review existing open-source calibration packages for multi-sensor calibration and their compatibility with numerous commercial sensors. We also summarize the three main approaches to sensor fusion and review current state-of-the-art multi-sensor fusion techniques and algorithms for object detection in autonomous driving applications. The current paper, therefore, provides an end-to-end review of the hardware and software methods required for sensor fusion object detection. We conclude by highlighting some of the challenges in the sensor fusion field and propose possible future research directions for automated driving systems.
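As a small worked example of the extrinsic side of calibration discussed above: calibrations are typically delivered as rotation/translation pairs, and fusion needs them composed into a single transform between sensor frames. A minimal sketch under those assumptions:

```python
import numpy as np

def make_extrinsic(R, t):
    """Assemble a 4x4 homogeneous transform from a calibrated 3x3 rotation
    matrix and 3-vector translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def lidar_to_camera(T_vehicle_lidar, T_vehicle_cam):
    """Compose per-sensor calibrations into the lidar->camera transform
    needed before projecting points for fusion:
        T_cam_lidar = inv(T_vehicle_cam) @ T_vehicle_lidar
    """
    return np.linalg.inv(T_vehicle_cam) @ T_vehicle_lidar

# Usage: points_cam = (T_cam_lidar @ points_lidar_homogeneous.T).T
```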


2021 ◽  
Vol 11 (8) ◽  
pp. 3531
Author(s):  
Hesham M. Eraqi ◽  
Karim Soliman ◽  
Dalia Said ◽  
Omar R. Elezaby ◽  
Mohamed N. Moustafa ◽  
...  

Extensive research efforts have been devoted to identifying and improving roadway features that impact safety. Maintaining roadway safety features relies on costly manual operations of regular road surveying and data analysis. This paper introduces an automatic roadway safety features detection approach, which harnesses the potential of artificial intelligence (AI) computer vision to make the process more efficient and less costly. Given a front-facing camera and a global positioning system (GPS) sensor, the proposed system automatically evaluates ten roadway safety features. The system is composed of an oriented (or rotated) object detection model, which solves an orientation encoding discontinuity problem to improve detection accuracy, and a rule-based roadway safety evaluation module. To train and validate the proposed model, a fully annotated dataset for roadway safety features extraction was collected, covering 473 km of roads. The baseline results of the proposed method are encouraging when compared with state-of-the-art models. Different oriented object detection strategies are presented and discussed, and the developed model improves the mean average precision (mAP) by 16.9% compared with the literature. The average prediction accuracy of the roadway safety features is 84.39% and ranges from 63.12% to 91.11%. The introduced model can pervasively enable/disable autonomous driving (AD) based on the safety features of the road, and empower connected vehicles (CV) to send and receive estimated safety features, alerting drivers about black spots or relatively less-safe segments or roads.
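One standard remedy for the orientation encoding discontinuity mentioned above is to regress (sin θ, cos θ) instead of the raw angle, so that targets near the ±π wrap-around stay numerically close; whether the paper uses exactly this encoding is not stated here, so treat this as a generic illustration:

```python
import numpy as np

def encode_angle(theta):
    """Encode an orientation as (sin, cos) so that angles across the
    +/- pi wrap-around map to nearby regression targets."""
    return np.array([np.sin(theta), np.cos(theta)])

def decode_angle(vec):
    """Recover the angle; atan2 handles all four quadrants."""
    return np.arctan2(vec[0], vec[1])

# Example: 179 deg and -179 deg differ by 358 deg as raw targets, but their
# (sin, cos) encodings are nearly identical, so the loss stays small:
a, b = np.deg2rad(179), np.deg2rad(-179)
print(np.linalg.norm(encode_angle(a) - encode_angle(b)))  # ~0.035
```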


Electronics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 1235
Author(s):  
Yang Yang ◽  
Hongmin Deng

In order to make the classification and regression of single-stage detectors more accurate, an object detection algorithm named Global Context You-Only-Look-Once v3 (GC-YOLOv3) is proposed based on You-Only-Look-Once (YOLO) in this paper. Firstly, a better cascading model with learnable semantic fusion between a feature extraction network and a feature pyramid network is designed to improve detection accuracy using a global context block. Secondly, the information to be retained is screened by combining three differently scaled feature maps together. Finally, a global self-attention mechanism is used to highlight the useful information of feature maps while suppressing irrelevant information. Experiments show that GC-YOLOv3 reaches a maximum object detection mean Average Precision (mAP)@0.5 of 55.5 on Common Objects in Context (COCO) 2017 test-dev, and that its mAP is 5.1% higher than that of the YOLOv3 algorithm on the Pascal Visual Object Classes (PASCAL VOC) 2007 test set. The experiments therefore indicate that the proposed GC-YOLOv3 model performs strongly on both the PASCAL VOC and COCO datasets.
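A generic PyTorch sketch of a global context block in the spirit of GCNet, the mechanism named above: a softmax-pooled global descriptor is transformed and broadcast-added back to every position. Layer sizes are illustrative:

```python
import torch
import torch.nn as nn

class GlobalContextBlock(nn.Module):
    """Global context block: softmax attention pooling over all positions,
    a bottleneck transform, and broadcast fusion back into the feature map."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.attn = nn.Conv2d(channels, 1, kernel_size=1)  # context weights
        hidden = max(channels // reduction, 1)
        self.transform = nn.Sequential(
            nn.Conv2d(channels, hidden, 1),
            nn.LayerNorm([hidden, 1, 1]),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, 1),
        )

    def forward(self, x):
        b, c, h, w = x.shape
        weights = torch.softmax(self.attn(x).view(b, 1, h * w), dim=-1)
        context = torch.bmm(x.view(b, c, h * w), weights.transpose(1, 2))
        context = context.view(b, c, 1, 1)        # (B, C, 1, 1) descriptor
        return x + self.transform(context)        # broadcast fusion
```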


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Hai Wang ◽  
Xinyu Lou ◽  
Yingfeng Cai ◽  
Yicheng Li ◽  
Long Chen

Vehicle detection is one of the most important environment perception tasks for autonomous vehicles. Traditional vision-based vehicle detection methods are not accurate enough, especially for small and occluded targets, while light detection and ranging (lidar)-based methods are good at detecting obstacles but are time-consuming and have a low classification rate for different target types. To address these shortcomings and make full use of both the depth information of lidar and the obstacle classification ability of vision, this work proposes a real-time vehicle detection algorithm that fuses vision and lidar point cloud information. Firstly, obstacles are detected by the grid projection method using the lidar point cloud information. Then, the obstacles are mapped to the image to get several separated regions of interest (ROIs). After that, the ROIs are expanded based on a dynamic threshold and merged to generate the final ROI. Finally, a deep learning method named You Only Look Once (YOLO) is applied on the ROI to detect vehicles. The experimental results on the KITTI dataset demonstrate that the proposed algorithm has high detection accuracy and good real-time performance. Compared with detection based only on YOLO deep learning, the mean average precision (mAP) is increased by 17%.
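The first stage, grid-projection obstacle detection, can be sketched as follows: rasterize the point cloud into a 2D grid and flag cells whose height spread exceeds a threshold, since flat ground yields a small spread. Cell size and ranges are illustrative, not the paper's settings:

```python
import numpy as np

def grid_obstacle_mask(points, cell=0.2, height_gap=0.3,
                       x_range=(0, 60), y_range=(-20, 20)):
    """Flag occupied grid cells from a lidar point cloud.

    points: (N, 3) array of [x, y, z] in the lidar frame.
    Returns a boolean (nx, ny) grid where True marks a probable obstacle.
    """
    nx = int((x_range[1] - x_range[0]) / cell)
    ny = int((y_range[1] - y_range[0]) / cell)
    zmin = np.full((nx, ny), np.inf)
    zmax = np.full((nx, ny), -np.inf)
    for x, y, z in points:
        i = int((x - x_range[0]) / cell)
        j = int((y - y_range[0]) / cell)
        if 0 <= i < nx and 0 <= j < ny:
            zmin[i, j] = min(zmin[i, j], z)
            zmax[i, j] = max(zmax[i, j], z)
    # Empty cells evaluate to -inf spread and are therefore never flagged.
    return (zmax - zmin) > height_gap
```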


2020 ◽  
Vol 34 (07) ◽  
pp. 10478-10485
Author(s):  
Yingjie Cai ◽  
Buyu Li ◽  
Zeyu Jiao ◽  
Hongsheng Li ◽  
Xingyu Zeng ◽  
...  

The monocular 3D object detection task aims to predict the 3D bounding boxes of objects from monocular RGB images. Since location recovery in 3D space is quite difficult owing to the absence of depth information, this paper proposes a novel unified framework which decomposes the detection problem into a structured polygon prediction task and a depth recovery task. Different from the widely studied 2D bounding boxes, the proposed structured polygon in the 2D image consists of several projected surfaces of the target object. Compared to the widely used 3D bounding box proposals, it is shown to be a better representation for 3D detection. In order to inversely project the predicted 2D structured polygon to a cuboid in the 3D physical world, the subsequent depth recovery task uses an object height prior to complete the inverse projection transformation with the given camera projection matrix. Moreover, a fine-grained 3D box refinement scheme is proposed to further rectify the 3D detection results. Experiments are conducted on the challenging KITTI benchmark, on which our method achieves state-of-the-art detection accuracy.
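The height-prior depth recovery admits a one-line pinhole-model illustration: an object of real height H metres whose projection spans h pixels lies at depth Z = fH/h, where f is the focal length in pixels. A minimal sketch (the focal length and sizes below are example values, not from the paper):

```python
def depth_from_height_prior(f_pixels, real_height_m, pixel_height):
    """Pinhole-model depth from a known object height: Z = f * H / h."""
    return f_pixels * real_height_m / pixel_height

# Example: f = 721 px (typical of KITTI cameras), a 1.5 m-tall car whose
# projection spans 60 px lies at roughly 18 m depth:
print(depth_from_height_prior(721.0, 1.5, 60.0))  # ~18.0
```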

