scholarly journals Real-Time 3D Multi-Object Detection and Localization Based on Deep Learning for Road and Railway Smart Mobility

2021 ◽  
Vol 7 (8) ◽  
pp. 145
Author(s):  
Antoine Mauri ◽  
Redouane Khemmar ◽  
Benoit Decoux ◽  
Madjid Haddad ◽  
Rémi Boutteau

For smart mobility, autonomous vehicles, and advanced driver-assistance systems (ADASs), perception of the environment is an important task in scene analysis and understanding. Better perception of the environment allows for enhanced decision making, which, in turn, enables very high-precision actions. To this end, we introduce in this work a new real-time deep learning approach for 3D multi-object detection for smart mobility not only on roads, but also on railways. To obtain the 3D bounding boxes of the objects, we modified a proven real-time 2D detector, YOLOv3, to predict 3D object localization, object dimensions, and object orientation. Our method has been evaluated on KITTI’s road dataset as well as on our own hybrid virtual road/rail dataset acquired from the video game Grand Theft Auto (GTA) V. The evaluation of our method on these two datasets shows good accuracy, but more importantly that it can be used in real-time conditions, in road and rail traffic environments. Through our experimental results, we also show the importance of the accuracy of prediction of the regions of interest (RoIs) used in the estimation of 3D bounding box parameters.

Author(s):  
F. Particke ◽  
R. Kolbenschlag ◽  
M. Hiller ◽  
L. Patiño-Studencki ◽  
J. Thielecke

Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 532 ◽  
Author(s):  
Antoine Mauri ◽  
Redouane Khemmar ◽  
Benoit Decoux ◽  
Nicolas Ragot ◽  
Romain Rossi ◽  
...  

In core computer vision tasks, we have witnessed significant advances in object detection, localisation and tracking. However, there are currently no methods to detect, localize and track objects in road environments, and taking into account real-time constraints. In this paper, our objective is to develop a deep learning multi object detection and tracking technique applied to road smart mobility. Firstly, we propose an effective detector-based on YOLOv3 which we adapt to our context. Subsequently, to localize successfully the detected objects, we put forward an adaptive method aiming to extract 3D information, i.e., depth maps. To do so, a comparative study is carried out taking into account two approaches: Monodepth2 for monocular vision and MADNEt for stereoscopic vision. These approaches are then evaluated over datasets containing depth information in order to discern the best solution that performs better in real-time conditions. Object tracking is necessary in order to mitigate the risks of collisions. Unlike traditional tracking approaches which require target initialization beforehand, our approach consists of using information from object detection and distance estimation to initialize targets and to track them later. Expressly, we propose here to improve SORT approach for 3D object tracking. We introduce an extended Kalman filter to better estimate the position of objects. Extensive experiments carried out on KITTI dataset prove that our proposal outperforms state-of-the-art approches.


2021 ◽  
Vol 11 (13) ◽  
pp. 6016
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate the detection accuracy in the case of integration of PCD BEV representations is superior to when only an RGB camera is used. In addition, robustness is improved by significantly enhancing detection accuracy even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.


2020 ◽  
Vol 13 (1) ◽  
pp. 89
Author(s):  
Manuel Carranza-García ◽  
Jesús Torres-Mateo ◽  
Pedro Lara-Benítez ◽  
Jorge García-Gutiérrez

Object detection using remote sensing data is a key task of the perception systems of self-driving vehicles. While many generic deep learning architectures have been proposed for this problem, there is little guidance on their suitability when using them in a particular scenario such as autonomous driving. In this work, we aim to assess the performance of existing 2D detection systems on a multi-class problem (vehicles, pedestrians, and cyclists) with images obtained from the on-board camera sensors of a car. We evaluate several one-stage (RetinaNet, FCOS, and YOLOv3) and two-stage (Faster R-CNN) deep learning meta-architectures under different image resolutions and feature extractors (ResNet, ResNeXt, Res2Net, DarkNet, and MobileNet). These models are trained using transfer learning and compared in terms of both precision and efficiency, with special attention to the real-time requirements of this context. For the experimental study, we use the Waymo Open Dataset, which is the largest existing benchmark. Despite the rising popularity of one-stage detectors, our findings show that two-stage detectors still provide the most robust performance. Faster R-CNN models outperform one-stage detectors in accuracy, being also more reliable in the detection of minority classes. Faster R-CNN Res2Net-101 achieves the best speed/accuracy tradeoff but needs lower resolution images to reach real-time speed. Furthermore, the anchor-free FCOS detector is a slightly faster alternative to RetinaNet, with similar precision and lower memory usage.


2021 ◽  
Vol 336 ◽  
pp. 07004
Author(s):  
Ruoyu Fang ◽  
Cheng Cai

Obstacle detection and target tracking are two major issues for intelligent autonomous vehicles. This paper proposes a new scheme to achieve target tracking and real-time obstacle detection of obstacles based on computer vision. ResNet-18 deep learning neural network is utilized for obstacle detection and Yolo-v3 deep learning neural network is employed for real-time target tracking. These two trained models can be deployed on an autonomous vehicle equipped with an NVIDIA Jetson Nano motherboard. The autonomous vehicle moves to avoid obstacles and follow tracked targets by camera. Adjusting the steering and movement of the autonomous vehicle according to the PID algorithm during the movement, therefore, will help the proposed vehicle achieve stable and precise tracking.


2021 ◽  
Vol 23 (06) ◽  
pp. 47-57
Author(s):  
Aditya Kulkarni ◽  
◽  
Manali Munot ◽  
Sai Salunkhe ◽  
Shubham Mhaske ◽  
...  

With the development in technologies right from serial to parallel computing, GPU, AI, and deep learning models a series of tools to process complex images have been developed. The main focus of this research is to compare various algorithms(pre-trained models) and their contributions to process complex images in terms of performance, accuracy, time, and their limitations. The pre-trained models we are using are CNN, R-CNN, R-FCN, and YOLO. These models are python language-based and use libraries like TensorFlow, OpenCV, and free image databases (Microsoft COCO and PAS-CAL VOC 2007/2012). These not only aim at object detection but also on building bounding boxes around appropriate locations. Thus, by this review, we get a better vision of these models and their performance and a good idea of which models are ideal for various situations.


Sign in / Sign up

Export Citation Format

Share Document