Real-Time 3D Multi-Object Detection and Localization Based on Deep Learning for Road and Railway Smart Mobility

Antoine Mauri; Redouane Khemmar; Benoit Decoux; Madjid Haddad; Rémi Boutteau

doi:10.3390/jimaging7080145

Journal of Imaging ◽

10.3390/jimaging7080145 ◽

2021 ◽

Vol 7 (8) ◽

pp. 145

Author(s):

Antoine Mauri ◽

Redouane Khemmar ◽

Benoit Decoux ◽

Madjid Haddad ◽

Rémi Boutteau

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Video Game ◽

Autonomous Vehicles ◽

Object Localization ◽

Driver Assistance Systems ◽

Smart Mobility ◽

Bounding Boxes ◽

Detection And Localization

For smart mobility, autonomous vehicles, and advanced driver-assistance systems (ADASs), perception of the environment is an important task in scene analysis and understanding. Better perception of the environment allows for enhanced decision making, which, in turn, enables very high-precision actions. To this end, we introduce a new real-time deep learning approach for 3D multi-object detection for smart mobility, not only on roads but also on railways. To obtain the 3D bounding boxes of the objects, we modified a proven real-time 2D detector, YOLOv3, to predict 3D object localization, object dimensions, and object orientation. Our method has been evaluated on KITTI’s road dataset as well as on our own hybrid virtual road/rail dataset acquired from the video game Grand Theft Auto (GTA) V. The evaluation of our method on these two datasets shows good accuracy and, more importantly, that it can be used in real-time conditions in both road and rail traffic environments. Through our experimental results, we also show the importance of the accuracy of the predicted regions of interest (RoIs) used in the estimation of 3D bounding box parameters.
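The abstract above describes a 3D bounding box by three groups of parameters: localization, dimensions, and orientation. As a rough illustration of how those parameters determine a box, the sketch below computes the eight corners. It is not the authors' code. The function name `box3d_corners`, the axis convention (x forward, y left, z up), the use of the geometric centre as the reference point, and a single yaw angle about the vertical axis are all assumptions made for this example; datasets such as KITTI use their own conventions.

```python
import math


def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box as (x, y, z) tuples.

    Assumed convention (illustrative only): x forward, y left, z up;
    `center` is the geometric centre of the box, `dims` is
    (length, width, height), and `yaw` is a rotation about the z axis
    in radians.
    """
    cx, cy, cz = center
    length, width, height = dims
    cos_y, sin_y = math.cos(yaw), math.sin(yaw)

    corners = []
    for dx in (length / 2, -length / 2):
        for dy in (width / 2, -width / 2):
            for dz in (height / 2, -height / 2):
                # Rotate the offset in the ground plane, then translate.
                x = cx + cos_y * dx - sin_y * dy
                y = cy + sin_y * dx + cos_y * dy
                z = cz + dz
                corners.append((x, y, z))
    return corners


if __name__ == "__main__":
    # A car-sized box 10 m ahead, rotated by 90 degrees.
    for corner in box3d_corners((10.0, 0.0, 0.75), (4.0, 2.0, 1.5), math.pi / 2):
        print(tuple(round(v, 2) for v in corner))
```

Projecting these corners into the image would additionally require the camera calibration, which is outside the scope of this sketch.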

Real Time Object Detection, Tracking, and Distance and Motion Estimation based on Deep Learning: Application to Smart Mobility

2019 Eighth International Conference on Emerging Security Technologies (EST) ◽

10.1109/est.2019.8806222 ◽

2019 ◽

Cited By ~ 4

Author(s):

Zhihao Chen ◽

Redouane Khemmar ◽

Benoit Decoux ◽

Amphani Atahouet ◽

Jean-Yves Ertaud

Keyword(s):

Deep Learning ◽

Motion Estimation ◽

Object Detection ◽

Real Time ◽

Smart Mobility

Deep Learning for Real-Time Capable Object Detection and Localization on Mobile Platforms

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/261/1/012005 ◽

2017 ◽

Vol 261 ◽

pp. 012005 ◽

Cited By ~ 2

Author(s):

F. Particke ◽

R. Kolbenschlag ◽

M. Hiller ◽

L. Patiño-Studencki ◽

J. Thielecke

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time ◽

Mobile Platforms ◽

Detection And Localization

Deep Learning for Real-Time 3D Multi-Object Detection, Localisation, and Tracking: Application to Smart Mobility

Sensors ◽

10.3390/s20020532 ◽

2020 ◽

Vol 20 (2) ◽

pp. 532 ◽

Cited By ~ 5

Author(s):

Antoine Mauri ◽

Redouane Khemmar ◽

Benoit Decoux ◽

Nicolas Ragot ◽

Romain Rossi ◽

...

Keyword(s):

Deep Learning ◽

Object Detection ◽

Object Tracking ◽

Real Time ◽

Distance Estimation ◽

Adaptive Method ◽

Stereoscopic Vision ◽

Depth Information ◽

Smart Mobility ◽

3D Information

In core computer vision tasks, we have witnessed significant advances in object detection, localisation and tracking. However, there are currently no methods to detect, localize and track objects in road environments while taking into account real-time constraints. In this paper, our objective is to develop a deep learning multi-object detection and tracking technique applied to road smart mobility. Firstly, we propose an effective detector based on YOLOv3, which we adapt to our context. Subsequently, to successfully localize the detected objects, we put forward an adaptive method aiming to extract 3D information, i.e., depth maps. To do so, a comparative study is carried out taking into account two approaches: Monodepth2 for monocular vision and MADNet for stereoscopic vision. These approaches are then evaluated over datasets containing depth information in order to identify the solution that performs best in real-time conditions. Object tracking is necessary in order to mitigate the risks of collisions. Unlike traditional tracking approaches, which require target initialization beforehand, our approach uses information from object detection and distance estimation to initialize targets and to track them later. Specifically, we propose here to improve the SORT approach for 3D object tracking. We introduce an extended Kalman filter to better estimate the position of objects. Extensive experiments carried out on the KITTI dataset show that our proposal outperforms state-of-the-art approaches.
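The abstract above mentions Kalman filtering of object positions inside a SORT-style tracker. The sketch below shows the predict/update cycle on the simplest possible case: a linear, one-dimensional constant-velocity filter applied to a distance measurement. It is a simplified stand-in, not the extended Kalman filter of the paper; the class name `DistanceKalman` and all noise values are assumptions chosen for illustration.

```python
class DistanceKalman:
    """Linear constant-velocity Kalman filter on one distance value.

    State is (p, v): distance in metres and its rate of change.
    Noise parameters q (process) and r (measurement) are illustrative.
    """

    def __init__(self, p0, q=1e-3, r=0.25, p_var=1.0, v_var=100.0):
        self.p, self.v = p0, 0.0
        # Covariance matrix entries P = [[p00, p01], [p10, p11]].
        self.p00, self.p01, self.p10, self.p11 = p_var, 0.0, 0.0, v_var
        self.q, self.r = q, r

    def predict(self, dt):
        # State transition F = [[1, dt], [0, 1]]; P <- F P F^T + q * I.
        self.p += dt * self.v
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z):
        # Only the distance is measured: H = [1, 0].
        s = self.p00 + self.r            # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        y = z - self.p                   # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


if __name__ == "__main__":
    kf = DistanceKalman(p0=20.0)
    for k in range(1, 51):
        kf.predict(dt=0.1)
        kf.update(20.0 - 2.0 * 0.1 * k)  # object approaching at 2 m/s
    print(round(kf.p, 2), round(kf.v, 2))
```

In a tracker, one such filter would typically be kept per tracked object, with the detector and depth estimate supplying the measurement at each frame.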

Real-Time Object Detection and Localization for Vision-Based Robot Manipulator

SN Computer Science ◽

10.1007/s42979-021-00561-4 ◽

2021 ◽

Vol 2 (3) ◽

Author(s):

Varun Batra ◽

Vijay Kumar

Keyword(s):

Object Detection ◽

Real Time ◽

Robot Manipulator ◽

Detection And Localization

Real-Time Deep Learning-Based Object Detection Framework

2020 IEEE Symposium Series on Computational Intelligence (SSCI) ◽

10.1109/ssci47803.2020.9308493 ◽

2020 ◽

Author(s):

William Tarimo ◽

Moustafa M. Sabra ◽

Shonan Hendre

Keyword(s):

Deep Learning ◽

Object Detection ◽

Real Time

A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

Applied Sciences ◽

10.3390/app11136016 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6016

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Detection Algorithm ◽

Detection Accuracy ◽

Cloud Data ◽

Detection Techniques ◽

Bounding Boxes ◽

Partially Occluded ◽

Rgb Image

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. The proposed scheme was evaluated using the KITTI vision benchmark suite dataset. The results demonstrate that detection accuracy is higher when PCD BEV representations are integrated than when only an RGB camera is used. In addition, robustness is improved: detection accuracy is significantly higher even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.
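Non-maximum suppression, as mentioned in the abstract above, keeps the highest-scoring box and discards overlapping neighbours. The following is a minimal sketch of the standard greedy variant on axis-aligned boxes given as `(x1, y1, x2, y2)`. The function names `iou` and `nms` and the threshold of 0.5 are assumptions for illustration; the paper's own fusion of RGB and BEV detections may differ in detail.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    # Detections pooled from two hypothetical sources, e.g. RGB and BEV.
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

The second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box is kept.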

On the Performance of One-Stage and Two-Stage Object Detectors in Autonomous Vehicles Using Camera Data

Remote Sensing ◽

10.3390/rs13010089 ◽

2020 ◽

Vol 13 (1) ◽

pp. 89

Author(s):

Manuel Carranza-García ◽

Jesús Torres-Mateo ◽

Pedro Lara-Benítez ◽

Jorge García-Gutiérrez

Keyword(s):

Deep Learning ◽

Real Time ◽

Autonomous Vehicles ◽

Remote Sensing Data ◽

Autonomous Driving ◽

Two Stage ◽

Detection Systems ◽

One Stage ◽

Time Requirements ◽

Speed Accuracy

Object detection using remote sensing data is a key task of the perception systems of self-driving vehicles. While many generic deep learning architectures have been proposed for this problem, there is little guidance on their suitability when using them in a particular scenario such as autonomous driving. In this work, we aim to assess the performance of existing 2D detection systems on a multi-class problem (vehicles, pedestrians, and cyclists) with images obtained from the on-board camera sensors of a car. We evaluate several one-stage (RetinaNet, FCOS, and YOLOv3) and two-stage (Faster R-CNN) deep learning meta-architectures under different image resolutions and feature extractors (ResNet, ResNeXt, Res2Net, DarkNet, and MobileNet). These models are trained using transfer learning and compared in terms of both precision and efficiency, with special attention to the real-time requirements of this context. For the experimental study, we use the Waymo Open Dataset, which is the largest existing benchmark. Despite the rising popularity of one-stage detectors, our findings show that two-stage detectors still provide the most robust performance. Faster R-CNN models outperform one-stage detectors in accuracy, being also more reliable in the detection of minority classes. Faster R-CNN Res2Net-101 achieves the best speed/accuracy tradeoff but needs lower resolution images to reach real-time speed. Furthermore, the anchor-free FCOS detector is a slightly faster alternative to RetinaNet, with similar precision and lower memory usage.

Object Detection with Neural Models, Deep Learning and Common Sense to Aid Smart Mobility

2018 IEEE 30th International Conference on Tools with Artificial Intelligence (ICTAI) ◽

10.1109/ictai.2018.00134 ◽

2018 ◽

Cited By ~ 6

Author(s):

Abidha Pandey ◽

Manish Puri ◽

Aparna Varde

Keyword(s):

Deep Learning ◽

Object Detection ◽

Common Sense ◽

Neural Models ◽

Smart Mobility

Computer vision based obstacle detection and target tracking for autonomous vehicles

MATEC Web of Conferences ◽

10.1051/matecconf/202133607004 ◽

2021 ◽

Vol 336 ◽

pp. 07004

Author(s):

Ruoyu Fang ◽

Cheng Cai

Keyword(s):

Neural Network ◽

Computer Vision ◽

Deep Learning ◽

Target Tracking ◽

Real Time ◽

Autonomous Vehicles ◽

Autonomous Vehicle ◽

Obstacle Detection ◽

Pid Algorithm ◽

Deep Learning Neural Network

Obstacle detection and target tracking are two major issues for intelligent autonomous vehicles. This paper proposes a new scheme to achieve target tracking and real-time obstacle detection based on computer vision. A ResNet-18 deep learning neural network is used for obstacle detection and a Yolo-v3 deep learning neural network is employed for real-time target tracking. These two trained models can be deployed on an autonomous vehicle equipped with an NVIDIA Jetson Nano motherboard. The autonomous vehicle uses its camera to avoid obstacles and follow tracked targets. Adjusting the steering and movement of the vehicle with the PID algorithm while it moves helps it achieve stable and precise tracking.
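The abstract above relies on a PID algorithm to adjust steering while following a target. A minimal textbook PID controller can be sketched as follows. The class name `PID`, the gains, and the idea of using the horizontal offset of the tracked box from the image centre as the error signal are assumptions for illustration, not details taken from the paper.

```python
class PID:
    """Textbook discrete PID controller (illustrative sketch)."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        """Return the control output for the current error sample."""
        self.integral += error * dt
        # No derivative term on the first sample (no previous error yet).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


if __name__ == "__main__":
    # Hypothetical error: offset of the tracked box from the image centre,
    # normalised to [-1, 1]; the output would be a steering command.
    steering = PID(kp=0.8, ki=0.1, kd=0.05)
    for offset in (0.5, 0.4, 0.25, 0.1):
        print(round(steering.step(offset, dt=0.05), 3))
```

In practice the gains would need tuning on the actual vehicle, and the output would usually be clamped to the actuator's valid range.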

A Survey on Various Available Object Detection Models and Application In Automatic License Plate Detection

Journal of University of Shanghai for Science and Technology ◽

10.51201/jusst/21/05222 ◽

2021 ◽

Vol 23 (06) ◽

pp. 47-57

Author(s):

Aditya Kulkarni ◽

Manali Munot ◽

Sai Salunkhe ◽

Shubham Mhaske ◽

...

Keyword(s):

Deep Learning ◽

Object Detection ◽

Image Databases ◽

License Plate ◽

Learning Models ◽

Python Language ◽

Performance Accuracy ◽

License Plate Detection ◽

Bounding Boxes ◽

Complex Images

With the development of technologies from serial to parallel computing, GPUs, AI, and deep learning models, a series of tools to process complex images has been developed. The main focus of this research is to compare various algorithms (pre-trained models) and their contributions to processing complex images in terms of performance, accuracy, time, and limitations. The pre-trained models we are using are CNN, R-CNN, R-FCN, and YOLO. These models are Python-based and use libraries like TensorFlow and OpenCV, as well as free image databases (Microsoft COCO and PASCAL VOC 2007/2012). They aim not only at object detection but also at placing bounding boxes around the appropriate locations. Thus, through this review, we get a better view of these models and their performance and a good idea of which models are ideal for various situations.
