CarFree: Hassle-Free Object Detection Dataset Generation Using Carla Autonomous Driving Simulator

For safe autonomous driving, deep neural network (DNN)-based perception systems play essential roles, where a vast amount of driving images should be manually collected and labeled with ground truth (GT) for training and validation purposes. After observing the manual GT generation’s high cost and unavoidable human errors, this study presents an open-source automatic GT generation tool, CarFree, based on the Carla autonomous driving simulator. By that, we aim to democratize the daunting task of (in particular) object detection dataset generation, which was only possible by big companies or institutes due to its high cost. CarFree comprises (i) a data extraction client that automatically collects relevant information from the Carla simulator’s server and (ii) a post-processing software that produces precise 2D bounding boxes of vehicles and pedestrians on the gathered driving images. Our evaluation results show that CarFree can generate a considerable amount of realistic driving images along with their GTs in a reasonable time. Moreover, using the synthesized training images with artificially made unusual weather and lighting conditions, which are difficult to obtain in real-world driving scenarios, CarFree significantly improves the object detection accuracy in the real world, particularly in the case of harsh environments. With CarFree, we expect its users to generate a variety of object detection datasets in hassle-free ways.

Download Full-text

A Set of Single YOLO Modalities to Detect Occluded Entities via Viewpoint Conversion

Applied Sciences ◽

10.3390/app11136016 ◽

2021 ◽

Vol 11 (13) ◽

pp. 6016

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Object Detection ◽

Autonomous Vehicles ◽

Autonomous Driving ◽

Detection Algorithm ◽

Detection Accuracy ◽

Cloud Data ◽

Detection Techniques ◽

Bounding Boxes ◽

Partially Occluded ◽

Rgb Image

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded by employing RGB image and light detection and ranging (LiDAR) bird’s eye view (BEV) representations. This structure combines independent detection results obtained in parallel through “you only look once” networks using an RGB image and a height map converted from the BEV representations of LiDAR’s point cloud data (PCD). The region proposal of an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate the detection accuracy in the case of integration of PCD BEV representations is superior to when only an RGB camera is used. In addition, robustness is improved by significantly enhancing detection accuracy even when the target objects are partially occluded when viewed from the front, which demonstrates that the proposed algorithm outperforms the conventional RGB-based model.

Download Full-text

YOLO MDE: Object Detection with Monocular Depth Estimation

Electronics ◽

10.3390/electronics11010076 ◽

2021 ◽

Vol 11 (1) ◽

pp. 76

Author(s):

Jongsub Yu ◽

Hyukdoo Choi

Keyword(s):

Risk Assessment ◽

Object Detection ◽

Network Architecture ◽

Ground Truth ◽

Depth Estimation ◽

Autonomous Driving ◽

Depth Prediction ◽

Bounding Box ◽

Monocular Depth ◽

Bounding Boxes

This paper presents an object detector with depth estimation using monocular camera images. Previous detection studies have typically focused on detecting objects with 2D or 3D bounding boxes. A 3D bounding box consists of the center point, its size parameters, and heading information. However, predicting complex output compositions leads a model to have generally low performances, and it is not necessary for risk assessment for autonomous driving. We focused on predicting a single depth per object, which is essential for risk assessment for autonomous driving. Our network architecture is based on YOLO v4, which is a fast and accurate one-stage object detector. We added an additional channel to the output layer for depth estimation. To train depth prediction, we extract the closest depth from the 3D bounding box coordinates of ground truth labels in the dataset. Our model is compared with the latest studies on 3D object detection using the KITTI object detection benchmark. As a result, we show that our model achieves higher detection performance and detection speed than existing models with comparable depth accuracy.

Download Full-text

Automatic Roadway Features Detection with Oriented Object Detection

Applied Sciences ◽

10.3390/app11083531 ◽

2021 ◽

Vol 11 (8) ◽

pp. 3531

Author(s):

Hesham M. Eraqi ◽

Karim Soliman ◽

Dalia Said ◽

Omar R. Elezaby ◽

Mohamed N. Moustafa ◽

...

Keyword(s):

Object Detection ◽

Safety Evaluation ◽

Autonomous Driving ◽

Detection Accuracy ◽

The Road ◽

Detection Model ◽

Detection Approach ◽

Roadway Safety ◽

Safety Features ◽

Oriented Object

Extensive research efforts have been devoted to identify and improve roadway features that impact safety. Maintaining roadway safety features relies on costly manual operations of regular road surveying and data analysis. This paper introduces an automatic roadway safety features detection approach, which harnesses the potential of artificial intelligence (AI) computer vision to make the process more efficient and less costly. Given a front-facing camera and a global positioning system (GPS) sensor, the proposed system automatically evaluates ten roadway safety features. The system is composed of an oriented (or rotated) object detection model, which solves an orientation encoding discontinuity problem to improve detection accuracy, and a rule-based roadway safety evaluation module. To train and validate the proposed model, a fully-annotated dataset for roadway safety features extraction was collected covering 473 km of roads. The proposed method baseline results are found encouraging when compared to the state-of-the-art models. Different oriented object detection strategies are presented and discussed, and the developed model resulted in improving the mean average precision (mAP) by 16.9% when compared with the literature. The roadway safety feature average prediction accuracy is 84.39% and ranges between 91.11% and 63.12%. The introduced model can pervasively enable/disable autonomous driving (AD) based on safety features of the road; and empower connected vehicles (CV) to send and receive estimated safety features, alerting drivers about black spots or relatively less-safe segments or roads.

Download Full-text

Hot Anchors: A Heuristic Anchors Sampling Method in RCNN-Based Object Detection

Sensors ◽

10.3390/s18103415 ◽

2018 ◽

Vol 18 (10) ◽

pp. 3415 ◽

Cited By ~ 2

Author(s):

Jinpeng Zhang ◽

Jinming Zhang ◽

Shan Yu

Keyword(s):

Object Detection ◽

Sampling Method ◽

Ground Truth ◽

Detection Accuracy ◽

Two Stage ◽

Baseline Model ◽

Detection Algorithms ◽

Imbalance Problem ◽

Image Object Detection ◽

Image Object

In the image object detection task, a huge number of candidate boxes are generated to match with a relatively very small amount of ground-truth boxes, and through this method the learning samples can be created. But in fact the vast majority of the candidate boxes do not contain valid object instances and should be recognized and rejected during the training and evaluation of the network. This leads to extra high computation burden and a serious imbalance problem between object and none-object samples, thereby impeding the algorithm’s performance. Here we propose a new heuristic sampling method to generate candidate boxes for two-stage detection algorithms. It is generally applicable to the current two-stage detection algorithms to improve their detection performance. Experiments on COCO dataset showed that, relative to the baseline model, this new method could significantly increase the detection accuracy and efficiency.

Download Full-text

A Research Paper on Traffic Sign Recognition with Machine Learning and IOT

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.35309 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 1774-1778

Author(s):

Mahesh Singh

Keyword(s):

Machine Learning ◽

Object Detection ◽

Real World ◽

Research Paper ◽

Raspberry Pi ◽

Detection Accuracy ◽

Great Success ◽

The Real ◽

Road Traffic Safety ◽

Execution Speed

This paper will help to bring out some amazing findings about autonomous prediction and performing action by establishing a connection between the real world with machine learning and Internet Of thing. The purpose of this research paper is to perform our machine to analyze different signs in the real world and act accordingly. We have explored and found detection of several features in our model which helped us to establish a better interaction of our model with the surroundings. Our algorithms give very optimized predictions performing the right action .Nowadays, autonomous vehicles are a great area of research where we can make it more optimized and more multi - performing .This paper contributes to a huge survey of varied object detection and feature extraction techniques. At the moment, there are loads of object classification and recognition techniques and algorithms found and developed around the world. TSD research is of great significance for improving road traffic safety. In recent years, CNN (Convolutional Neural Networks) have achieved great success in object detection tasks. It shows better accuracy or faster execution speed than traditional methods. However, the execution speed and the detection accuracy of the existing CNN methods cannot be obtained at the same time. What's more, the hardware requirements are also higher than before, resulting in a larger detection cost. In order to solve these problems, this paper proposes an improved algorithm based on convolutional model A classic robot which uses this algorithm which is installed through raspberry pi and performs dedicated action.

Download Full-text

Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6618 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10478-10485 ◽

Cited By ~ 3

Author(s):

Yingjie Cai ◽

Buyu Li ◽

Zeyu Jiao ◽

Hongsheng Li ◽

Xingyu Zeng ◽

...

Keyword(s):

Object Detection ◽

Depth Estimation ◽

Physical World ◽

Depth Information ◽

Detection Accuracy ◽

Unified Framework ◽

3D Object ◽

Depth Recovery ◽

Bounding Boxes ◽

3D Object Detection

Monocular 3D object detection task aims to predict the 3D bounding boxes of objects based on monocular RGB images. Since the location recovery in 3D space is quite difficult on account of absence of depth information, this paper proposes a novel unified framework which decomposes the detection problem into a structured polygon prediction task and a depth recovery task. Different from the widely studied 2D bounding boxes, the proposed novel structured polygon in the 2D image consists of several projected surfaces of the target object. Compared to the widely-used 3D bounding box proposals, it is shown to be a better representation for 3D detection. In order to inversely project the predicted 2D structured polygon to a cuboid in the 3D physical world, the following depth recovery task uses the object height prior to complete the inverse projection transformation with the given camera projection matrix. Moreover, a fine-grained 3D box refinement scheme is proposed to further rectify the 3D detection results. Experiments are conducted on the challenging KITTI benchmark, in which our method achieves state-of-the-art detection accuracy.

Download Full-text

IoU-Adaptive Deformable R-CNN: Make Full Use of IoU for Multi-Class Object Detection in Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs11030286 ◽

2019 ◽

Vol 11 (3) ◽

pp. 286 ◽

Cited By ~ 24

Author(s):

Jiangqiao Yan ◽

Hongqi Wang ◽

Menglong Yan ◽

Wenhui Diao ◽

Xian Sun ◽

...

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Object Detection ◽

Convolutional Neural Network ◽

State Of The Art ◽

Ground Truth ◽

Detection Performance ◽

Candidate Region ◽

Detection Accuracy ◽

Remote Sensing Images

Recently, methods based on Faster region-based convolutional neural network (R-CNN)have been popular in multi-class object detection in remote sensing images due to their outstandingdetection performance. The methods generally propose candidate region of interests (ROIs) througha region propose network (RPN), and the regions with high enough intersection-over-union (IoU)values against ground truth are treated as positive samples for training. In this paper, we find thatthe detection result of such methods is sensitive to the adaption of different IoU thresholds. Specially,detection performance of small objects is poor when choosing a normal higher threshold, while alower threshold will result in poor location accuracy caused by a large quantity of false positives.To address the above issues, we propose a novel IoU-Adaptive Deformable R-CNN framework formulti-class object detection. Specially, by analyzing the different roles that IoU can play in differentparts of the network, we propose an IoU-guided detection framework to reduce the loss of small objectinformation during training. Besides, the IoU-based weighted loss is designed, which can learn theIoU information of positive ROIs to improve the detection accuracy effectively. Finally, the class aspectratio constrained non-maximum suppression (CARC-NMS) is proposed, which further improves theprecision of the results. Extensive experiments validate the effectiveness of our approach and weachieve state-of-the-art detection performance on the DOTA dataset.

Download Full-text

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6945 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12557-12564 ◽

Cited By ~ 4

Author(s):

Zhenbo Xu ◽

Wei Zhang ◽

Xiaoqing Ye ◽

Xiao Tan ◽

Wei Yang ◽

...

Keyword(s):

Object Detection ◽

Point Clouds ◽

Autonomous Driving ◽

Disparity Estimation ◽

3D Object ◽

Detection Model ◽

Occluded Objects ◽

Bounding Boxes ◽

Detection Quality ◽

3D Object Detection

3D object detection is an essential task in autonomous driving and robotics. Though great progress has been made, challenges remain in estimating 3D pose for distant and occluded objects. In this paper, we present a novel framework named ZoomNet for stereo imagery-based 3D detection. The pipeline of ZoomNet begins with an ordinary 2D object detection model which is used to obtain pairs of left-right bounding boxes. To further exploit the abundant texture cues in rgb images for more accurate disparity estimation, we introduce a conceptually straight-forward module – adaptive zooming, which simultaneously resizes 2D instance bounding boxes to a unified resolution and adjusts the camera intrinsic parameters accordingly. In this way, we are able to estimate higher-quality disparity maps from the resized box images then construct dense point clouds for both nearby and distant objects. Moreover, we introduce to learn part locations as complementary features to improve the resistance against occlusion and put forward the 3D fitting score to better estimate the 3D detection quality. Extensive experiments on the popular KITTI 3D detection dataset indicate ZoomNet surpasses all previous state-of-the-art methods by large margins (improved by 9.4% on APbv (IoU=0.7) over pseudo-LiDAR). Ablation study also demonstrates that our adaptive zooming strategy brings an improvement of over 10% on AP3d (IoU=0.7). In addition, since the official KITTI benchmark lacks fine-grained annotations like pixel-wise part locations, we also present our KFG dataset by augmenting KITTI with detailed instance-wise annotations including pixel-wise part location, pixel-wise disparity, etc.. Both the KFG dataset and our codes will be publicly available at https://github.com/detectRecog/ZoomNet.

Download Full-text

A Comparative Analysis of Object Detection Metrics with a Companion Open-Source Toolkit

Electronics ◽

10.3390/electronics10030279 ◽

2021 ◽

Vol 10 (3) ◽

pp. 279

Author(s):

Rafael Padilla ◽

Wesley L. Passos ◽

Thadeu L. B. Dias ◽

Sergio L. Netto ◽

Eduardo A. B. da Silva

Keyword(s):

Comparative Analysis ◽

Object Detection ◽

Open Source ◽

Performance Metrics ◽

Ground Truth ◽

Detection Methods ◽

Detection Algorithms ◽

Spatio Temporal ◽

Bounding Boxes ◽

Temporal Overlap

Recent outstanding results of supervised object detection in competitions and challenges are often associated with specific metrics and datasets. The evaluation of such methods applied in different contexts have increased the demand for annotated datasets. Annotation tools represent the location and size of objects in distinct formats, leading to a lack of consensus on the representation. Such a scenario often complicates the comparison of object detection methods. This work alleviates this problem along the following lines: (i) It provides an overview of the most relevant evaluation methods used in object detection competitions, highlighting their peculiarities, differences, and advantages; (ii) it examines the most used annotation formats, showing how different implementations may influence the assessment results; and (iii) it provides a novel open-source toolkit supporting different annotation formats and 15 performance metrics, making it easy for researchers to evaluate the performance of their detection algorithms in most known datasets. In addition, this work proposes a new metric, also included in the toolkit, for evaluating object detection in videos that is based on the spatio-temporal overlap between the ground-truth and detected bounding boxes.

Download Full-text

MADRaS : Multi Agent Driving Simulator

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.12531 ◽

2021 ◽

Vol 70 ◽

pp. 1517-1555

Author(s):

Anirban Santara ◽

Sohan Rudra ◽

Sree Aditya Buridi ◽

Meha Kaushik ◽

Abhishek Naik ◽

...

Keyword(s):

Reinforcement Learning ◽

Motion Planning ◽

Open Source ◽

Real World ◽

Driving Simulator ◽

Autonomous Driving ◽

Machine Learning Algorithms ◽

Control Mode ◽

Simulation Engine ◽

Multi Agent

Autonomous driving has emerged as one of the most active areas of research as it has the promise of making transportation safer and more efficient than ever before. Most real-world autonomous driving pipelines perform perception, motion planning and action in a loop. In this work we present MADRaS, an open-source multi-agent driving simulator for use in the design and evaluation of motion planning algorithms for autonomous driving. Given a start and a goal state, the task of motion planning is to solve for a sequence of position, orientation and speed values in order to navigate between the states while adhering to safety constraints. These constraints often involve the behaviors of other agents in the environment. MADRaS provides a platform for constructing a wide variety of highway and track driving scenarios where multiple driving agents can be trained for motion planning tasks using reinforcement learning and other machine learning algorithms. MADRaS is built on TORCS, an open-source car-racing simulator. TORCS offers a variety of cars with different dynamic properties and driving tracks with different geometries and surface. MADRaS inherits these functionalities from TORCS and introduces support for multi-agent training, inter-vehicular communication, noisy observations, stochastic actions, and custom traffic cars whose behaviors can be programmed to simulate challenging traffic conditions encountered in the real world. MADRaS can be used to create driving tasks whose complexities can be tuned along eight axes in well-defined steps. This makes it particularly suited for curriculum and continual learning. MADRaS is lightweight and it provides a convenient OpenAI Gym interface for independent control of each car. Apart from the primitive steering-acceleration-brake control mode of TORCS, MADRaS offers a hierarchical track-position – speed control mode that can potentially be used to achieve better generalization. MADRaS uses a UDP based client server model where the simulation engine is the server and each client is a driving agent. MADRaS uses multiprocessing to run each agent as a parallel process for efficiency and integrates well with popular reinforcement learning libraries like RLLib. We show experiments on single and multi-agent reinforcement learning with and without curriculum

Download Full-text