Determination of Vehicle Trajectory through Optimization of Vehicle Bounding Boxes Using a Convolutional Neural Network

Sensors ◽ 2019 ◽ Vol 19 (19) ◽ pp. 4263
Author(s): Seong, Song, Yoon, Kim, Choi

In this manuscript, a new method for determining vehicle trajectories using an optimal bounding box for the vehicle is developed. The vehicle trajectory is extracted from images acquired by a camera installed at an intersection, based on a convolutional neural network (CNN). First, real-time vehicle detection is performed with the YOLOv2 model, one of the most representative CNN-based object detection algorithms. To overcome the inaccuracy of the vehicle location extracted by YOLOv2, the trajectory is calibrated using vehicle tracking algorithms, namely a Kalman filter and an intersection-over-union (IOU) tracker. In particular, the vehicle trajectory is corrected by extracting the center position according to the geometric characteristics of the moving vehicle's bounding box. Quantitative and qualitative evaluations indicate that the proposed algorithm detects the trajectories of moving vehicles better than the conventional algorithm. Whereas the center points of the bounding boxes obtained by the conventional algorithm often fall outside the vehicle due to the geometric displacement of the camera, the proposed technique minimizes positional errors and extracts the optimal bounding box to determine the vehicle location.
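The correction pipeline described above, detecting with YOLOv2, associating detections across frames by IOU, and smoothing the box center with a Kalman filter, can be sketched as follows. This is a minimal illustration under assumed noise parameters, not the authors' implementation.

```python
# A minimal sketch (not the authors' exact implementation): detections are
# associated across frames by IOU, and the box center is smoothed with a
# constant-velocity Kalman filter. Noise parameters here are assumptions.
import numpy as np

def iou(a, b):
    """Intersection-over-union of boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / (union + 1e-9)

class CenterKalman:
    """State (cx, cy, vx, vy); constant-velocity model with dt = 1 frame."""
    def __init__(self, cx, cy, q=1e-2, r=1.0):
        self.x = np.array([cx, cy, 0.0, 0.0])
        self.P = np.eye(4) * 10.0                       # state covariance
        self.F = np.array([[1., 0., 1., 0.], [0., 1., 0., 1.],
                           [0., 0., 1., 0.], [0., 0., 0., 1.]])
        self.H = np.array([[1., 0., 0., 0.], [0., 1., 0., 0.]])
        self.Q = np.eye(4) * q                          # process noise
        self.R = np.eye(2) * r                          # measurement noise

    def step(self, cx, cy):
        # Predict with the motion model, then correct with the measured center.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        innov = np.array([cx, cy]) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innov
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]                               # smoothed center
```

Per frame, each YOLOv2 detection would be matched to the track with the highest IOU (say, above 0.3) and its measured box center passed to step().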

Author(s): P.L. Nikolaev

This article deals with a method for the binary classification of images containing small text. The classification is based on the fact that the text can have two orientations: it can be positioned horizontally and read from left to right, or it can be turned 180 degrees, so that the image must be rotated before the text can be read. This type of text is found on the covers of a variety of books, so when recognizing covers it is necessary first to determine the direction of the text before recognizing the text itself. The article proposes a deep neural network for determining the text orientation in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data, as well as examples of the network operating on real data, are presented.
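As a rough illustration of such a network, a small binary CNN can decide between the two orientations. The layer sizes and the 128x128 grayscale input below are assumptions, not the architecture from the article.

```python
# Illustrative only: a small Keras CNN for the two-way orientation decision
# (text upright vs. rotated 180 degrees). Layer sizes and input shape are
# assumptions, not the network described in the article.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_orientation_net(input_shape=(128, 128, 1)):
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
        layers.Dense(64, activation="relu"),
        layers.Dense(1, activation="sigmoid"),  # P(text is upside down)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Synthetic training pairs can be generated by rendering text crops and rotating half of them 180 degrees, e.g. with np.rot90(img, 2).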


2020
Author(s): CSN Koushik, Shruti Bhargava Choubey, Abhishek Choubey, D. Naresh, N. Bhanu Prakash Reddy

2021 ◽ Vol 18 (1) ◽ pp. 172988142199332
Author(s): Xintao Ding, Boquan Li, Jinbao Wang

Indoor object detection is a demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on the region-based convolutional neural network (CNN) detector and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties into Faster R-CNN to improve detection performance. In detail, we first use mesh grids, formed by the intersections of direct and inverse proportion functions, to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we use 2D geometric constraints to refine the RPN-RoIs, where the 2D constraint for each class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are conducted on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we incorporate depth constraints into GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) for NYUv2. The experimental results show that both GP-Faster and DGP-Faster improve mean average precision.
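The two geometric ingredients can be sketched briefly. Solving h = r·w (direct proportion) together with w·h = s (inverse proportion) places an anchor shape w = sqrt(s/r), h = sqrt(s·r) at each intersection, and a Delaunay triangulation gives a point-in-convex-hull test for refining proposals. The scale/ratio values and the SciPy-based hull test below are illustrative assumptions, not the paper's exact construction.

```python
# A hedged sketch of the two geometric ingredients; parameter values are
# assumptions, not the paper's exact construction.
import numpy as np
from scipy.spatial import Delaunay

def anchor_shapes(scales=(32**2, 64**2, 128**2), ratios=(0.5, 1.0, 2.0)):
    """Intersect h = r*w (direct proportion) with w*h = s (inverse proportion):
    each (s, r) pair yields the anchor shape w = sqrt(s/r), h = sqrt(s*r)."""
    return np.array([(np.sqrt(s / r), np.sqrt(s * r))
                     for s in scales for r in ratios])

def fit_wh_hull(gt_wh):
    """Convex-hull region over (width, height) of a class's ground-truth boxes,
    built as a Delaunay triangulation so membership can be tested quickly."""
    return Delaunay(np.asarray(gt_wh, dtype=float))

def keep_rois_in_hull(rois_wh, hull):
    """Boolean mask: True where an RoI's (w, h) falls inside the hull region;
    find_simplex returns -1 for points outside the triangulation."""
    return hull.find_simplex(np.asarray(rois_wh, dtype=float)) >= 0
```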


Sensors ◽ 2021 ◽ Vol 21 (9) ◽ pp. 2939
Author(s): Yong Hong, Jin Liu, Zahid Jahangir, Sheng He, Qing Zhang

This paper provides an efficient way of addressing the problem of estimating the 6-dimensional (6D) pose of objects from a single RGB image. A quaternion is used to define an object's three-dimensional pose, but the poses represented by q and by -q are equivalent even though the L2 loss between them is very large. We therefore define a new quaternion pose loss function to resolve this ambiguity. Based on this, we designed a new convolutional neural network, named Q-Net, to estimate an object's pose. Because the quaternion output must be a unit vector, a normalization layer is added to Q-Net to keep the pose output on the four-dimensional unit sphere. We also propose a new algorithm, called the Bounding Box Equation, to obtain the 3D translation quickly and effectively from 2D bounding boxes. The algorithm assesses the 3D rotation (R) and 3D translation (t) in an entirely new way from only one RGB image, and it can upgrade any traditional 2D-box prediction algorithm to a 3D prediction model. We evaluated our model on the LineMod dataset, and experiments show that our methodology is more accurate and efficient in terms of L2 loss and computation time.
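The q/-q ambiguity the authors describe can be handled by taking the smaller of the two L2 distances, with a normalization layer keeping the prediction on the unit sphere. Below is a hedged PyTorch sketch of one such loss, not necessarily the exact form used in Q-Net.

```python
# A sign-ambiguity-aware quaternion loss (a sketch, not necessarily Q-Net's
# exact loss): normalize the prediction onto the unit sphere, then take the
# smaller of the L2 distances to q and to -q.
import torch
import torch.nn.functional as F

def quaternion_pose_loss(q_pred, q_gt):
    """q_pred, q_gt: (N, 4) quaternion tensors; q_gt is assumed unit-norm."""
    q_pred = F.normalize(q_pred, dim=-1)        # mirrors the normalization layer
    d_pos = ((q_pred - q_gt) ** 2).sum(dim=-1)  # distance to  q
    d_neg = ((q_pred + q_gt) ** 2).sum(dim=-1)  # distance to -q
    return torch.minimum(d_pos, d_neg).mean()
```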


Electronics ◽ 2021 ◽ Vol 10 (14) ◽ pp. 1737
Author(s): Wooseop Lee, Min-Hee Kang, Jaein Song, Keeyeon Hwang

As automated vehicles are considered one of the most important trends in intelligent transportation systems, various research is being conducted to enhance their safety. In particular, technologies for the design of preventive automated driving systems, such as detection of surrounding objects and estimation of the distance between vehicles, are growing in importance. Object detection is mainly performed with cameras and LiDAR, but due to LiDAR's cost and limited recognition distance, there is an increasing need to improve camera-based recognition, which is comparatively convenient to commercialize. To improve the recognition capability of vehicle-mounted monocular cameras for the design of preventive automated driving systems, this study trained two convolutional neural network (CNN) detectors, Faster regions with CNN (Faster R-CNN) and You Only Look Once (YOLO) V2, to recognize surrounding vehicles in black-box highway driving videos and to estimate the distances to them with the model better suited to automated driving systems. The PASCAL visual object classes (VOC) dataset was also used for model comparison. Faster R-CNN achieved a mean average precision (mAP) of 76.4, similar to YOLO V2's mAP of 78.6, but ran at only 5 frames per second (FPS), far slower than YOLO V2's 40 FPS, and detection with Faster R-CNN proved more difficult. As a result, YOLO V2, which shows better performance in both accuracy and processing speed, was determined to be the more suitable model for automated driving systems and was carried forward to the distance-estimation step. For distance estimation, coordinate values were converted through camera calibration and a perspective transform; with the detection threshold set to 0.7, object detection and distance estimation achieved more than 80% accuracy for near-distance vehicles. This study is expected to help prevent accidents involving automated vehicles, and additional research is expected to provide various accident-prevention alternatives, such as calculating and securing appropriate safety distances depending on vehicle type.
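The distance-estimation step, camera calibration plus a perspective transform mapping image coordinates to the road plane, might look like the following sketch. The four point correspondences and all numeric values are placeholder assumptions, not the study's calibration data.

```python
# Illustrative sketch: a perspective transform obtained from calibration maps
# the bottom-center of a detected bounding box onto the road plane. The four
# correspondences below are placeholder assumptions.
import cv2
import numpy as np

# Four image points (pixels) and matching road-plane points (meters),
# e.g. lane markings with known geometry.
img_pts = np.float32([[420, 720], [860, 720], [600, 450], [680, 450]])
gnd_pts = np.float32([[-1.8, 0.0], [1.8, 0.0], [-1.8, 30.0], [1.8, 30.0]])
H = cv2.getPerspectiveTransform(img_pts, gnd_pts)

def distance_to_vehicle(box):
    """box = (x1, y1, x2, y2); the bottom-center approximates the point where
    the vehicle touches the road, which the homography maps to meters."""
    u = (box[0] + box[2]) / 2.0
    v = box[3]
    gx, gy = cv2.perspectiveTransform(np.float32([[[u, v]]]), H)[0, 0]
    return float(np.hypot(gx, gy))  # straight-line ground distance in meters
```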


2017 ◽ Vol 24 (5) ◽ pp. 1073-1081
Author(s): Ken Chang, Harrison X. Bai, Hao Zhou, Chang Su, Wenya Linda Bi, ...
