Real-time Estimation of Road Surfaces using Fast Monocular Depth Estimation and Normal Vector Clustering

2021 ◽  
Vol 5 (3) ◽  
pp. 206
Author(s):  
Chuho Yi ◽  
Jungwon Cho

Estimating a road surface or planes for AR (augmented reality) or autonomous driving with a camera requires significant computation. Vision sensors measure distance less accurately than other sensor types and require additional estimation algorithms, but a camera has the advantage of extracting information that is difficult to measure with other sensors, such as weather conditions, sign information, and road markings. Various methods differing in sensor type and configuration have been applied. Most existing studies perform feature extraction first and depth estimation afterwards; however, recent studies suggest using deep learning to replace these multiple stages with a single DNN (deep neural network), and methods using a single camera instead of multiple sensors have been proposed. This paper presents a fast and efficient single-camera method that employs a DNN to extract distance information, and proposes a modified method for obtaining real-time surface characteristics from the resulting depth map. First, a DNN estimates the depth map; then, for fast operation, a normal vector is computed at each depth point, and a clustering method connects points whose normals indicate the same plane. Experiments demonstrate the validity of the method and evaluate its computation time.
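
The abstract's core step, computing per-pixel normal vectors from a depth map and grouping pixels whose normals agree, can be illustrated with a short sketch. This is not the authors' implementation; the function names and the simple k-means grouping below are our own stand-ins for whichever clustering scheme the paper uses.

```python
# A minimal sketch, not the paper's code: surface normals from a depth map
# via finite differences, then grouping pixels by normal direction.
import numpy as np

def normals_from_depth(depth):
    """Approximate per-pixel normals from a depth map (H x W, meters)."""
    dzdx = np.gradient(depth, axis=1)   # depth change along x
    dzdy = np.gradient(depth, axis=0)   # depth change along y
    # The normal of the surface z = f(x, y) is (-dz/dx, -dz/dy, 1), normalized.
    n = np.dstack((-dzdx, -dzdy, np.ones_like(depth)))
    n /= np.linalg.norm(n, axis=2, keepdims=True)
    return n

def cluster_normals(normals, k=3, iters=10):
    """Naive k-means on unit normals; pixels in the same cluster lie on
    near-parallel planes (e.g. the road surface)."""
    pts = normals.reshape(-1, 3)
    centers = pts[np.random.choice(len(pts), k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(pts @ centers.T, axis=1)  # cosine similarity
        for j in range(k):
            if np.any(labels == j):
                c = pts[labels == j].mean(axis=0)
                centers[j] = c / np.linalg.norm(c)
    return labels.reshape(normals.shape[:2])
```

A quick heuristic for picking the road cluster from the result is to take the cluster whose mean normal points most nearly upward in the camera frame.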

Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 451 ◽  
Author(s):  
Limin Guan ◽  
Yi Chen ◽  
Guiping Wang ◽  
Xu Lei

Vehicle detection is essential for driverless systems. However, a single-sensor detection mode is no longer sufficient in complex, changing traffic environments. This paper therefore combines a camera and light detection and ranging (LiDAR) to build a vehicle-detection framework characterized by multi-adaptability, high real-time capacity, and robustness. First, a multi-adaptive, high-precision depth-completion method converts the sparse 2D LiDAR depth map into a dense depth map, so that the two sensors are aligned with each other at the data level. Then, the You Only Look Once Version 3 (YOLOv3) real-time object-detection model is applied to both the color image and the dense depth map. Finally, a decision-level fusion method based on bounding-box fusion and improved Dempster-Shafer (D-S) evidence theory merges the two detection results to obtain the final vehicle position and distance, which improves not only the detection accuracy but also the robustness of the whole framework. We evaluated our method on the KITTI dataset and the Waymo Open Dataset, and the results show the effectiveness of the proposed depth-completion method and multi-sensor fusion strategy.
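
The decision-level fusion step rests on Dempster-Shafer evidence combination. Below is a minimal sketch of classic Dempster's rule for two detectors over the frame {vehicle, not vehicle}; the paper's improved D-S variant and its bounding-box fusion are not reproduced here, and the mass values are purely illustrative.

```python
# A hedged sketch of Dempster's rule of combination for two detectors'
# evidence; 'V' = vehicle, 'N' = not vehicle, 'U' = uncertain (whole frame).
def ds_combine(m1, m2):
    """m1, m2: dicts of basic mass assignments, each summing to 1."""
    # Conflict: mass assigned to contradictory hypotheses.
    conflict = m1['V'] * m2['N'] + m1['N'] * m2['V']
    norm = 1.0 - conflict
    fused = {
        'V': (m1['V'] * m2['V'] + m1['V'] * m2['U'] + m1['U'] * m2['V']) / norm,
        'N': (m1['N'] * m2['N'] + m1['N'] * m2['U'] + m1['U'] * m2['N']) / norm,
    }
    fused['U'] = 1.0 - fused['V'] - fused['N']
    return fused

# Example: the camera branch is fairly sure, the LiDAR branch less so.
cam   = {'V': 0.8, 'N': 0.1, 'U': 0.1}
lidar = {'V': 0.6, 'N': 0.1, 'U': 0.3}
print(ds_combine(cam, lidar))  # fused belief in 'V' rises above either input
</code>
```

With the example masses, the fused belief in "vehicle" is about 0.91, higher than either detector alone, which is exactly the behavior that makes evidence fusion attractive for merging two detection branches.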


Sensors ◽  
2019 ◽  
Vol 19 (20) ◽  
pp. 4434 ◽  
Author(s):  
Sangwon Kim ◽  
Jaeyeal Nam ◽  
Byoungchul Ko

Depth estimation is a crucial and fundamental problem in the computer vision field. Conventional methods reconstruct scenes using feature points extracted from multiple images; however, these approaches require multiple images and thus are not easily implemented in various real-time applications. Moreover, the special equipment required by hardware-based approaches using 3D sensors is expensive. Therefore, software-based methods that estimate depth from a single image using machine learning or deep learning are emerging as new alternatives. In this paper, we propose an algorithm that generates a depth map in real time from a single image using an optimized lightweight efficient neural network (L-ENet) instead of physical equipment such as an infrared sensor or multi-view camera. Because depth values are continuous in nature and can produce locally ambiguous results, pixel-wise prediction with ordinal depth range classification is applied in this study. In addition, our method applies various convolution techniques to extract a dense feature map, and greatly reduces the number of parameters by reducing the network layers. Using the proposed L-ENet algorithm, an accurate depth map can be generated quickly from a single image, with depth values close to those of the ground truth and small errors. Experiments confirmed that the proposed L-ENet achieves significantly improved estimation performance over state-of-the-art algorithms for depth estimation from a single image.
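
The ordinal depth range classification mentioned above can be made concrete with a small sketch. The discretization and decoding below follow the general DORN-style recipe rather than L-ENet's exact scheme; the bin count and depth range are assumptions.

```python
# A minimal sketch (assumptions ours, not L-ENet's exact scheme) of ordinal
# depth classification: continuous depth is discretized into K ranges and
# each pixel's depth is decoded from K "deeper than threshold t_k?" scores.
import numpy as np

def sid_thresholds(d_min, d_max, k):
    """Space-increasing discretization: finer bins at near range."""
    return d_min * (d_max / d_min) ** (np.arange(k + 1) / k)

def decode_ordinal(probs, thresholds):
    """probs: H x W x K, probs[..., k] = P(depth > thresholds[k]).
    The predicted bin is the number of thresholds a pixel is believed
    to exceed; the depth estimate is that bin's center."""
    bin_idx = (probs > 0.5).sum(axis=2)              # ordinal decoding
    centers = 0.5 * (thresholds[:-1] + thresholds[1:])
    return centers[np.clip(bin_idx, 0, len(centers) - 1)]

t = sid_thresholds(0.5, 80.0, k=64)    # 0.5 m to 80 m, 64 ordinal ranges
probs = np.random.rand(240, 320, 64)   # stand-in for the network's output
depth = decode_ordinal(probs, t)       # H x W metric depth map
```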


Author(s):  
C. K. Toth ◽  
Z. Koppanyi ◽  
M. G. Lenzano

Abstract. The ongoing proliferation of remote sensing technologies in the consumer market has been rapidly reshaping the geospatial data acquisition world, and subsequently the data processing and information dissemination processes. Smartphones have clearly established themselves as the primary crowdsourced data generators in recent years, and provide an incredible volume of remotely sensed data with fairly good georeferencing. Besides the potential to map the environment of smartphone users, they provide information to monitor the dynamic content of the object space. For example, real-time traffic monitoring is one of the best-known and most widely used real-time crowdsensed applications, where the smartphones in vehicles jointly contribute to an unprecedentedly accurate traffic flow estimation. We are now witnessing another milestone, as driverless vehicle technologies become another major source of crowdsensed data. Due to safety concerns, the requirements for sensing are higher, as the vehicles must sense other vehicles and the road infrastructure under any condition, not just in daylight and favorable weather, and at very high speed. Furthermore, the sensing is based on redundant and complementary sensor streams to achieve a robust object space reconstruction, needed to avoid collisions and maintain normal travel patterns. At present, the remotely sensed data in assisted and autonomous vehicles are discarded, or only partially recorded for R&D purposes. However, in the long run, as vehicle-to-vehicle (V2V) and vehicle-to-infrastructure (V2I) communication technologies mature, recording these data will become commonplace and will provide an excellent source of geospatial information for road mapping, traffic monitoring, etc. This paper reviews the key characteristics of crowdsourced vehicle data based on experimental data, and then the processing aspects, including the Data Science and Deep Learning components.


Author(s):  
Babing Ji ◽  
Qixin Cao

Purpose: This paper aims to propose a new solution for real-time 3D perception with a monocular camera. Most industrial robot solutions use active sensors to acquire 3D structure information, which limits their applications to indoor scenarios. Using only a monocular camera, some state-of-the-art methods provide up-to-scale 3D structure information, but the scale of the corresponding objects remains uncertain.
Design/methodology/approach: First, high-accuracy, scale-informed camera poses and a sparse 3D map are provided by leveraging ORB-SLAM and a marker. Second, for each frame captured by the camera, a specially designed depth-estimation pipeline computes the corresponding 3D structure, a depth map, in real time. Finally, each depth map is integrated into a volumetric scene model. A feedback module lets users visualize the intermediate scene surface in real time.
Findings: The system provides robust tracking performance and compelling results. The implementation runs at nearly 25 Hz on a mainstream laptop using parallel computation techniques.
Originality/value: A new solution for 3D perception using a monocular camera by leveraging ORB-SLAM. The results are visually comparable to those of active-sensor systems such as ElasticFusion in small scenes. The system is both efficient and easy to implement, and the algorithms and specific configurations involved are described in detail.
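
The "depth map is integrated into a volumetric scene model" step is typically a KinectFusion-style TSDF update. The following is a simplified sketch under that assumption, not the paper's implementation; the function name and parameters are ours.

```python
# A simplified sketch of fusing one depth map into a TSDF volume: each voxel
# stores a truncated signed distance to the surface, updated by a weighted
# running average across frames (KinectFusion-style integration).
import numpy as np

def integrate_tsdf(tsdf, weight, voxel_pts, depth, K, T_wc, trunc=0.05):
    """voxel_pts: N x 3 voxel centers (world frame); depth: H x W map (m);
    K: 3x3 camera intrinsics; T_wc: 4x4 world-to-camera pose."""
    # Transform voxel centers into the camera frame and project them.
    pts_c = (T_wc[:3, :3] @ voxel_pts.T + T_wc[:3, 3:4]).T
    z = np.maximum(pts_c[:, 2], 1e-6)            # avoid divide-by-zero
    u = np.round(K[0, 0] * pts_c[:, 0] / z + K[0, 2]).astype(int)
    v = np.round(K[1, 1] * pts_c[:, 1] / z + K[1, 2]).astype(int)
    h, w = depth.shape
    ok = (pts_c[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    d = depth[v[ok], u[ok]]                       # measured depth per voxel
    sdf = d - pts_c[ok, 2]                        # signed distance along ray
    vis = sdf > -trunc                            # in front of / near surface
    val = np.clip(sdf[vis] / trunc, -1.0, 1.0)
    idx = np.flatnonzero(ok)[vis]
    # Weighted running average of the truncated signed distance.
    tsdf[idx] = (tsdf[idx] * weight[idx] + val) / (weight[idx] + 1)
    weight[idx] += 1
```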


2020 ◽  
Vol 6 (1) ◽  
pp. 1-11
Author(s):  
DMS Zaman ◽  
Md Hasan Maruf ◽  
Md Ashiqur Rahman ◽  
Jannatul Ferdousy ◽  
ASM Shihavuddin

Real-time estimation of nutrition intake from regular food items using mobile-based applications could be a breakthrough in creating public awareness of the risks of overeating or faulty food choices. The bottleneck in implementing such systems is estimating the depth of the food items effectively, which is essential for calculating food volumes. The volume and density of food items can then be used to estimate the weight of the food eaten and its corresponding nutrition content. Without dedicated depth sensors, which are available only in very advanced and expensive mobile devices, it is very difficult to estimate the depth of an object from a single camera. This work investigates the possibility of using regular cameras for the same purpose with a specific frame structure. We propose a controlled camera setup that acquires overlapping images of the food from pre-calibrated positions to estimate depth. The results were compared with the Kinect device's depth measurements to show the efficiency of the proposed method. We further investigated the optimum number of camera positions and their corresponding angles and distances from the object, in order to propose the best configuration for such a controlled image-acquisition system with regular mobile cameras. Overall, the proposed method presents a low-cost solution to the depth-estimation problem and opens up possibilities for mobile-based dietary-assessment apps for various health-related problems.
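
The geometry underlying the proposed controlled multi-position setup is ordinary calibrated stereo: once two overlapping views a known baseline apart are rectified, depth follows from disparity as Z = f * B / d. The sketch below uses OpenCV's semi-global matcher as a stand-in for whatever matcher the authors used; the focal length, baseline, and file names are illustrative.

```python
# A hedged sketch of the core geometry: depth from disparity between two
# calibrated, rectified views a known baseline apart.
import cv2
import numpy as np

f_px = 1400.0   # focal length in pixels (assumed calibration value)
B_m = 0.10      # 10 cm baseline between the two camera positions (assumed)

left = cv2.imread('food_left.png', cv2.IMREAD_GRAYSCALE)
right = cv2.imread('food_right.png', cv2.IMREAD_GRAYSCALE)

stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128,
                               blockSize=7)
disp = stereo.compute(left, right).astype(np.float32) / 16.0  # fixed-point
depth = np.where(disp > 0, f_px * B_m / disp, 0)  # meters; 0 = no match

# Volume would then follow by summing per-pixel heights over the plate region.
```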


2021 ◽  
Author(s):  
Yupeng Xie ◽  
Sarah Fachada ◽  
Daniele Bonatto ◽  
Mehrdad Teratani ◽  
Gauthier Lafruit

Depth-Image-Based Rendering (DIBR) can synthesize a virtual view image from a set of multiview images and corresponding depth maps. However, this requires an accurate depth-map estimation that incurs a high computational cost of several minutes per frame in DERS (MPEG-I's Depth Estimation Reference Software), even on a high-end computer. LiDAR cameras can thus be an alternative to DERS in real-time DIBR applications. We compare the quality of a low-cost LiDAR camera, the Intel RealSense LiDAR L515, adequately calibrated and configured, against DERS, using MPEG-I's Reference View Synthesizer (RVS). In IV-PSNR, the LiDAR camera reaches 32.2 dB view-synthesis quality with a 15 cm camera baseline and 40.3 dB with a 2 cm baseline. Though DERS outperforms the LiDAR camera by 4.2 dB, the latter provides a better quality-performance trade-off. Moreover, visual inspection demonstrates that the LiDAR's virtual views have even slightly higher quality than DERS's in most tested low-texture scene areas, except at object borders. Overall, we highly recommend using LiDAR cameras over advanced depth-estimation methods (like DERS) in real-time DIBR applications. Nevertheless, this requires delicate calibration with multiple tools, further detailed in the paper.
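
For readers unfamiliar with DIBR, the principle RVS builds on can be shown in toy form: pixels of a reference view are forward-warped into a virtual camera using their depth. The sketch assumes a rectified, pure-translation setup with strictly positive depth; RVS itself handles general camera poses, blending, and inpainting.

```python
# A toy sketch of the DIBR principle, not RVS itself: forward-warp a view
# to a camera shifted baseline_m to the right, with a z-buffer for occlusion.
import numpy as np

def dibr_shift(img, depth, f_px, baseline_m):
    """Warp img (H x W x 3) using depth (H x W, meters, > 0).
    disparity = f * B / Z; nearer pixels shift more and win z-buffer ties."""
    h, w = depth.shape
    out = np.zeros_like(img)
    zbuf = np.full((h, w), np.inf)
    us = np.arange(w)
    for v in range(h):
        u2 = np.round(us - f_px * baseline_m / depth[v]).astype(int)
        for u in range(w):
            t = u2[u]
            if 0 <= t < w and depth[v, u] < zbuf[v, t]:  # keep nearest pixel
                zbuf[v, t] = depth[v, u]
                out[v, t] = img[v, u]
    return out  # holes (zeros) would be inpainted in a real pipeline
```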


Author(s):  
Hyeji Kim ◽  
Jinyeon Lim ◽  
Yeongmin Lee ◽  
Woojin Yun ◽  
Young-Gyu Kim ◽  
...  

2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
Zhiwei Tang ◽  
Bin Li ◽  
Huosheng Li ◽  
Zheng Xu

Depth estimation is a key technology in stereo vision. A real-time depth map can be obtained with hardware, but hardware such as an FPGA cannot implement algorithms as complicated as software can, owing to restrictions of the hardware structure. Consequently, some incorrect stereo matches inevitably occur during hardware depth estimation. To solve this problem, a postprocessing function is designed in this paper. After a matching-cost uniqueness test, both left-right and right-left consistency-check solutions are implemented; the cavities in the depth maps are then filled with valid depth values on the basis of the right-left consistency check. The experimental results show that depth-map extraction and the postprocessing function can be implemented in real time in the same system, and, moreover, that the quality of the resulting depth maps is satisfactory.
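
The left-right consistency check and hole filling described above can be sketched in software as follows. This is an illustrative analogue of the paper's FPGA post-processing, not its hardware design, and the interpolation-based filling is our simplification.

```python
# A compact software sketch: left-right consistency check on two disparity
# maps, then filling the rejected pixels from valid neighbors on each row.
import numpy as np

def lr_consistency(disp_l, disp_r, tol=1):
    """Mark disp_l pixels whose match in the right view disagrees by > tol."""
    h, w = disp_l.shape
    u = np.arange(w)
    valid = np.zeros((h, w), dtype=bool)
    for v in range(h):
        ur = u - disp_l[v].astype(int)           # matched column in right map
        ok = (ur >= 0) & (ur < w)
        valid[v, ok] = np.abs(disp_l[v, ok] - disp_r[v, ur[ok]]) <= tol
    return valid

def fill_holes(disp, valid):
    """Fill rejected pixels by interpolating the valid disparities along
    each scan line (endpoints clamp to the nearest valid value)."""
    out = disp.astype(float).copy()
    out[~valid] = np.nan
    for v in range(out.shape[0]):
        row = out[v]
        idx = np.where(~np.isnan(row))[0]
        if idx.size:
            out[v] = np.interp(np.arange(row.size), idx, row[idx])
    return out
```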


2021 ◽  
Vol 22 (4) ◽  
pp. 461-470
Author(s):  
Jozsef Suto

Abstract Autonomous navigation is important not only in autonomous cars but also in other transportation systems. In many applications, an autonomous vehicle has to follow the curvature of a real or artificial road, in other words, lane lines. In such applications, lane detection is the key. In this paper, we present a real-time lane-line tracking algorithm designed mainly for mini vehicles with relatively low computation capacity and a single camera sensor. The proposed algorithm exploits computer vision techniques in combination with digital filtering. To demonstrate the performance of the method, experiments were conducted on an indoor, self-made test track where the effect of several external influencing factors can be observed. Experimental results show that the proposed algorithm works well regardless of shadows, bends, reflections, and lighting changes.
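
The combination of classical computer vision and digital filtering that the abstract describes might look like the following sketch: Canny edges and probabilistic Hough lines yield a per-frame lane angle, which a first-order IIR (exponential) filter smooths across frames. This is a generic recipe, not the paper's exact algorithm; all thresholds are assumptions.

```python
# A minimal sketch of lane tracking: edges -> Hough line segments ->
# median lane angle -> first-order low-pass filter across frames.
import cv2
import numpy as np

alpha = 0.2            # filter coefficient: smaller = smoother, more lag
angle_filt = None      # filtered lane-angle state carried across frames

def lane_angle(frame):
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                            minLineLength=30, maxLineGap=10)
    if lines is None:
        return None
    angles = [np.arctan2(y2 - y1, x2 - x1)
              for x1, y1, x2, y2 in lines[:, 0]]
    return float(np.median(angles))   # median is robust to outlier segments

def track(frame):
    """One tracking step: detect, then exponentially smooth the angle."""
    global angle_filt
    a = lane_angle(frame)
    if a is not None:
        angle_filt = a if angle_filt is None else \
            (1 - alpha) * angle_filt + alpha * a
    return angle_filt
```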

