monocular depth
Recently Published Documents


TOTAL DOCUMENTS: 470 (FIVE YEARS: 342)

H-INDEX: 26 (FIVE YEARS: 12)

2022 ◽  
Author(s):  
Xiao Lu ◽  
Haoran Sun ◽  
Xiuling Wang ◽  
Zhiguo Zhang ◽  
Haixia Wang

2022 ◽  
pp. 1-1
Author(s):  
Jipeng Wu ◽  
Rongrong Ji ◽  
Qiang Wang ◽  
Shengchuan Zhang ◽  
Xiaoshuai Sun ◽  
...  

SoftwareX ◽  
2022 ◽  
Vol 17 ◽  
pp. 100956
Author(s):  
Kirill Muravyev ◽  
Andrey Bokovoy ◽  
Konstantin Yakovlev

Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 76
Author(s):  
Jongsub Yu ◽  
Hyukdoo Choi

This paper presents an object detector with depth estimation using monocular camera images. Previous detection studies have typically focused on detecting objects with 2D or 3D bounding boxes. A 3D bounding box consists of a center point, size parameters, and heading information. However, predicting such complex output compositions generally lowers model performance, and it is not necessary for risk assessment in autonomous driving. We therefore focus on predicting a single depth per object, which is essential for risk assessment in autonomous driving. Our network architecture is based on YOLOv4, a fast and accurate one-stage object detector, to whose output layer we add an additional channel for depth estimation. To train depth prediction, we extract the closest depth from the 3D bounding box coordinates of the ground-truth labels in the dataset. We compare our model with recent 3D object detection studies on the KITTI object detection benchmark and show that it achieves higher detection performance and detection speed than existing models, with comparable depth accuracy.
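The depth target described in this abstract, the closest depth of an object's ground-truth 3D box, can be sketched as below. The KITTI-style box parameterization (camera-frame center, size, and heading) and all function names are assumptions for illustration, not the authors' code.

```python
import math

def box_corner_depths(center, size, heading):
    """Camera-frame depths (z) of the corners of a 3D bounding box.

    center  -- (x, y, z) of the box center in camera coordinates
    size    -- (length, width, height)
    heading -- rotation around the vertical axis, in radians

    Depth depends only on the horizontal footprint offsets, so the
    four footprint corners (each shared by a top and bottom corner)
    cover all distinct depth values.
    """
    _, _, cz = center
    l, w, _ = size
    c, s = math.cos(heading), math.sin(heading)
    depths = []
    for dx in (-l / 2, l / 2):
        for dz in (-w / 2, w / 2):
            # Rotate the footprint offset by the heading angle.
            depths.append(cz + s * dx + c * dz)
    return depths

def closest_depth(center, size, heading):
    """Single per-object depth label: the nearest corner of its 3D box."""
    return min(box_corner_depths(center, size, heading))
```

For a box centered 10 m ahead with no rotation and a width of 2 m, the nearest corner (and thus the training label) lies at 9 m.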


Electronics ◽  
2021 ◽  
Vol 10 (24) ◽  
pp. 3153
Author(s):  
Shouying Wu ◽  
Wei Li ◽  
Binbin Liang ◽  
Guoxin Huang

Self-supervised monocular depth estimation has become an important branch of computer-vision depth-estimation tasks. However, the depth-estimation errors arising from object-edge depth pulling or occlusion remain unsolved: the grayscale discontinuity at object edges leads to relatively high depth uncertainty for pixels in these regions. We improve geometric edge predictions by taking uncertainty into account in the depth-estimation task. To this end, we explore how uncertainty affects this task and propose a new self-supervised monocular depth estimation technique based on multi-scale uncertainty. In addition, we introduce a teacher–student architecture and investigate the impact of different teacher networks on the depth and uncertainty results. We evaluate our paradigm in detail on the standard KITTI dataset. Compared with the Monodepth2 baseline, the accuracy of our method increases from 87.7% to 88.2%, the AbsRel error decreases from 0.115 to 0.110, the SqRel error decreases from 0.903 to 0.822, and the RMSE decreases from 4.863 to 4.686. Our approach mitigates texture replication and inaccurate object boundaries, producing sharper and smoother depth images.
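The general mechanism of folding a predicted per-pixel uncertainty into the training loss can be sketched with the standard heteroscedastic (Laplacian) formulation below. This is a generic sketch of the idea, not the paper's exact multi-scale formulation; the function name and weighting are assumptions.

```python
import math

def uncertainty_weighted_loss(pred, target, log_sigma):
    """Heteroscedastic L1 loss for one pixel:

        |pred - target| * exp(-log_sigma) + log_sigma

    A large predicted log_sigma down-weights the residual, e.g. at
    grayscale-discontinuous object edges where depth is ambiguous,
    while the additive log term penalizes inflating sigma everywhere.
    """
    residual = abs(pred - target)
    return residual * math.exp(-log_sigma) + log_sigma
```

With zero uncertainty (log_sigma = 0) this reduces to a plain L1 term; for a large residual, predicting a higher uncertainty yields a lower total loss, which is what lets the network flag unreliable edge pixels.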


2021 ◽  
Author(s):  
Feng Wei ◽  
XingHui Yin ◽  
Jie Shen ◽  
HuiBin Wang

Abstract With the development of deep learning, the accuracy and effectiveness of monocular depth estimation algorithms have improved greatly, but existing algorithms demand substantial computing resources. Deploying them on UAVs and other small robots is therefore an urgent need. Building on a fully convolutional neural network and the KITTI dataset, this paper uses depthwise separable convolutions to optimize the network architecture, reduce the number of training parameters, and improve computing speed. Experimental results show that our method is effective and offers a useful reference for the development of monocular depth estimation algorithms.
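The parameter savings that motivate depthwise separable convolutions are easy to verify by counting weights. The sketch below compares a standard k×k convolution against its depthwise-plus-pointwise factorization; layer sizes in the usage note are illustrative, not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k (one filter per input channel) followed by a
    1 x 1 pointwise convolution that mixes channels."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise
```

For example, a 3×3 layer mapping 128 to 256 channels uses 294,912 weights in standard form but only 33,920 in separable form, roughly an 8.7× reduction, which is why the factorization suits UAV-class hardware.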


2021 ◽  
Author(s):  
Zhimin Zhang ◽  
Jianzhong Qiao ◽  
Shukuan Lin ◽  
...  

Depth and pose information are basic concerns in robotics, autonomous driving, and virtual reality, and remain focal and difficult issues in computer vision research. Supervised learning of monocular depth and pose estimation is not feasible in environments where labeled data are scarce. Self-supervised methods on monocular video can learn using only photometric constraints, without expensive ground-truth depth labels, but this results in an inefficient training process and suboptimal estimation accuracy. To solve these problems, this paper proposes a monocular weakly supervised depth and pose estimation method based on multi-information fusion. First, we design a high-precision stereo matching method that generates depth and pose data to serve as "Ground Truth" labels, addressing the difficulty of obtaining true labels. Then, we construct a multi-information fusion network model based on these "Ground Truth" labels, the video sequence, and IMU information to improve estimation accuracy. Finally, we design a loss function combining supervised cues, based on the "Ground Truth" labels, with self-supervised cues to optimize our model. In the testing phase, the network separately outputs high-precision depth and pose data from a monocular video sequence. The resulting model outperforms mainstream monocular depth and pose estimation methods, as well as a partial stereo matching method, on the challenging KITTI dataset while using only a small amount of real training data (200 pairs).
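The loss design described here, supervised cues from pseudo "Ground Truth" labels combined with self-supervised photometric cues, amounts to a weighted sum of per-term losses. The sketch below shows that combination; the term names and weights are assumptions for illustration, not the paper's values.

```python
def total_loss(photometric, depth_supervised, pose_supervised,
               w_photo=1.0, w_depth=0.5, w_pose=0.5):
    """Combine self-supervised and weakly supervised training cues.

    photometric      -- self-supervised photometric reconstruction loss
    depth_supervised -- error against stereo-derived pseudo depth labels
    pose_supervised  -- error against stereo-derived pseudo pose labels
    The weights balance how strongly the pseudo "Ground Truth" labels
    steer training relative to the photometric constraint.
    """
    return (w_photo * photometric
            + w_depth * depth_supervised
            + w_pose * pose_supervised)
```

Setting a supervised weight to zero recovers a purely self-supervised objective, which is a convenient way to ablate how much the pseudo labels contribute.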

