SUW-Learn: Joint Supervised, Unsupervised, Weakly Supervised Deep Learning for Monocular Depth Estimation

Author(s):  
Haoyu Ren ◽  
Aman Raj ◽  
Mostafa El-Khamy ◽  
Jungwon Lee
Author(s):  
L. Madhuanand ◽  
F. Nex ◽  
M. Y. Yang

Abstract. Depth is an essential component for various scene understanding tasks and for reconstructing the 3D geometry of the scene. Estimating depth from stereo images requires multiple views of the same scene to be captured which is often not possible when exploring new environments with a UAV. To overcome this monocular depth estimation has been a topic of interest with the recent advancements in computer vision and deep learning techniques. This research has been widely focused on indoor scenes or outdoor scenes captured at ground level. Single image depth estimation from aerial images has been limited due to additional complexities arising from increased camera distance, wider area coverage with lots of occlusions. A new aerial image dataset is prepared specifically for this purpose combining Unmanned Aerial Vehicles (UAV) images covering different regions, features and point of views. The single image depth estimation is based on image reconstruction techniques which uses stereo images for learning to estimate depth from single images. Among the various available models for ground-level single image depth estimation, two models, 1) a Convolutional Neural Network (CNN) and 2) a Generative Adversarial model (GAN) are used to learn depth from aerial images from UAVs. These models generate pixel-wise disparity images which could be converted into depth information. The generated disparity maps from these models are evaluated for its internal quality using various error metrics. The results show higher disparity ranges with smoother images generated by CNN model and sharper images with lesser disparity range generated by GAN model. The produced disparity images are converted to depth information and compared with point clouds obtained using Pix4D. It is found that the CNN model performs better than GAN and produces depth similar to that of Pix4D. This comparison helps in streamlining the efforts to produce depth from a single aerial image.


Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2272 ◽  
Author(s):  
Faisal Khan ◽  
Saqib Salahuddin ◽  
Hossein Javidnia

Monocular depth estimation from Red-Green-Blue (RGB) images is a well-studied ill-posed problem in computer vision which has been investigated intensively over the past decade using Deep Learning (DL) approaches. The recent approaches for monocular depth estimation mostly rely on Convolutional Neural Networks (CNN). Estimating depth from two-dimensional images plays an important role in various applications including scene reconstruction, 3D object-detection, robotics and autonomous driving. This survey provides a comprehensive overview of this research topic including the problem representation and a short description of traditional methods for depth estimation. Relevant datasets and 13 state-of-the-art deep learning-based approaches for monocular depth estimation are reviewed, evaluated and discussed. We conclude this paper with a perspective towards future research work requiring further investigation in monocular depth estimation challenges.


2020 ◽  
Vol 2020 (14) ◽  
pp. 377-1-377-7
Author(s):  
Bruno Artacho ◽  
Nilesh Pandey ◽  
Andreas Savakis

Monocular depth estimation is an important task in scene understanding with applications to pose, segmentation and autonomous navigation. Deep Learning methods relying on multilevel features are currently used for extracting local information that is used to infer depth from a single RGB image. We present an efficient architecture that utilizes the features from multiple levels with fewer connections compared to previous networks. Our model achieves comparable scores for monocular depth estimation with better efficiency on the memory requirements and computational burden.


2020 ◽  
Vol 63 (9) ◽  
pp. 1612-1627
Author(s):  
ChaoQiang Zhao ◽  
QiYu Sun ◽  
ChongZhen Zhang ◽  
Yang Tang ◽  
Feng Qian

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1437
Author(s):  
Ryota Yoneyama ◽  
Angel J. Duran ◽  
Angel P. del Pobil

Deep learning is the mainstream paradigm in computer vision and machine learning, but performance is usually not as good as expected when used for applications in robot vision. The problem is that robot sensing is inherently active, and often, relevant data is scarce for many application domains. This calls for novel deep learning approaches that can offer a good performance at a lower data consumption cost. We address here monocular depth estimation in warehouse automation with new methods and three different deep architectures. Our results suggest that the incorporation of sensor models and prior knowledge relative to robotic active vision, can consistently improve the results and learning performance from fewer than usual training samples, as compared to standard data-driven deep learning.


2021 ◽  
Vol 136 ◽  
pp. 103701
Author(s):  
Raul de Queiroz Mendes ◽  
Eduardo Godinho Ribeiro ◽  
Nicolas dos Santos Rosa ◽  
Valdir Grassi

Sign in / Sign up

Export Citation Format

Share Document