DEEP LEARNING FOR MONOCULAR DEPTH ESTIMATION FROM UAV IMAGES

Abstract. Depth is an essential component for various scene understanding tasks and for reconstructing the 3D geometry of a scene. Estimating depth from stereo images requires multiple views of the same scene to be captured, which is often not possible when exploring new environments with a UAV. To overcome this, monocular depth estimation has become a topic of interest with the recent advancements in computer vision and deep learning techniques. This research has largely focused on indoor scenes or outdoor scenes captured at ground level. Single image depth estimation from aerial images has been limited due to additional complexities arising from increased camera distance and wider area coverage with many occlusions. A new aerial image dataset is prepared specifically for this purpose, combining Unmanned Aerial Vehicle (UAV) images covering different regions, features and points of view. The single image depth estimation is based on image reconstruction techniques, which use stereo images to learn to estimate depth from single images. Among the various available models for ground-level single image depth estimation, two models, 1) a Convolutional Neural Network (CNN) and 2) a Generative Adversarial Network (GAN), are used to learn depth from aerial images from UAVs. These models generate pixel-wise disparity images which can be converted into depth information. The generated disparity maps from these models are evaluated for their internal quality using various error metrics. The results show higher disparity ranges with smoother images generated by the CNN model and sharper images with a smaller disparity range generated by the GAN model. The produced disparity images are converted to depth information and compared with point clouds obtained using Pix4D. It is found that the CNN model performs better than the GAN model and produces depth similar to that of Pix4D. This comparison helps in streamlining the efforts to produce depth from a single aerial image.
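The abstract states that the predicted disparity images are converted to depth but does not give the conversion. A minimal sketch, assuming a rectified stereo pair and the standard pinhole relation depth = focal length × baseline / disparity, is shown below. The function name, the focal length, and the baseline values are hypothetical and are not taken from the paper; disparity is assumed to be expressed in pixels.

```python
from typing import List, Optional


def disparity_to_depth(
    disparity: List[List[float]],
    focal_length_px: float,
    baseline_m: float,
    min_disparity: float = 1e-6,
) -> List[List[Optional[float]]]:
    """Convert a disparity map (pixels) to depth (metres).

    Assumes a rectified stereo geometry, where depth = f * B / d.
    Pixels with disparity at or below `min_disparity` are marked as
    None, because the depth is undefined (or effectively infinite).
    """
    depth = []
    for row in disparity:
        depth.append([
            focal_length_px * baseline_m / d if d > min_disparity else None
            for d in row
        ])
    return depth


# Hypothetical 2x2 disparity map and camera parameters (illustrative only).
disp = [[8.0, 4.0], [0.0, 2.0]]
depth = disparity_to_depth(disp, focal_length_px=1000.0, baseline_m=0.5)
print(depth)
```

Because depth is inversely proportional to disparity, small disparity errors on distant surfaces translate into large depth errors, which is one reason the increased camera distance in aerial imagery makes the problem harder.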

Efficient Multilevel Architecture for Depth Estimation from a Single Image

Electronic Imaging ◽ 10.2352/issn.2470-1173.2020.14.coimg-377 ◽ 2020 ◽ Vol 2020 (14) ◽ pp. 377-1-377-7

Author(s): Bruno Artacho ◽ Nilesh Pandey ◽ Andreas Savakis

Keyword(s): Deep Learning ◽ Autonomous Navigation ◽ Depth Estimation ◽ Local Information ◽ Single Image ◽ Learning Methods ◽ Computational Burden ◽ Multiple Levels ◽ Monocular Depth ◽ RGB Image

Monocular depth estimation is an important task in scene understanding with applications to pose, segmentation and autonomous navigation. Deep Learning methods relying on multilevel features are currently used for extracting local information that is used to infer depth from a single RGB image. We present an efficient architecture that utilizes the features from multiple levels with fewer connections compared to previous networks. Our model achieves comparable scores for monocular depth estimation with better efficiency on the memory requirements and computational burden.

A Novel 3D-Unet Deep Learning Framework Based on High-Dimensional Bilateral Grid for Edge Consistent Single Image Depth Estimation

2020 International Conference on 3D Immersion (IC3D) ◽ 10.1109/ic3d51119.2020.9376327 ◽ 2020

Author(s): Mansi Sharma ◽ Abheesht Sharma ◽ Kadvekar Rohit Tushar ◽ Avinash Panneer

Keyword(s): Deep Learning ◽ Depth Estimation ◽ High Dimensional ◽ Single Image ◽ Learning Framework ◽ Image Depth

Monocular Depth Estimation using Transfer learning-An Overview

E3S Web of Conferences ◽ 10.1051/e3sconf/202130901069 ◽ 2021 ◽ Vol 309 ◽ pp. 01069

Author(s): K. Swaraja ◽ V. Akshitha ◽ K. Pranav ◽ B. Vyshnavi ◽ V. Sai Akhil ◽ ...

Keyword(s): Deep Learning ◽ Transfer Learning ◽ Deep Neural Networks ◽ Depth Estimation ◽ Depth Information ◽ Learning Approaches ◽ Learning Network ◽ Depth Maps ◽ Ill Posed ◽ Monocular Depth

Depth estimation is a computer vision technique that is critical for autonomous systems to sense their surroundings and predict their own state. Traditional estimation approaches, such as structure from motion and stereo vision matching, rely on feature correspondences across several views to provide depth information. However, the depth maps they predict are sparse. Inferring depth information from a single image (monocular depth estimation) is an ill-posed problem, for which a substantial body of deep learning approaches has recently been proposed. Monocular depth estimation with deep learning has attracted a lot of interest in recent years, thanks to the rapid development of deep neural networks, and numerous strategies have been developed to address the problem. In this study, we aim to give a comprehensive assessment of the methodologies commonly used in monocular depth estimation and to look at recent advances in deep learning-based approaches. We first go through the various depth estimation techniques and datasets for monocular depth estimation. We then offer an overview of multiple deep learning methods that use transfer learning and of their network designs, including several combinations of encoders and decoders. In addition, deep learning-based monocular depth estimation approaches and models are classified. Finally, the use of transfer learning approaches for monocular depth estimation is illustrated.

Monocular Depth Estimation with Joint Attention Feature Distillation and Wavelet-Based Loss Function

Sensors ◽ 10.3390/s21010054 ◽ 2020 ◽ Vol 21 (1) ◽ pp. 54

Author(s): Peng Liu ◽ Zonghua Zhang ◽ Zhaozong Meng ◽ Nan Gao

Keyword(s): Joint Attention ◽ Loss Function ◽ Depth Estimation ◽ Depth Information ◽ 3D Vision ◽ Network Training ◽ Crucial Component ◽ Benchmark Datasets ◽ Ill Posed ◽ Monocular Depth

Depth estimation is a crucial component in many 3D vision applications. Monocular depth estimation is gaining increasing interest due to flexible use and extremely low system requirements, but inherently ill-posed and ambiguous characteristics still cause unsatisfactory estimation results. This paper proposes a new deep convolutional neural network for monocular depth estimation. The network applies joint attention feature distillation and wavelet-based loss function to recover the depth information of a scene. Two improvements were achieved, compared with previous methods. First, we combined feature distillation and joint attention mechanisms to boost feature modulation discrimination. The network extracts hierarchical features using a progressive feature distillation and refinement strategy and aggregates features using a joint attention operation. Second, we adopted a wavelet-based loss function for network training, which improves loss function effectiveness by obtaining more structural details. The experimental results on challenging indoor and outdoor benchmark datasets verified the proposed method’s superiority compared with current state-of-the-art methods.

PDANet: Self-Supervised Monocular Depth Estimation Using Perceptual and Data Augmentation Consistency

Applied Sciences ◽ 10.3390/app11125383 ◽ 2021 ◽ Vol 11 (12) ◽ pp. 5383

Author(s): Huachen Gao ◽ Xiaoyu Liu ◽ Meixia Qu ◽ Shijie Huang

Keyword(s): Data Augmentation ◽ State Of The Art ◽ Depth Estimation ◽ Input Image ◽ Depth Information ◽ Disparity Map ◽ Estimation Model ◽ Absolute Relative Error ◽ Texture Region ◽ Monocular Depth

In recent studies, self-supervised learning methods have been explored for monocular depth estimation. They minimize an image reconstruction loss rather than using depth information as a supervisory signal. However, existing methods usually assume that corresponding points in different views have the same color, which leads to unreliable unsupervised signals and ultimately degrades the reconstruction loss during training. Meanwhile, in low-texture regions, these methods cannot predict the disparity of pixels correctly because too few features are extracted. To solve these issues, we propose a network, PDANet, that integrates perceptual consistency and data augmentation consistency, which are more reliable unsupervised signals, into a regular unsupervised depth estimation model. Specifically, we apply a reliable data augmentation mechanism to minimize the loss between the disparity maps generated from the original image and from the augmented image, which enhances the robustness of the prediction to color fluctuation. At the same time, we aggregate the features of different layers extracted by a pre-trained VGG16 network to explore the higher-level perceptual differences between the input image and the generated one. Ablation studies demonstrate the effectiveness of each component, and PDANet shows high-quality depth estimation results on the KITTI benchmark, improving on the state-of-the-art method from 0.114 to 0.084, as measured by absolute relative error for depth estimation.
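The abstract reports its result as absolute relative error. As a rough illustration of that metric, the sketch below computes the mean of |predicted − ground truth| / ground truth over pixels that have a valid ground-truth depth. The function name and the sample values are hypothetical, and the exact masking and depth-capping rules used in the KITTI evaluation are not reproduced here.

```python
from typing import Sequence


def abs_rel_error(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Mean absolute relative error over pixels with valid ground truth.

    Pixels whose ground-truth depth is not positive are skipped, since
    sparse ground truth commonly marks missing measurements with 0.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    total = 0.0
    count = 0
    for p, g in zip(pred, gt):
        if g > 0:
            total += abs(p - g) / g
            count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return total / count


# Hypothetical flattened depth values in metres; the last pixel has no
# ground truth and is ignored.
predicted = [10.0, 22.0, 5.0]
ground_truth = [10.0, 20.0, 0.0]
error = abs_rel_error(predicted, ground_truth)
print(round(error, 3))
```

Lower is better: a value of 0.084 means the predicted depth deviates from the ground truth by about 8.4% on average.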

Single image depth estimation using joint local-global features

2016 23rd International Conference on Pattern Recognition (ICPR) ◽ 10.1109/icpr.2016.7899721 ◽ 2016 ◽ Cited By ~ 2

Author(s): H. Mohaghegh ◽ N. Karimi ◽ S.M.R. Soroushmehr ◽ S. Samavi ◽ K. Najarian

Keyword(s): Depth Estimation ◽ Single Image ◽ Global Features ◽ Image Depth

UW-GAN: Single Image Depth Estimation and Image Enhancement for Underwater Images

IEEE Transactions on Instrumentation and Measurement ◽ 10.1109/tim.2021.3120130 ◽ 2021 ◽ pp. 1-1

Author(s): Praful Hambarde ◽ Subrahmanyam Murala ◽ Abhinav Dhall

Keyword(s): Image Enhancement ◽ Depth Estimation ◽ Single Image ◽ Image Depth

An Algorithm of Single Image Depth Estimation Based on MRF Model

Wireless and Satellite Systems - Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering ◽ 10.1007/978-3-030-19156-6_19 ◽ 2019 ◽ pp. 198-207

Author(s): Lizhi Zhang ◽ Yongchao Chen ◽ Lianding Niu ◽ Zhijie Zhao ◽ Xiaowei Han

Keyword(s): Depth Estimation ◽ Single Image ◽ Image Depth

Single Image Depth Estimation With Normal Guided Scale Invariant Deep Convolutional Fields

IEEE Transactions on Circuits and Systems for Video Technology ◽ 10.1109/tcsvt.2017.2772892 ◽ 2019 ◽ Vol 29 (1) ◽ pp. 80-92 ◽ Cited By ~ 3

Author(s): Han Yan ◽ Xin Yu ◽ Yu Zhang ◽ Shunli Zhang ◽ Xiaolin Zhao ◽ ...

Keyword(s): Depth Estimation ◽ Single Image ◽ Scale Invariant ◽ Image Depth

Deep Learning-Based Monocular Depth Estimation Methods—A State-of-the-Art Review

Sensors ◽ 10.3390/s20082272 ◽ 2020 ◽ Vol 20 (8) ◽ pp. 2272 ◽ Cited By ~ 5

Author(s): Faisal Khan ◽ Saqib Salahuddin ◽ Hossein Javidnia

Keyword(s): Deep Learning ◽ State Of The Art ◽ Research Work ◽ Depth Estimation ◽ Autonomous Driving ◽ Estimation Methods ◽ Future Research ◽ Comprehensive Overview ◽ Ill Posed ◽ Monocular Depth

Monocular depth estimation from Red-Green-Blue (RGB) images is a well-studied ill-posed problem in computer vision which has been investigated intensively over the past decade using Deep Learning (DL) approaches. The recent approaches for monocular depth estimation mostly rely on Convolutional Neural Networks (CNN). Estimating depth from two-dimensional images plays an important role in various applications including scene reconstruction, 3D object-detection, robotics and autonomous driving. This survey provides a comprehensive overview of this research topic including the problem representation and a short description of traditional methods for depth estimation. Relevant datasets and 13 state-of-the-art deep learning-based approaches for monocular depth estimation are reviewed, evaluated and discussed. We conclude this paper with a perspective towards future research work requiring further investigation in monocular depth estimation challenges.
