BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation

Monocular depth estimation is a hot research topic in autonomous car driving. Deep convolution neural networks (DCNN) comprising encoder and decoder with transfer learning are exploited in the proposed work for monocular depth map estimation of two-dimensional images. Extracted CNN features from initial stages are later upsampled using a sequence of Bilinear UpSampling and convolution layers to reconstruct the depth map. The encoder forms the feature extraction part, and the decoder forms the image reconstruction part. EfficientNetB0, a new architecture is used with pretrained weights as encoder. It is a revolutionary architecture with smaller model parameters yet achieving higher efficiencies than the architectures of state-of-the-art, pretrained networks. EfficientNet-B0 is compared with two other pretrained networks, the DenseNet-121 and ResNet50 models. Each of these three models are used in encoding stage for features extraction followed by bilinear method of UpSampling in the decoder. The Monocular image is an ill-posed problem and is thus considered as a regression problem. So the metrics used in the proposed work are F1-score, Jaccard score and Mean Actual Error (MAE) etc., between the original and the reconstructed image. The results convey that EfficientNet-B0 outperforms in validation loss, F1-score and Jaccard score compared to DenseNet-121 and ResNet-50 models.

Download Full-text

CNN Based Monocular Depth Estimation

E3S Web of Conferences ◽

10.1051/e3sconf/202130901070 ◽

2021 ◽

Vol 309 ◽

pp. 01070

Author(s):

K. Swaraja ◽

K. Naga Siva Pavan ◽

S. Suryakanth Reddy ◽

K. Ajay ◽

P. Uday Kiran Reddy ◽

...

Keyword(s):

High Resolution ◽

Depth Map ◽

Depth Estimation ◽

Learning Approaches ◽

Depth Estimate ◽

High Performing ◽

Depth Measurement ◽

Ill Posed ◽

Monocular Depth ◽

Public Datasets

In several applications, such as scene interpretation and reconstruction, precise depth measurement from images is a significant challenge. Current depth estimate techniques frequently provide fuzzy, low-resolution estimates. With the use of transfer learning, this research executes a convolutional neural network for generating a high-resolution depth map from a single RGB image. With a typical encoder-decoder architecture, when initializing the encoder, we use features extracted from high-performing pre-trained networks, as well as augmentation and training procedures that lead to more accurate outcomes. We demonstrate how, even with a very basic decoder, our approach can provide complete high-resolution depth maps. A wide number of deep learning approaches have recently been presented, and they have showed significant promise in dealing with the classical ill-posed issue. The studies are carried out using KITTI and NYU Depth v2, two widely utilized public datasets. We also examine the errors created by various models in order to expose the shortcomings of present approaches which accomplishes viable performance on KITTI besides NYU Depth v2.

Download Full-text

Unsupervised Monocular Depth Estimation Based on Residual Neural Network of Coarse–Refined Feature Extractions for Drone

Electronics ◽

10.3390/electronics8101179 ◽

2019 ◽

Vol 8 (10) ◽

pp. 1179 ◽

Cited By ~ 1

Author(s):

Tao Huang ◽

Shuanfeng Zhao ◽

Longlong Geng ◽

Qian Xu

Keyword(s):

Neural Network ◽

Image Reconstruction ◽

Depth Map ◽

Ground Truth ◽

Depth Estimation ◽

Input Image ◽

Superior Performance ◽

Estimation Methods ◽

Depth Information ◽

Monocular Depth

To take full advantage of the information of images captured by drones and given that most existing monocular depth estimation methods based on supervised learning require vast quantities of corresponding ground truth depth data for training, the model of unsupervised monocular depth estimation based on residual neural network of coarse–refined feature extractions for drone is therefore proposed. As a virtual camera is introduced through a deep residual convolution neural network based on coarse–refined feature extractions inspired by the principle of binocular depth estimation, the unsupervised monocular depth estimation has become an image reconstruction problem. To improve the performance of our model for monocular depth estimation, the following innovations are proposed. First, the pyramid processing for input image is proposed to build the topological relationship between the resolution of input image and the depth of input image, which can improve the sensitivity of depth information from a single image and reduce the impact of input image resolution on depth estimation. Second, the residual neural network of coarse–refined feature extractions for corresponding image reconstruction is designed to improve the accuracy of feature extraction and solve the contradiction between the calculation time and the numbers of network layers. In addition, to predict high detail output depth maps, the long skip connections between corresponding layers in the neural network of coarse feature extractions and deconvolution neural network of refined feature extractions are designed. Third, the loss of corresponding image reconstruction based on the structural similarity index (SSIM), the loss of approximate disparity smoothness and the loss of depth map are united as a novel training loss to better train our model. The experimental results show that our model has superior performance on the KITTI dataset composed by corresponding left view and right view and Make3D dataset composed by image and corresponding ground truth depth map compared to the state-of-the-art monocular depth estimation methods and basically meet the requirements for depth information of images captured by drones when our model is trained on KITTI.

Download Full-text

Super-Resolution for Monocular Depth Estimation With Multi-Scale Sub-Pixel Convolutions and a Smoothness Constraint

IEEE Access ◽

10.1109/access.2019.2894651 ◽

2019 ◽

Vol 7 ◽

pp. 16323-16335 ◽

Cited By ~ 7

Author(s):

Shiyu Zhao ◽

Lin Zhang ◽

Ying Shen ◽

Shengjie Zhao ◽

Huijuan Zhang

Keyword(s):

Super Resolution ◽

Depth Estimation ◽

Multi Scale ◽

Smoothness Constraint ◽

Monocular Depth

Download Full-text

A Joint Learning-Based Method for Multi-view Depth Map Super Resolution

2013 2nd IAPR Asian Conference on Pattern Recognition ◽

10.1109/acpr.2013.89 ◽

2013 ◽

Cited By ~ 1

Author(s):

Jing Li ◽

Zhichao Lu ◽

Gang Zeng ◽

Rui Gan ◽

Long Wang ◽

...

Keyword(s):

Super Resolution ◽

Depth Map ◽

Joint Learning

Download Full-text

Monocular Depth Estimation using Transfer learning-An Overview

E3S Web of Conferences ◽

10.1051/e3sconf/202130901069 ◽

2021 ◽

Vol 309 ◽

pp. 01069

Author(s):

K. Swaraja ◽

V. Akshitha ◽

K. Pranav ◽

B. Vyshnavi ◽

V. Sai Akhil ◽

...

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

Deep Neural Networks ◽

Depth Estimation ◽

Depth Information ◽

Learning Approaches ◽

Learning Network ◽

Depth Maps ◽

Ill Posed ◽

Monocular Depth

Depth estimation is a computer vision technique that is critical for autonomous schemes for sensing their surroundings and predict their own condition. Traditional estimating approaches, such as structure from motion besides stereo vision similarity, rely on feature communications from several views to provide depth information. In the meantime, the depth maps anticipated are scarce. Gathering depth information via monocular depth estimation is an ill-posed issue, according to a substantial corpus of deep learning approaches recently suggested. Estimation of Monocular depth with deep learning has gotten a lot of interest in current years, thanks to the fast expansion of deep neural networks, and numerous strategies have been developed to solve this issue. In this study, we want to give a comprehensive assessment of the methodologies often used in the estimation of monocular depth. The purpose of this study is to look at recent advances in deep learning-based estimation of monocular depth. To begin, we'll go through the various depth estimation techniques and datasets for monocular depth estimation. A complete overview of multiple deep learning methods that use transfer learning Network designs, including several combinations of encoders and decoders, is offered. In addition, multiple deep learning-based monocular depth estimation approaches and models are classified. Finally, the use of transfer learning approaches to monocular depth estimation is illustrated.

Download Full-text