Monocular Depth Estimation Based on Multi-Scale Depth Map Fusion

Pyramid architecture is a useful strategy to fuse multi-scale features in deep monocular depth estimation approaches. However, most pyramid networks fuse features only within the adjacent stages in a pyramid structure. To take full advantage of the pyramid structure, inspired by the success of DenseNet, this paper presents DCPNet, a densely connected pyramid network that fuses multi-scale features from multiple stages of the pyramid structure. DCPNet not only performs feature fusion between the adjacent stages, but also non-adjacent stages. To fuse these features, we design a simple and effective dense connection module (DCM). In addition, we offer a new consideration of the common upscale operation in our approach. We believe DCPNet offers a more efficient way to fuse features from multiple scales in a pyramid-like network. We perform extensive experiments using both outdoor and indoor benchmark datasets (i.e., the KITTI and the NYU Depth V2 datasets) and DCPNet achieves the state-of-the-art results.

Download Full-text

Monocular Depth Estimation Based on Multi-Scale Graph Convolution Networks

IEEE Access ◽

10.1109/access.2019.2961606 ◽

2020 ◽

Vol 8 ◽

pp. 997-1009

Author(s):

Junwei Fu ◽

Jun Liang ◽

Ziyang Wang

Keyword(s):

Depth Estimation ◽

Multi Scale ◽

Monocular Depth

Download Full-text

Robust Depth Estimation for Light Field Microscopy

Sensors ◽

10.3390/s19030500 ◽

2019 ◽

Vol 19 (3) ◽

pp. 500 ◽

Cited By ~ 5

Author(s):

Luca Palmieri ◽

Gabriele Scrofani ◽

Nicolò Incardona ◽

Genaro Saavedra ◽

Manuel Martínez-Corral ◽

...

Keyword(s):

Light Field ◽

Three Dimensional ◽

Depth Map ◽

Depth Estimation ◽

Fourier Integral ◽

Single Shot ◽

Natural Scene ◽

Multi Scale ◽

Depth Maps ◽

New Applications

Light field technologies have seen a rise in recent years and microscopy is a field where such technology has had a deep impact. The possibility to provide spatial and angular information at the same time and in a single shot brings several advantages and allows for new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most of them cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit sharp colors like natural scene. Due to such conditions, standard approaches result in noisy depth maps. In this work, a robust approach is proposed where accurate depth maps can be produced exploiting the information recorded in the light field, in particular, images produced with Fourier integral Microscope. The proposed approach can be divided into three main parts. Initially, it creates two cost volumes using different focal cues, namely correspondences and defocus. Secondly, it applies filtering methods that exploit multi-scale and super-pixels cost aggregation to reduce noise and enhance the accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization.

Download Full-text

Structure-Aware Residual Pyramid Network for Monocular Depth Estimation

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/98 ◽

2019 ◽

Cited By ~ 4

Author(s):

Xiaotian Chen ◽

Xuejin Chen ◽

Zheng-Jun Zha

Keyword(s):

Local Structure ◽

Feature Fusion ◽

Depth Estimation ◽

Image Features ◽

Underlying Structure ◽

Complex Scene ◽

Multi Scale ◽

Depth Prediction ◽

Scale Structures ◽

Monocular Depth

Monocular depth estimation is an essential task for scene understanding. The underlying structure of objects and stuff in a complex scene is critical to recovering accurate and visually-pleasing depth maps. Global structure conveys scene layouts, while local structure reflects shape details. Recently developed approaches based on convolutional neural networks (CNNs) significantly improve the performance of depth estimation. However, few of them take into account multi-scale structures in complex scenes. In this paper, we propose a Structure-Aware Residual Pyramid Network (SARPN) to exploit multi-scale structures for accurate depth prediction. We propose a Residual Pyramid Decoder (RPD) which expresses global scene structure in upper levels to represent layouts, and local structure in lower levels to present shape details. At each level, we propose Residual Refinement Modules (RRM) that predict residual maps to progressively add finer structures on the coarser structure predicted at the upper level. In order to fully exploit multi-scale image features, an Adaptive Dense Feature Fusion (ADFF) module, which adaptively fuses effective features from all scales for inferring structures of each scale, is introduced. Experiment results on the challenging NYU-Depth v2 dataset demonstrate that our proposed approach achieves state-of-the-art performance in both qualitative and quantitative evaluation. The code is available at https://github.com/Xt-Chen/SARPN.

Download Full-text

BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation

10.1145/3474085.3475373 ◽

2021 ◽

Author(s):

Qi Tang ◽

Runmin Cong ◽

Ronghui Sheng ◽

Lingzhi He ◽

Dan Zhang ◽

...

Keyword(s):

Super Resolution ◽

Depth Map ◽

Depth Estimation ◽

Joint Learning ◽

Learning Network ◽

Monocular Depth

Download Full-text

The Constraints between Edge Depth and Uncertainty for Monocular Depth Estimation

Electronics ◽

10.3390/electronics10243153 ◽

2021 ◽

Vol 10 (24) ◽

pp. 3153

Author(s):

Shouying Wu ◽

Wei Li ◽

Binbin Liang ◽

Guoxin Huang

Keyword(s):

Error Rate ◽

Positive Impact ◽

Depth Estimation ◽

Multi Scale ◽

Teacher Networks ◽

Depth Images ◽

Teacher Student ◽

Object Edge ◽

Monocular Depth ◽

The Impact

The self-supervised monocular depth estimation paradigm has become an important branch of computer vision depth-estimation tasks. However, the depth estimation problem arising from object edge depth pulling or occlusion is still unsolved. The grayscale discontinuity of object edges leads to a relatively high depth uncertainty of pixels in these regions. We improve the geometric edge prediction results by taking uncertainty into account in the depth-estimation task. To this end, we explore how uncertainty affects this task and propose a new self-supervised monocular depth estimation technique based on multi-scale uncertainty. In addition, we introduce a teacher–student architecture in models and investigate the impact of different teacher networks on the depth and uncertainty results. We evaluate the performance of our paradigm in detail on the standard KITTI dataset. The experimental results show that the accuracy of our method increased from 87.7% to 88.2%, the AbsRel error rate decreased from 0.115 to 0.11, the SqRel error rate decreased from 0.903 to 0.822, and the RMSE error rate decreased from 4.863 to 4.686 compared with the benchmark Monodepth2. Our approach has a positive impact on the problem of texture replication or inaccurate object boundaries, producing sharper and smoother depth images.

Download Full-text

Joint Attention Mechanisms for Monocular Depth Estimation With Multi-Scale Convolutions and Adaptive Weight Adjustment

IEEE Access ◽

10.1109/access.2020.3030097 ◽

2020 ◽

Vol 8 ◽

pp. 184437-184450

Author(s):

Peng Liu ◽

Zonghua Zhang ◽

Zhaozong Meng ◽

Nan Gao

Keyword(s):

Joint Attention ◽

Depth Estimation ◽

Multi Scale ◽

Adaptive Weight ◽

Monocular Depth

Download Full-text

Deep Multi-Scale Architectures for Monocular Depth Estimation

2018 25th IEEE International Conference on Image Processing (ICIP) ◽

10.1109/icip.2018.8451408 ◽

2018 ◽

Cited By ~ 10

Author(s):

M. Moukari ◽

S. Picard ◽

L. Simon ◽

F. Jurie

Keyword(s):

Depth Estimation ◽

Multi Scale ◽

Monocular Depth

Download Full-text

Monocular Depth Estimation With Multi-Scale Feature Fusion

IEEE Signal Processing Letters ◽

10.1109/lsp.2021.3067498 ◽

2021 ◽

Vol 28 ◽

pp. 678-682

Author(s):

Xianfa Xu ◽

Zhe Chen ◽

Fuliang Yin

Keyword(s):

Feature Fusion ◽

Depth Estimation ◽

Scale Feature ◽

Multi Scale ◽

Monocular Depth

Download Full-text

EfficientNet-B0 Based Monocular Dense-Depth Map Estimation

Traitement du signal ◽

10.18280/ts.380524 ◽

2021 ◽

Vol 38 (5) ◽

pp. 1485-1493

Author(s):

Yasasvy Tadepalli ◽

Meenakshi Kollati ◽

Swaraja Kuraparthi ◽

Padmavathi Kora

Keyword(s):

Depth Map ◽

Depth Estimation ◽

Model Parameters ◽

Map Estimation ◽

Bilinear Method ◽

Regression Problem ◽

Actual Error ◽

Ill Posed ◽

Monocular Image ◽

Monocular Depth

Monocular depth estimation is a hot research topic in autonomous car driving. Deep convolution neural networks (DCNN) comprising encoder and decoder with transfer learning are exploited in the proposed work for monocular depth map estimation of two-dimensional images. Extracted CNN features from initial stages are later upsampled using a sequence of Bilinear UpSampling and convolution layers to reconstruct the depth map. The encoder forms the feature extraction part, and the decoder forms the image reconstruction part. EfficientNetB0, a new architecture is used with pretrained weights as encoder. It is a revolutionary architecture with smaller model parameters yet achieving higher efficiencies than the architectures of state-of-the-art, pretrained networks. EfficientNet-B0 is compared with two other pretrained networks, the DenseNet-121 and ResNet50 models. Each of these three models are used in encoding stage for features extraction followed by bilinear method of UpSampling in the decoder. The Monocular image is an ill-posed problem and is thus considered as a regression problem. So the metrics used in the proposed work are F1-score, Jaccard score and Mean Actual Error (MAE) etc., between the original and the reconstructed image. The results convey that EfficientNet-B0 outperforms in validation loss, F1-score and Jaccard score compared to DenseNet-121 and ResNet-50 models.

Download Full-text