EANet: Depth Estimation Based on EPI of Light Field

2021 · Vol 2021 · pp. 1-10
Author(s): Yunzhang Du, Qian Zhang, Dingkang Hua, Jiaqi Hou, Bin Wang, et al.

The light field is an important way to record the spatial information of a target scene. The purpose of this paper is to obtain depth information by processing light field data and to provide a basis for intelligent medical treatment. We first design an attention module that extracts features from the light field images and concatenates them into a feature map from which an attention map is generated. The attention map is then integrated, in the form of weights, with the convolution layers of the neural network to increase the weight of the sub-aperture viewpoints that are most meaningful for depth estimation. Finally, the initial depth results are refined. The experimental results show that the MSE, PSNR, and SSIM of the depth maps obtained by this method improve by about 13%, 10 dB, and 4%, respectively, in favorable scenarios.
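
As a rough illustration of the attention mechanism described above, a minimal PyTorch sketch follows: per-view features from the sub-aperture images are pooled into descriptors, a small network predicts one weight per view, and the features are re-weighted before further convolution. The module name, layer sizes, and 9-view setup are illustrative assumptions, not the paper's code.

    # Sketch: per-view attention over light-field sub-aperture features.
    import torch
    import torch.nn as nn

    class ViewAttention(nn.Module):        # hypothetical name
        def __init__(self, n_views, channels):
            super().__init__()
            # Squeeze each view's feature map to one descriptor, then
            # predict one weight per sub-aperture view.
            self.pool = nn.AdaptiveAvgPool2d(1)
            self.fc = nn.Sequential(
                nn.Linear(n_views * channels, n_views),
                nn.Sigmoid(),
            )

        def forward(self, x):              # x: (B, V, C, H, W)
            b, v, c, h, w = x.shape
            desc = self.pool(x.view(b * v, c, h, w)).view(b, v * c)
            weights = self.fc(desc).view(b, v, 1, 1, 1)
            return x * weights             # re-weighted view features

    feats = torch.randn(2, 9, 16, 64, 64)  # e.g. 9 sub-aperture views
    out = ViewAttention(n_views=9, channels=16)(feats)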

2019 · Vol 224 · pp. 04005
Author(s): Nikolay Gapon, Roman Sizyakin, Marina Zhdanova, Oksana Balabaeva, Yigang Cen

This paper proposes a method for reconstructing a depth map obtained from a stereo image pair. The approach is based on a geometric model for patch synthesis. The image is first divided into blocks of different sizes: large blocks restore homogeneous areas, while small blocks restore details of the image structure. Lost pixels are recovered by copying pixel values from source regions according to a similarity criterion, with a trained neural network used to select the best-matching patch. Experimental results show that the proposed method outperforms other modern methods, in both subjective and objective measures, for depth map reconstruction.
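
A minimal sketch of the patch-copy step described above, assuming a plain sum-of-squared-differences similarity in place of the paper's trained-network ranking; the function and parameter names are illustrative.

    # Sketch: fill a partially known block by copying the best source patch.
    import numpy as np

    def fill_block(depth, mask, top, left, size, stride=4):
        """depth: 2-D map; mask: True where pixels are known."""
        target = depth[top:top+size, left:left+size]
        known = mask[top:top+size, left:left+size]
        best, best_cost = None, np.inf
        h, w = depth.shape
        for y in range(0, h - size, stride):
            for x in range(0, w - size, stride):
                if not mask[y:y+size, x:x+size].all():
                    continue                  # source must be fully known
                cand = depth[y:y+size, x:x+size]
                cost = ((cand - target)[known] ** 2).sum()  # SSD on known pixels
                if cost < best_cost:
                    best, best_cost = cand, cost
        if best is not None:
            target[~known] = best[~known]     # copy only the lost pixels
            known[:] = True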


Electronics · 2019 · Vol 8 (10) · pp. 1179
Author(s): Tao Huang, Shuanfeng Zhao, Longlong Geng, Qian Xu

To take full advantage of the information in images captured by drones, and given that most existing monocular depth estimation methods based on supervised learning require vast quantities of ground truth depth data for training, we propose an unsupervised monocular depth estimation model for drones based on a residual neural network with coarse-refined feature extraction. By introducing a virtual camera through a deep residual convolutional neural network inspired by the principle of binocular depth estimation, unsupervised monocular depth estimation becomes an image reconstruction problem. To improve the model's performance, three innovations are proposed. First, pyramid processing of the input image builds a topological relationship between the resolution of the input image and its depth, which improves sensitivity to depth information in a single image and reduces the impact of input resolution on depth estimation. Second, the residual network of coarse-refined feature extraction for image reconstruction improves the accuracy of feature extraction and resolves the trade-off between computation time and the number of network layers; in addition, to predict highly detailed output depth maps, long skip connections link corresponding layers of the coarse feature extraction network and the refined deconvolution network. Third, an image reconstruction loss based on the structural similarity index (SSIM), an approximate disparity smoothness loss, and a depth map loss are combined into a novel training loss. Experimental results show that our model outperforms state-of-the-art monocular depth estimation methods on the KITTI dataset (corresponding left and right views) and the Make3D dataset (images with corresponding ground truth depth maps), and, when trained on KITTI, largely meets the requirements for depth information in images captured by drones.
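
A hedged sketch of the combined training loss: a simplified single-scale SSIM reconstruction term, an edge-aware disparity smoothness term, and an L1 term standing in for the paper's depth-map loss. The weights (0.85/0.1/0.05) and the exact formulations are assumptions, not the paper's values.

    # Sketch: SSIM reconstruction + smoothness + depth terms (PyTorch).
    import torch
    import torch.nn.functional as F

    def ssim_loss(a, b):                   # simplified single-scale SSIM
        mu_a, mu_b = F.avg_pool2d(a, 3, 1), F.avg_pool2d(b, 3, 1)
        var_a = F.avg_pool2d(a * a, 3, 1) - mu_a ** 2
        var_b = F.avg_pool2d(b * b, 3, 1) - mu_b ** 2
        cov = F.avg_pool2d(a * b, 3, 1) - mu_a * mu_b
        c1, c2 = 0.01 ** 2, 0.03 ** 2
        ssim = ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / (
            (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2))
        return torch.clamp((1 - ssim) / 2, 0, 1).mean()

    def smoothness_loss(disp, img):        # edge-aware disparity smoothness
        dx = (disp[..., :, 1:] - disp[..., :, :-1]).abs()
        dy = (disp[..., 1:, :] - disp[..., :-1, :]).abs()
        wx = torch.exp(-(img[..., :, 1:] - img[..., :, :-1]).abs().mean(1, True))
        wy = torch.exp(-(img[..., 1:, :] - img[..., :-1, :]).abs().mean(1, True))
        return (dx * wx).mean() + (dy * wy).mean()

    def total_loss(recon, target, disp, depth_l, depth_r):
        return (0.85 * ssim_loss(recon, target)
                + 0.1 * smoothness_loss(disp, target)
                + 0.05 * F.l1_loss(depth_l, depth_r))  # stand-in depth term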


2019 · Vol 2019 · pp. 1-10
Author(s): Jianzhong Yuan, Wujie Zhou, Sijia Lv, Yuzhen Chen

In order to obtain the distances between surrounding objects and the vehicle in the traffic scene ahead, a monocular visual depth estimation method based on a depthwise separable convolutional neural network is proposed in this study. First, features containing shallow depth information are extracted from the RGB images using convolution layers and max-pooling layers, with subsampling operations also performed on these images. Next, features containing advanced depth information are extracted using a block based on an ensemble of convolution layers and a block based on depthwise separable convolution layers, and the outputs of the different blocks are combined. Finally, transposed convolution layers upsample the feature maps to the same size as the original RGB image. During upsampling, skip connections merge in the shallow depth features obtained from the depthwise separable convolution layers. The depthwise separable convolution layers provide more accurate depth features for monocular depth estimation while requiring less computation and fewer parameters at a similar (or slightly better) level of performance. Integrating multiple simple convolutions into a block both increases the overall depth of the network and enables more accurate extraction of advanced features, and combining the outputs of multiple blocks prevents the loss of features carrying important depth information. The test results show that the depthwise separable convolutional neural network outperforms the other monocular visual depth estimation methods; applying depthwise separable convolution layers is therefore a more effective and accurate approach to estimating visual depth.
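
For reference, the depthwise separable convolution the abstract builds on factors a standard convolution into a per-channel (depthwise) 3x3 convolution followed by a 1x1 pointwise convolution, which is where the parameter savings come from. A minimal PyTorch sketch:

    # Sketch: depthwise separable convolution block.
    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        def __init__(self, in_ch, out_ch):
            super().__init__()
            # One 3x3 filter per input channel (groups=in_ch) ...
            self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
            # ... then a 1x1 convolution to mix channels.
            self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

        def forward(self, x):
            return self.pointwise(self.depthwise(x))

    # Parameter count: 3*3*C + C*K versus 3*3*C*K for a standard conv;
    # e.g. C=128, K=256 gives 33,920 vs. 294,912 weights (ignoring biases).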


Sensors · 2019 · Vol 19 (3) · pp. 500
Author(s): Luca Palmieri, Gabriele Scrofani, Nicolò Incardona, Genaro Saavedra, Manuel Martínez-Corral, et al.

Light field technologies have risen in prominence in recent years, and microscopy is a field where this technology has had a deep impact. The possibility of capturing spatial and angular information at the same time and in a single shot brings several advantages and enables new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit the sharp colors of natural scenes. Under such conditions, standard approaches produce noisy depth maps. In this work, a robust approach is proposed that produces accurate depth maps by exploiting the information recorded in the light field, in particular images produced with a Fourier integral microscope. The approach has three main parts. First, it creates two cost volumes using different focal cues, namely correspondence and defocus. Second, it applies filtering methods that exploit multi-scale and superpixel cost aggregation to reduce noise and improve accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization.
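
A minimal sketch of the final stage under stated assumptions: the two cost volumes are normalized, merged with illustrative weights, and a depth label is read out per pixel. A simple winner-take-all stands in for the paper's multi-label optimization.

    # Sketch: merge correspondence and defocus cost volumes, extract labels.
    import numpy as np

    def merge_and_extract(cost_corr, cost_defocus, w_corr=0.6, w_defocus=0.4):
        """cost_*: (labels, H, W) volumes, lower = better; weights assumed."""
        # Normalize each cue so the two volumes are comparable.
        norm = lambda c: (c - c.min()) / (c.max() - c.min() + 1e-8)
        merged = w_corr * norm(cost_corr) + w_defocus * norm(cost_defocus)
        return merged.argmin(axis=0)       # depth label per pixel

    labels = merge_and_extract(np.random.rand(64, 128, 128),
                               np.random.rand(64, 128, 128))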


Author(s): Agota Faluvegi, Quentin Bolsee, Sergiu Nedevschi, Vasile-Teodor Dadarlat, Adrian Munteanu

2020 · Vol 9 (2) · pp. 74
Author(s): Eric Hsueh-Chan Lu, Jing-Mei Ciou

With the rapid development of surveying and spatial information technologies, more and more attention has been given to positioning. In outdoor environments, people can easily obtain positioning services through global navigation satellite systems (GNSS). In indoor environments, the GNSS signal is often lost, while other positioning approaches, such as dead reckoning and wireless signals, suffer from accumulated errors and signal interference. This research therefore uses images to realize a positioning service. The main idea is to build a model linking images of an indoor space to their coordinate information and to determine position by matching image features. Based on the PoseNet architecture, images at various sizes are fed into a 23-layer convolutional neural network trained end-to-end for location identification, regressing the three-dimensional position vector of the camera. The experimental data are taken from an underground parking lot and the Palace Museum. Preliminary results show that the proposed method improves indoor positioning accuracy by about 20% to 30%. The paper also examines alternative architectures, field sizes, camera parameters, and error corrections for this neural network system; preliminary results show that the proposed angle error correction improves positioning by about 20%.
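
A hedged sketch of the regression setup described above: a CNN backbone whose final layer regresses the three-dimensional camera position, trained with a metric position error. The ResNet-18 backbone and the loss below are illustrative stand-ins, not the paper's exact 23-layer network.

    # Sketch: PoseNet-style position regression (PyTorch/torchvision).
    import torch
    import torch.nn as nn
    import torchvision.models as models

    backbone = models.resnet18(weights=None)
    backbone.fc = nn.Linear(backbone.fc.in_features, 3)   # regress (x, y, z)

    def position_loss(pred_xyz, true_xyz):
        # Mean Euclidean distance between predicted and true positions.
        return torch.norm(pred_xyz - true_xyz, dim=1).mean()

    img = torch.randn(4, 3, 224, 224)
    loss = position_loss(backbone(img), torch.randn(4, 3))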


Author(s): N. Zeller, F. Quint, U. Stilla

In this article we present a new method for visual odometry based on a focused plenoptic camera. The method fuses the depth data gained by a monocular Simultaneous Localization and Mapping (SLAM) algorithm with that received from the focused plenoptic camera. Our algorithm uses the depth data and the totally focused images supplied by the plenoptic camera to run a real-time semi-dense direct SLAM algorithm. This combined approach overcomes the scale ambiguity of a monocular SLAM system. Furthermore, the additional light-field information greatly improves the tracking capabilities of the algorithm, making visual odometry possible even for narrow field of view (FOV) cameras. Tracking is not the only component that profits from the additional light-field information: by accumulating depth information over multiple tracked images, the depth accuracy of the focused plenoptic camera itself can be greatly improved. This novel approach reduces the depth error by one order of magnitude compared to that of a single light-field image.
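
One standard way to realize the depth accumulation mentioned above is a per-pixel, variance-weighted (Kalman-style) fusion of inverse-depth hypotheses from successive tracked images; the sketch below is an assumption about the mechanism, not the authors' exact update rule.

    # Sketch: per-pixel fusion of inverse-depth hypotheses (NumPy).
    import numpy as np

    def fuse(inv_depth, var, new_inv_depth, new_var):
        """Variance-weighted fusion of two inverse-depth estimates."""
        fused = (var * new_inv_depth + new_var * inv_depth) / (var + new_var)
        fused_var = (var * new_var) / (var + new_var)  # uncertainty shrinks
        return fused, fused_var

    d, v = np.full((4, 4), 0.5), np.full((4, 4), 0.04)
    for _ in range(10):                    # each new tracked image
        d, v = fuse(d, v, d + np.random.normal(0, 0.02, d.shape), 0.04)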


2021 · Vol 12
Author(s): Caroline Pinte, Mathis Fleury, Pierre Maurel

The simultaneous acquisition of electroencephalographic (EEG) signals and functional magnetic resonance images (fMRI) aims to measure brain activity with good spatial and temporal resolution. This bimodal neuroimaging can bring complementary and highly relevant information in many cases, in particular for epilepsy: it has been shown to facilitate the localization of epileptic networks. Regarding the EEG, source localization requires solving a complex inverse problem that depends on several parameters, one of the most important being the positions of the EEG electrodes on the scalp. These positions are often roughly estimated using fiducial points. In simultaneous EEG-fMRI acquisitions, specific MRI sequences can provide valuable spatial information. In this work, we propose a new fully automatic method based on neural networks to segment an ultra-short echo-time MR volume in order to retrieve the coordinates and labels of the EEG electrodes. It consists of two steps: segmentation of the images by a neural network, followed by registration of an EEG template onto the obtained detections. We trained the neural network on 37 MR volumes and tested our method on 23 new volumes. The results show an average detection accuracy of 99.7%, an average position error of 2.24 mm, and 100% accuracy in labeling.
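
A minimal sketch of the second step under stated assumptions: a rigid (Kabsch) alignment of the electrode template onto the detected coordinates from matched point pairs. The paper's registration details are not specified here, so this is illustrative only.

    # Sketch: rigid alignment of an EEG template to detected electrodes.
    import numpy as np

    def kabsch(template, detected):
        """Rigid transform (R, t) mapping template points onto detections."""
        ct, cd = template.mean(0), detected.mean(0)
        H = (template - ct).T @ (detected - cd)
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
        R = Vt.T @ np.diag([1, 1, d]) @ U.T
        return R, cd - R @ ct

    tmpl = np.random.rand(64, 3)               # e.g. 64-electrode template
    theta = 0.3                                 # synthetic test rotation
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
    R, t = kabsch(tmpl, tmpl @ Rz.T + 0.1)      # recovers Rz and the shift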


2021 · Vol 8 (3) · pp. 15-27
Author(s): Mohamed N. Sweilam, Nikolay Tolstokulakov

Depth estimation has made great progress in the last few years owing to its applications in robotics and computer vision. Various methods have been implemented and enhanced to estimate depth without flicker and missing holes. Despite this progress, it remains one of the main challenges for researchers, especially for video applications, where more complex neural networks affect the run time. Using monocular video as input for depth estimation is an attractive idea, particularly for hand-held devices such as mobile phones, which are very popular for capturing pictures and videos but have a limited amount of RAM. In this work, we focus on enhancing an existing consistent depth estimation approach for monocular videos so that it uses less RAM and fewer parameters without a significant reduction in the quality of the depth estimation.


Sensors · 2019 · Vol 19 (20) · pp. 4434
Author(s): Sangwon Kim, Jaeyeal Nam, Byoungchul Ko

Depth estimation is a crucial and fundamental problem in computer vision. Conventional methods reconstruct scenes using feature points extracted from multiple images; however, these approaches require multiple images and thus are not easily applied in various real-time applications. Moreover, the special equipment required by hardware-based approaches using 3D sensors is expensive. Therefore, software-based methods that estimate depth from a single image using machine learning or deep learning are emerging as alternatives. In this paper, we propose an algorithm that generates a depth map in real time from a single image using an optimized lightweight efficient neural network (L-ENet) instead of physical equipment such as an infrared sensor or multi-view camera. Because depth values are continuous and can produce locally ambiguous results, pixel-wise prediction with ordinal depth-range classification is applied. In addition, various convolution techniques are applied to extract a dense feature map, and the number of parameters is greatly reduced by shrinking the network layers. The proposed L-ENet generates an accurate depth map from a single image quickly, producing depth values close to the ground truth with small errors. Experiments confirmed that L-ENet achieves significantly improved performance over state-of-the-art single-image depth estimation algorithms.
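
A hedged sketch of the ordinal depth-range classification mentioned above: continuous depth is discretized into ordered bins, the network scores, per pixel, whether depth exceeds each threshold, and the count of exceeded thresholds indexes a bin center. The spacing-increasing discretization and decoding follow common practice (e.g., DORN-style), not necessarily L-ENet's exact scheme.

    # Sketch: ordinal depth-range classification decoding (PyTorch).
    import torch

    def si_thresholds(d_min, d_max, k):    # spacing-increasing bin edges
        i = torch.arange(k + 1, dtype=torch.float32)
        return d_min * (d_max / d_min) ** (i / k)

    def decode(logits, thresholds):
        """logits: (B, K, H, W); channel k scores P(depth > t_k)."""
        exceeds = (logits.sigmoid() > 0.5).sum(dim=1)        # ordinal count
        centers = 0.5 * (thresholds[:-1] + thresholds[1:])
        return centers[exceeds.clamp(max=len(centers) - 1)]  # depth per pixel

    t = si_thresholds(1.0, 80.0, k=64)     # e.g. a KITTI-like 1-80 m range
    depth = decode(torch.randn(2, 64, 32, 32), t)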

