Multi-Scale Spatio-Temporal Feature Extraction and Depth Estimation from Sequences by Ordinal Classification

Sensors ◽  
2020 ◽  
Vol 20 (7) ◽  
pp. 1979
Author(s):  
Yang Liu

Depth estimation is a key problem in 3D computer vision and has a wide variety of applications. In this paper we explore whether a deep learning network can predict depth maps accurately by learning multi-scale spatio-temporal features from sequences and by recasting depth estimation from a regression task to an ordinal classification task. We design an encoder-decoder network with several multi-scale strategies to improve its performance and extract spatio-temporal features with ConvLSTM. Our experiments show that the proposed method improves error metrics by almost 10% and accuracy metrics by up to 2%. The results also show that extracting spatio-temporal features can dramatically improve performance on the depth estimation task. We plan to extend this work to a self-supervised setting to remove the dependence on large-scale labeled data.
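The recasting of depth regression as ordinal classification can be sketched as follows: discretize the depth range into ordered bins (log-spaced, so near depths get finer resolution), train the network to answer K binary "is the depth beyond threshold k?" questions per pixel, and decode by counting positive answers. All function names, the bin count, and the depth range below are illustrative assumptions, not details from the paper:

```python
import numpy as np

def make_bin_edges(d_min=0.5, d_max=80.0, k=10):
    # Log-spaced bin edges: finer depth resolution for near objects.
    return np.exp(np.linspace(np.log(d_min), np.log(d_max), k + 1))

def encode_ordinal(depth, edges):
    # Ordinal targets: target[k] = 1 if depth exceeds interior edge k.
    return (depth[..., None] > edges[1:-1]).astype(np.float32)

def decode_ordinal(probs, edges):
    # Decoded depth = midpoint of the bin selected by counting how many
    # thresholds the network believes are exceeded.
    idx = (probs > 0.5).sum(axis=-1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[idx]

edges = make_bin_edges()
depth = np.array([1.0, 10.0, 60.0])
targets = encode_ordinal(depth, edges)
recovered = decode_ordinal(targets, edges)  # quantized depth values
```

Decoding by counting exceeded thresholds (rather than an argmax over independent classes) is what preserves the ordinal structure of depth.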

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Xin Yang ◽  
Qingling Chang ◽  
Xinglin Liu ◽  
Siyuan He ◽  
Yan Cui

Sensors ◽  
2019 ◽  
Vol 19 (3) ◽  
pp. 500 ◽  
Author(s):  
Luca Palmieri ◽  
Gabriele Scrofani ◽  
Nicolò Incardona ◽  
Genaro Saavedra ◽  
Manuel Martínez-Corral ◽  
...  

Light field technologies have seen a rise in recent years, and microscopy is a field where such technology has had a deep impact. The possibility to provide spatial and angular information at the same time and in a single shot brings several advantages and allows for new applications. A common goal in these applications is the calculation of a depth map to reconstruct the three-dimensional geometry of the scene. Many approaches are applicable, but most of them cannot achieve high accuracy because of the nature of such images: biological samples are usually poor in features and do not exhibit sharp colors like natural scenes. Under such conditions, standard approaches produce noisy depth maps. In this work, a robust approach is proposed in which accurate depth maps can be produced by exploiting the information recorded in the light field, in particular in images produced with a Fourier integral Microscope. The proposed approach can be divided into three main parts. First, it creates two cost volumes using different focal cues, namely correspondences and defocus. Second, it applies filtering methods that exploit multi-scale and super-pixel cost aggregation to reduce noise and enhance accuracy. Finally, it merges the two cost volumes and extracts a depth map through multi-label optimization.
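The cost-volume pipeline above can be illustrated with a minimal sketch: two per-label cost volumes are spatially aggregated (a plain box filter stands in for the paper's multi-scale/super-pixel aggregation), merged with a blending weight, and reduced to a depth map. Winner-take-all replaces the paper's multi-label optimization here; all names and parameters are illustrative:

```python
import numpy as np

def box_filter(vol, r=1):
    # Aggregate costs in a (2r+1)x(2r+1) window per depth label —
    # a simple stand-in for multi-scale / super-pixel aggregation.
    k = 2 * r + 1
    pad = np.pad(vol, ((0, 0), (r, r), (r, r)), mode='edge')
    out = np.zeros_like(vol)
    for dy in range(k):
        for dx in range(k):
            out += pad[:, dy:dy + vol.shape[1], dx:dx + vol.shape[2]]
    return out / (k * k)

def merge_and_extract(cost_corr, cost_defocus, alpha=0.5):
    # Blend the correspondence and defocus cues, then take the
    # winner-take-all label per pixel (the paper instead uses
    # multi-label optimization at this step).
    merged = alpha * box_filter(cost_corr) + (1 - alpha) * box_filter(cost_defocus)
    return merged.argmin(axis=0)
```

The aggregation step is what suppresses the per-pixel noise that feature-poor biological samples would otherwise produce.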


Cells ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 1533 ◽  
Author(s):  
Carsten Beta ◽  
Nir S. Gov ◽  
Arik Yochelis

During the last decade, intracellular actin waves have attracted much attention due to their essential role in various cellular functions, ranging from motility to cytokinesis. Experimental methods have advanced significantly and can capture the dynamics of actin waves over a large range of spatio-temporal scales. However, the corresponding coarse-grained theory mostly avoids the full complexity of this multi-scale phenomenon. In this perspective, we focus on a minimal continuum model of activator–inhibitor type and highlight the qualitative role of mass conservation, which is typically overlooked. Specifically, our interest is to connect the mathematical mechanisms of pattern formation in the presence of a large-scale mode, arising from mass conservation, with the distinct behaviors of actin waves.
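The mass-conserving activator–inhibitor structure discussed above can be written schematically (a generic form for illustration, not the paper's specific model):

```latex
\begin{align}
\partial_t u &= D_u \nabla^2 u + f(u, v),\\
\partial_t v &= D_v \nabla^2 v - f(u, v),
\end{align}
```

Because the reaction terms cancel, with no-flux boundary conditions the total mass $\int_\Omega (u + v)\,dx$ is conserved; this conserved quantity is the large-scale mode that modifies the pattern-forming bifurcations relative to standard (non-conserving) activator–inhibitor systems.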


Author(s):  
Ashwin Kannan ◽  
S. R. Chakravarthy

Incompressible large eddy simulations coupled with acoustics are performed to predict combustion noise and instability in a partially premixed backward-facing step combustor. The computational analysis adopts a simultaneous multi-scale spatio-temporal framework for flow and acoustics, in which the flow varies on a shorter length scale and a longer time scale while the acoustics vary on a longer length scale and a shorter time scale. This introduces flow dilatation and the acoustic Reynolds stress (ARS) as external source terms in the acoustic energy and flow momentum equations, respectively. Numerical results are presented for three cases at a particular Reynolds number: an acoustically coupled long-duct case, its uncoupled counterpart with no acoustic feedback, and a coupled short-duct case with a shorter combustor length. The coupled cases contrast the strength of the acoustic feedback, which is strongest in the short-duct case, and both are compared with the acoustically uncoupled LES common to them. It is found that combustion occurs predominantly in the large-scale vortical structures in the coupled long-duct case due to enhanced mixing between the reactants brought about by the strong acoustic feedback (ARS). Thus, the present work not only distinguishes between the flow and acoustic processes, but also handles both combustion noise and instability within the same framework.


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 690
Author(s):  
Wenbo Sun ◽  
Zhi Gao ◽  
Jinqiang Cui ◽  
Bharath Ramesh ◽  
Bin Zhang ◽  
...  

Semantic segmentation is one of the most widely studied problems in the computer vision community and contributes to a variety of applications. Learning-based approaches, such as Convolutional Neural Networks (CNNs), have made great progress on this problem. While rich context information in the input images can be learned from multi-scale receptive fields by convolutions with deep layers, traditional CNNs have great difficulty learning the geometrical relationships and distribution of objects in an RGB image due to the lack of depth information, which may lead to inferior segmentation quality. To solve this problem, we propose a method that improves segmentation quality with depth estimation on RGB images. Specifically, we estimate depth information for RGB images via a depth estimation network and then feed the depth map into a CNN, where it guides the semantic segmentation. Furthermore, in order to parse the depth map and RGB images simultaneously, we construct a multi-branch encoder–decoder network and fuse the RGB and depth features step by step. Extensive experimental evaluation on four baseline networks demonstrates that our proposed method enhances segmentation quality considerably and obtains better performance than other segmentation networks.
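The step-by-step fusion of a depth branch into an RGB branch can be sketched in miniature: at every encoder stage the depth features are merged into the RGB stream before both are downsampled. Average pooling stands in for real encoder layers, and element-wise addition for the paper's (unspecified) fusion operator; everything here is an illustrative assumption:

```python
import numpy as np

def downsample(x):
    # 2x2 average pooling as a stand-in for one encoder stage.
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def fused_encoder(rgb_feat, depth_feat, stages=3):
    # Multi-branch encoder sketch: at each stage, depth features are
    # added into the RGB stream (the fusion step), the fused map is
    # kept for the decoder's skip connections, then both branches
    # are downsampled for the next stage.
    fused = []
    r, d = rgb_feat, depth_feat
    for _ in range(stages):
        r = r + d            # fuse depth cues into the RGB stream
        fused.append(r)
        r, d = downsample(r), downsample(d)
    return fused
```

Fusing at every stage, rather than only at the input, lets the decoder see depth-informed features at all resolutions.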


2021 ◽  
Vol 15 ◽  
Author(s):  
Davide Borra ◽  
Silvia Fantozzi ◽  
Elisa Magosso

Convolutional neural networks (CNNs), which automatically learn features from raw data to approximate functions, are being increasingly applied to the end-to-end analysis of electroencephalographic (EEG) signals, especially for decoding brain states in brain-computer interfaces (BCIs). Nevertheless, CNNs introduce a large number of trainable parameters, may require long training times, and lack interpretability of learned features. The aim of this study is to propose a CNN design for P300 decoding, with emphasis on a lightweight design that still guarantees high performance, on the effects of different training strategies, and on the use of post hoc techniques to explain network decisions. The proposed design, named MS-EEGNet, learned temporal features at two different timescales (i.e., multi-scale, MS) in an efficient and optimized (in terms of trainable parameters) way, and was validated on three P300 datasets. The CNN was trained using different strategies (within-participant and within-session, within-participant and cross-session, leave-one-subject-out, transfer learning) and was compared with several state-of-the-art (SOA) algorithms. Furthermore, variants of the baseline MS-EEGNet were analyzed to evaluate the impact of different hyper-parameters on performance. Lastly, saliency maps were used to derive representations of the relevant spatio-temporal features that drove CNN decisions. MS-EEGNet was the lightest CNN compared with the tested SOA CNNs, despite its multiple timescales, and significantly outperformed the SOA algorithms. Post hoc hyper-parameter analysis confirmed the benefits of the innovative aspects of MS-EEGNet. Furthermore, MS-EEGNet benefited from transfer learning, especially with a low number of training examples, suggesting that the proposed approach could be used in BCIs to accurately decode the P300 event while reducing calibration times. Representations derived from the saliency maps matched the P300 spatio-temporal distribution, further validating the proposed decoding approach. This study, by specifically addressing the aspects of lightweight design, transfer learning, and interpretability, can contribute to advancing the development of deep learning algorithms for P300-based BCIs.


2018 ◽  
Author(s):  
Andreas Trier Poulsen ◽  
Andreas Pedroni ◽  
Nicolas Langer ◽  
Lars Kai Hansen

Abstract. EEG microstate analysis offers a sparse characterisation of the spatio-temporal features of large-scale brain network activity. However, although the concept of microstates is straightforward and offers various quantifications of the EEG signal with a relatively clear neurophysiological interpretation, a few important aspects of the currently applied methods are not readily comprehensible. Here we aim to increase the transparency of these methods and to facilitate widespread application and reproducibility of EEG microstate analysis by introducing a new EEGlab toolbox for Matlab. EEGlab and the Microstate toolbox are open source, allowing the user to keep track of all details in every analysis step. The toolbox is specifically designed to facilitate the development of new methods. While the toolbox can be controlled with a graphical user interface (GUI), making it easier for newcomers to take their first steps in exploring the possibilities of microstate analysis, the Matlab framework allows advanced users to create scripts that automate the analysis of multiple subjects, avoiding tedious repetition of steps for every subject. This manuscript provides an overview of the most commonly applied microstate methods as well as a tutorial consisting of a comprehensive walk-through of the analysis of a small, publicly available dataset.
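The core of most microstate pipelines is a polarity-invariant (modified) k-means: each topography is assigned to the prototype with the highest absolute spatial correlation, so a map and its sign-flipped version land in the same state. The sketch below is a simplified stand-in for the toolbox's Matlab implementation, written in Python with illustrative initialization (first k maps) rather than the random restarts real toolboxes use:

```python
import numpy as np

def microstate_kmeans(maps, k=4, n_iter=50):
    # maps: (n_samples, n_channels) EEG topographies.
    # Normalize each topography, then alternate between
    # polarity-invariant assignment and prototype updates.
    x = maps / np.linalg.norm(maps, axis=1, keepdims=True)
    protos = x[:k].copy()  # simplistic init: first k maps
    for _ in range(n_iter):
        corr = x @ protos.T                   # spatial correlation
        labels = np.abs(corr).argmax(axis=1)  # ignore polarity
        for j in range(k):
            members = x[labels == j]
            if len(members) == 0:
                continue
            # Prototype = first principal component of member maps:
            # the direction explaining most variance regardless of sign.
            _, _, vt = np.linalg.svd(members, full_matrices=False)
            protos[j] = vt[0]
    return protos, labels
```

Using the first principal component (rather than the mean) for the prototype update is what keeps the procedure consistent with the sign-ignoring assignment step.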


Author(s):  
WEI JIANG ◽  
SHIGEKI SUGIMOTO ◽  
MASATOSHI OKUTOMI

In this paper, we present a novel approach to imaging a panoramic (360°) environment and computing its dense depth map. Our approach adopts a multi-baseline stereo strategy using a set of multi-perspective panoramas in which large baseline lengths are available. We design two image acquisition rigs for capturing such multi-perspective panoramas. The first is composed of two parallel stereo cameras. By rotating the rig about a vertical axis, we generate four multi-perspective panoramas by resampling the regular perspective images captured by the stereo cameras. A depth map is then estimated from the four multi-perspective panoramas and an original perspective image using a multi-baseline matching technique with different types of epipolar constraints. The second rig is composed of a single camera and two mirrors. By rotating the rig, we acquire a spatio-temporal volume made up of the sequential images captured by the camera. We then estimate a depth map by extracting trajectories from the spatio-temporal volume using a multi-baseline stereo technique that accounts for occlusions. Both rotating rigs can be regarded as a single rotating camera with a very large field of view (FOV), which offers a large baseline length for depth estimation. In addition, compared with a previous approach using two multi-perspective panoramas from a single rotating camera, our approach reduces matching errors due to image noise, repeated patterns, and occlusions through multi-baseline stereo techniques. Experimental results using both synthetic and real images show that our approach produces high-quality panoramic 3D reconstruction.
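Why multi-baseline matching suppresses errors from noise and repeated patterns can be shown in a minimal 1D sketch: for each candidate inverse depth, the expected disparity in each auxiliary view scales with its baseline, and the matching costs are summed over all views before the winner is chosen, so an ambiguous match in one view is disambiguated by the others. Rectified rows, integer disparities, and all names below are illustrative assumptions:

```python
import numpy as np

def multi_baseline_depth(ref, others, baselines, n_labels=16):
    # ref: (h, w) reference image; others: auxiliary views whose
    # disparity for a given depth label scales with their baseline.
    # SSD costs are accumulated over ALL views per label, then the
    # winner-take-all label is chosen per pixel.
    h, w = ref.shape
    cost = np.zeros((n_labels, h, w))
    for img, b in zip(others, baselines):
        for label in range(n_labels):
            d = int(round(label * b))       # disparity ~ baseline x inverse depth
            shifted = np.roll(img, d, axis=1)
            cost[label] += (ref - shifted) ** 2
    return cost.argmin(axis=0)
```

A single short baseline gives a flat, noise-sensitive cost curve; summing costs across baselines sharpens the minimum at the true depth, which is the essence of the multi-baseline strategy the paper builds on.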


2018 ◽  
Vol 14 (12) ◽  
pp. 1915-1960 ◽  
Author(s):  
Rudolf Brázdil ◽  
Andrea Kiss ◽  
Jürg Luterbacher ◽  
David J. Nash ◽  
Ladislava Řezníčková

Abstract. The use of documentary evidence to investigate past climatic trends and events has become a recognised approach in recent decades. This contribution presents the state of the art in its application to droughts. The range of documentary evidence is very wide, including general annals, chronicles, memoirs and diaries kept by missionaries, travellers and those specifically interested in the weather; records kept by administrators tasked with keeping accounts and other financial and economic records; legal-administrative evidence; religious sources; letters; songs; newspapers and journals; pictographic evidence; chronograms; epigraphic evidence; early instrumental observations; society commentaries; and compilations and books. These are available from many parts of the world. This variety of documentary information is evaluated with respect to the reconstruction of hydroclimatic conditions (precipitation, drought frequency and drought indices). Documentary-based drought reconstructions are then addressed in terms of long-term spatio-temporal fluctuations, major drought events, relationships with external forcing and large-scale climate drivers, socio-economic impacts and human responses. Documentary-based drought series are also considered from the viewpoint of spatio-temporal variability for certain continents, and their employment together with hydroclimate reconstructions from other proxies (in particular tree rings) is discussed. Finally, conclusions are drawn, and challenges for the future use of documentary evidence in the study of droughts are presented.

