monocular video
Recently Published Documents

Total documents: 202 (last five years: 53)
H-index: 19 (last five years: 3)

Author(s): B.A. Skorohod

The article proposes new algorithms for estimating the coordinates of objects (both linear and angular) relative to a coordinate system attached to the video camera. A two-step algorithm is proposed. In the first stage, the images coming from the camera are processed: the region belonging to the sea surface is segmented, objects are detected and tracked across frames, and azimuth and elevation angles are determined from the resulting images. Our approach represents the elevation and azimuth angles as non-stationary autoregression models, recursively estimates their parameters, and then estimates the object coordinates from them.
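The abstract does not give the exact recursion, but tracking the parameters of a non-stationary autoregression is commonly done with recursive least squares and a forgetting factor. The following is a minimal sketch under that assumption; the function name, AR order, and forgetting factor are illustrative, not from the paper:

```python
import numpy as np

def rls_ar_estimate(angles, order=2, lam=0.95):
    """Recursively estimate the coefficients of a (possibly non-stationary)
    autoregression  angle[t] = a1*angle[t-1] + ... + ap*angle[t-p] + noise
    via recursive least squares with forgetting factor `lam`, so the
    estimate can track slowly varying parameters."""
    theta = np.zeros(order)        # current AR coefficient estimate
    P = np.eye(order) * 1e3        # inverse-correlation matrix (large = vague prior)
    history = []
    for t in range(order, len(angles)):
        phi = np.asarray(angles[t - order:t][::-1])  # regressor: last `order` angles
        y = angles[t]
        k = P @ phi / (lam + phi @ P @ phi)          # Kalman-style gain
        theta = theta + k * (y - phi @ theta)        # correct with prediction error
        P = (P - np.outer(k, phi @ P)) / lam         # discount old information
        history.append(theta.copy())
    return theta, history
```

With each new azimuth or elevation measurement the estimate is corrected by its one-step prediction error, which is what makes the scheme suitable for online video tracking.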


2021
Author(s): Zhimin Zhang, Jianzhong Qiao, Shukuan Lin, et al.

Depth and pose information are basic issues in robotics, autonomous driving, and virtual reality, and remain a focus of, and a difficulty for, computer vision research. Supervised learning of monocular depth and pose estimation is not feasible in environments where labeled data is scarce. Self-supervised monocular video methods can learn using only photometric constraints, without expensive ground-truth depth labels, but this results in an inefficient training process and suboptimal estimation accuracy. To solve these problems, this paper proposes a monocular weakly supervised depth and pose estimation method based on multi-information fusion. First, we design a high-precision stereo matching method to generate depth and pose data as "ground truth" labels, addressing the problem that true labels are difficult to obtain. Then, we construct a multi-information fusion network model based on these "ground truth" labels, the video sequence, and IMU information to improve estimation accuracy. Finally, we design a loss function that combines supervised cues, based on the "ground truth" labels, with self-supervised cues to optimize our model. In the testing phase, the network model outputs high-precision depth and pose data separately from a monocular video sequence. The resulting model outperforms mainstream monocular depth and pose estimation methods, as well as a partial stereo matching method, on the challenging KITTI dataset while using only a small amount of real training data (200 pairs).
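The abstract does not specify how the supervised and self-supervised cues are weighted; a minimal numpy sketch of how a photometric (self-supervised) term and a pseudo-label (supervised) term could be combined follows, with illustrative weights `w_photo` and `w_label`:

```python
import numpy as np

def photometric_loss(target, warped):
    """Self-supervised cue: L1 difference between the target frame and a
    source frame warped into it using the predicted depth and pose."""
    return np.mean(np.abs(target - warped))

def pseudo_label_loss(pred_depth, stereo_depth, valid_mask):
    """Supervised cue: L1 against the stereo-matching 'ground truth' depth,
    evaluated only where the stereo matcher produced a valid value."""
    return np.sum(np.abs(pred_depth - stereo_depth) * valid_mask) / max(valid_mask.sum(), 1)

def total_loss(target, warped, pred_depth, stereo_depth, valid_mask,
               w_photo=1.0, w_label=0.5):
    # The weights are illustrative; the paper's actual balancing is not given.
    return (w_photo * photometric_loss(target, warped)
            + w_label * pseudo_label_loss(pred_depth, stereo_depth, valid_mask))
```

The validity mask matters in practice: stereo matchers leave holes in occluded or textureless regions, and those pixels should not contribute a supervised penalty.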


2021, Vol 40 (6), pp. 1-14
Author(s): Ri Yu, Hwangpil Park, Jehee Lee

2021
Author(s): Chen Guo, Xu Chen, Jie Song, Otmar Hilliges

2021, Vol 8 (3), pp. 15-27
Author(s): Mohamed N. Sweilam, Nikolay Tolstokulakov

Depth estimation has made great progress in the last few years owing to its applications in robotics and computer vision. Various methods have been implemented and enhanced to estimate depth without flicker and missing holes. Despite this progress, it remains one of the main challenges for researchers, especially for video applications, where the greater complexity of the neural network affects run time. Using monocular video as the input for depth estimation is an attractive idea, particularly for hand-held devices such as mobile phones: they are very popular for capturing pictures and videos, yet have a limited amount of RAM. In this work, we focus on enhancing an existing consistent depth estimation approach for monocular videos so that it uses less RAM and fewer parameters without a significant reduction in the quality of the depth estimation.
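The abstract does not name the parameter-reduction technique; one standard way to shrink a depth network for memory-constrained devices is to replace standard convolutions with depthwise separable ones. A small sketch of the parameter arithmetic (the channel counts are illustrative, not from the paper):

```python
def conv_params(c_in, c_out, k=3):
    """Parameters of a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def separable_conv_params(c_in, c_out, k=3):
    """Depthwise k x k conv (one filter per input channel) followed by a
    1 x 1 pointwise conv that mixes channels."""
    return c_in * k * k + c_in * c_out

standard = conv_params(256, 256)              # 589,824 parameters
separable = separable_conv_params(256, 256)   # 67,840 parameters
print(f"reduction: {standard / separable:.1f}x")  # prints "reduction: 8.7x"
```

For 3 x 3 kernels the saving approaches 9x as the channel count grows, which is why this substitution is a common first step when trimming RAM and parameter budgets on mobile hardware.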


2021
Author(s): Bin Ji, Chen Yang, Yao Shunyu, Ye Pan

2021, Vol 11 (1)
Author(s): Arash Azhand, Sophie Rabe, Swantje Müller, Igor Sattler, Anika Heimann-Steinert

Abstract
Despite its paramount importance for manifold use cases (e.g., in the health care industry, sports, rehabilitation and fitness assessment), sufficiently valid and reliable gait parameter measurement is still mostly limited to high-tech gait laboratories. Most sensor-based systems are costly, must be operated by extensively trained personnel (e.g., motion capture systems) or, even if not quite as costly, still possess considerable complexity (e.g., wearable sensors). Here, we demonstrate the excellent validity and test-retest repeatability of a novel gait assessment system, built upon modern convolutional neural networks, that extracts three-dimensional skeleton joints from monocular frontal-view videos of walking humans. The validity study is based on a comparison with the GAITRite pressure-sensitive walkway system. All measured gait parameters (gait speed, cadence, step length and step time) showed excellent concurrent validity for multiple walk trials at normal and fast gait speeds, and the test-retest repeatability is on the same level as that of the GAITRite system. In conclusion, we are convinced that our results can pave the way for cost-, space- and operationally effective gait analysis in broad mainstream applications: a video sufficient for the assessment method presented here can be obtained by anyone, without much training, via a smartphone camera.
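The abstract lists the measured gait parameters; a minimal sketch of how they could be derived from heel-strike events read off the extracted skeleton's ankle joints follows. The event-detection step itself is omitted, and the function and its input format are hypothetical:

```python
def gait_parameters(heel_strikes):
    """Derive basic gait parameters from a chronological list of
    (time_s, forward_position_m) heel-strike events, alternating feet."""
    times = [t for t, _ in heel_strikes]
    pos = [p for _, p in heel_strikes]
    step_times = [t2 - t1 for t1, t2 in zip(times, times[1:])]
    step_lengths = [p2 - p1 for p1, p2 in zip(pos, pos[1:])]
    duration = times[-1] - times[0]
    n_steps = len(step_times)
    return {
        "cadence_spm": 60.0 * n_steps / duration,         # steps per minute
        "gait_speed_mps": (pos[-1] - pos[0]) / duration,  # metres per second
        "mean_step_time_s": sum(step_times) / n_steps,
        "mean_step_length_m": sum(step_lengths) / n_steps,
    }
```

For example, heel strikes every 0.5 s that each advance 0.6 m give a cadence of 120 steps/min and a gait speed of 1.2 m/s. In the frontal-view setting the forward position must first be recovered from the 3D skeleton, since it is mostly along the camera's depth axis.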

