Monocular depth estimation with geometrical guidance using a multi-level convolutional neural network

2019 ◽  
Vol 84 ◽  
pp. 105714 ◽  
Author(s):  
Hamed Amini Amirkolaee ◽  
Hossein Arefi
2021 ◽  
Vol 7 (4) ◽  
pp. 61
Author(s):  
David Urban ◽  
Alice Caplier

As difficult vision-based tasks such as object detection and monocular depth estimation make their way into real-time applications, and as more lightweight solutions for autonomous vehicle navigation systems emerge, obstacle detection and collision prediction remain very challenging tasks for small embedded devices such as drones. We propose a novel lightweight and time-efficient vision-based solution for predicting Time-to-Collision from a monocular video camera embedded in a smartglasses device, as a module of a navigation system for visually impaired pedestrians. It consists of two modules: a static data extractor, a convolutional neural network that predicts the obstacle position and distance, and a dynamic data extractor that stacks the obstacle data from multiple frames and predicts the Time-to-Collision with a simple fully connected neural network. This paper focuses on the Time-to-Collision network's ability to adapt, through supervised learning, to new scenes with different types of obstacles.
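To make the two-module design concrete, here is a minimal PyTorch sketch of a per-frame static extractor followed by a fully connected Time-to-Collision regressor. The class names, layer sizes, frame count, and the assumption that each frame yields an (x, y, distance) triplet are illustrative choices, not the authors' implementation.

```python
# Hypothetical sketch of the two-module Time-to-Collision pipeline described
# in the abstract above; shapes and layer sizes are assumptions.
import torch
import torch.nn as nn

class StaticDataExtractor(nn.Module):
    """Per-frame CNN that regresses obstacle position (x, y) and distance."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, 3)   # (x, y, distance)

    def forward(self, frame):          # frame: (B, 3, H, W)
        return self.head(self.backbone(frame))

class DynamicDataExtractor(nn.Module):
    """Stacks obstacle data from several frames and regresses Time-to-Collision."""
    def __init__(self, num_frames=5):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(num_frames * 3, 32), nn.ReLU(),
            nn.Linear(32, 1),          # scalar TTC estimate
        )

    def forward(self, obstacle_seq):   # obstacle_seq: (B, num_frames, 3)
        return self.mlp(obstacle_seq.flatten(1))

# Usage on a short clip of consecutive frames:
static = StaticDataExtractor()
dynamic = DynamicDataExtractor(num_frames=5)
frames = torch.randn(5, 3, 224, 224)                        # 5 dummy frames
per_frame = torch.stack([static(f.unsqueeze(0)) for f in frames], dim=1)
ttc = dynamic(per_frame)                                    # predicted TTC
```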


2021 ◽  
Vol 7 (4) ◽  
pp. 117
Author(s):  
Linling Fang ◽  
Yingle Fan

A biomimetic vision computing model based on multi-level feature channel optimization coding is proposed and applied to image contour detection, combining the end-to-end detection approach of fully convolutional neural networks with the traditional contour detection approach based on biological vision mechanisms. Considering the effectiveness of the Gabor filter in perceiving the scale and direction of the image target, the Gabor filter is introduced to simulate the multi-level feature response along the visual pathway. The optimal scale and direction of the Gabor filter are selected using a similarity index and serve as the frequency separation parameters of the NSCT transform. The contour sub-image obtained by the NSCT transform is combined with the original image for feature enhancement and fusion, yielding the primary contour response. This low-dimensional, low-redundancy primary contour response is used as the input sample of the network model, relieving the network load and reducing computational complexity. An improved fully convolutional neural network model is constructed and trained at multiple scales, passing features through an encoder and decoder to achieve end-to-end pixel-wise prediction and obtain a complete, continuous contour detection image. Using the BSDS500 dataset as the experimental sample, the average accuracy index is 0.85, and the model runs on a CPU at over 20 FPS, achieving a good balance between training efficiency and detection quality.
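To illustrate the Gabor selection stage, the sketch below builds a small Gabor filter bank over several scales and orientations and picks the (scale, direction) pair whose filtered response is most similar to the input image. SSIM is used here only as a stand-in for the paper's unspecified similarity index, and the parameter grid, the helper name best_gabor_params, and the image path sample.jpg are assumptions for illustration.

```python
# Illustrative sketch: choose the Gabor scale and orientation that best match
# the input image; these values could then parameterize the NSCT frequency
# separation described above. SSIM as the similarity measure is an assumption.
import cv2
import numpy as np
from skimage.metrics import structural_similarity as ssim

def best_gabor_params(gray, scales=(5, 9, 15, 21), orientations=8):
    """Return the (kernel_size, theta) whose Gabor response best matches `gray`."""
    best, best_score = None, -np.inf
    for ksize in scales:
        for k in range(orientations):
            theta = k * np.pi / orientations
            kernel = cv2.getGaborKernel((ksize, ksize), sigma=ksize / 4.0,
                                        theta=theta, lambd=ksize / 2.0,
                                        gamma=0.5, psi=0)
            response = cv2.filter2D(gray, cv2.CV_32F, kernel)
            score = ssim(gray, response, data_range=1.0)
            if score > best_score:
                best, best_score = (ksize, theta), score
    return best

# Example usage on a grayscale image normalized to [0, 1] (hypothetical path):
img = cv2.imread("sample.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32) / 255.0
scale, direction = best_gabor_params(img)
```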


2018 ◽  
Vol 55 (11) ◽  
pp. 111507
Author(s):  
鲍振强 Bao Zhenqiang ◽  
李艾华 Li Aihua ◽  
崔智高 Cui Zhigao ◽  
苏延召 Su Yanzhao ◽  
郑勇 Zheng Yong
