Uncalibrated stereo rectification and disparity range stabilization: a comparison of different feature detectors

Perception

10.1093/oso/9780199674923.003.0025 ◽

2018 ◽

Author(s):

Joel Z. Leibo ◽

Tomaso Poggio

Keyword(s):

Computational Neuroscience ◽

Feature Detection ◽

Pedestrian Detection ◽

Detection Systems ◽

Catching Up ◽

Feature Detectors ◽

Recognition Systems ◽

Engineered Systems ◽

Perceptual Systems ◽

The Brain

This chapter provides an overview of biological perceptual systems and their underlying computational principles, focusing on the sensory sheets of the retina and cochlea and exploring how complex feature detection emerges by combining simple feature detectors in a hierarchical fashion. We also explore how the microcircuits of the neocortex implement such schemes, pointing out similarities to progress in the field of machine vision driven by deep learning algorithms. We see signs that engineered systems are catching up with the brain. For example, vision-based pedestrian detection systems are now accurate enough to be installed as safety devices in (for now) human-driven vehicles, and the speech recognition systems embedded in smartphones have become increasingly impressive. Although such systems are not entirely biologically based, we note that computational neuroscience, as described in this chapter, makes up a considerable portion of their intellectual pedigree.

A Joint 2D-3D Complementary Network for Stereo Matching

Sensors ◽

10.3390/s21041430 ◽

2021 ◽

Vol 21 (4) ◽

pp. 1430

Author(s):

Xiaogang Jia ◽

Wei Chen ◽

Zhengfa Liang ◽

Xin Luo ◽

Mingfei Wu ◽

...

Keyword(s):

Stereo Matching ◽

Computational Cost ◽

Research Field ◽

Disparity Map ◽

Improve Performance ◽

Cost Aggregation ◽

Disparity Range ◽

Public Datasets ◽

Coarse To Fine ◽

Speed And Accuracy

Stereo matching is an important research field of computer vision. Because of the dimensionality of cost aggregation, current neural network-based stereo methods struggle to balance speed and accuracy. To this end, we integrate fast 2D stereo methods with accurate 3D networks to improve performance and reduce running time. We use a 2D encoder-decoder network to generate a rough disparity map and construct a disparity range to guide the 3D aggregation network, which significantly improves accuracy and reduces computational cost. We use a stacked hourglass structure to refine the disparity from coarse to fine. We evaluated our method on three public datasets. According to the results on the official KITTI website, our network can generate an accurate result in 80 ms on a modern GPU. Compared to other 2D stereo networks (AANet, DeepPruner, FADNet, etc.), our network is considerably more accurate. Meanwhile, it is significantly faster than other 3D stereo networks (5× faster than PSMNet, 7.5× faster than CSN, and 22.5× faster than GANet, etc.), demonstrating the effectiveness of our method.
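
The idea of letting a rough disparity map narrow the search for a costlier second stage can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the disparity range is constructed, so the symmetric window of `radius` around each coarse estimate, and the helper names `guided_ranges` and `candidate_count`, are hypothetical.

```python
def guided_ranges(coarse, radius, max_disp):
    """Return per-pixel (lo, hi) disparity bounds around a coarse estimate.

    Assumption: the range is a symmetric window of +/- radius, clipped to
    [0, max_disp]. The paper's actual construction may differ.
    """
    ranges = []
    for row in coarse:
        out_row = []
        for d in row:
            # Clamp the rounded coarse estimate so that lo <= hi always holds.
            center = min(max(int(round(d)), 0), max_disp)
            lo = max(0, center - radius)
            hi = min(max_disp, center + radius)
            out_row.append((lo, hi))
        ranges.append(out_row)
    return ranges


def candidate_count(ranges):
    """Total number of disparity hypotheses the second stage must score."""
    return sum(hi - lo + 1 for row in ranges for lo, hi in row)


# Toy 2x2 coarse disparity map with a full search range of 0..63.
coarse = [[3.2, 10.7], [0.4, 62.9]]
ranges = guided_ranges(coarse, radius=2, max_disp=63)
full = len(coarse) * len(coarse[0]) * 64
print(ranges)
print(candidate_count(ranges), "of", full, "hypotheses kept")
```

In this toy case the guided search keeps 16 of the 256 hypotheses that a full search over 0..63 would score, which is the kind of saving that makes a 3D aggregation stage cheaper.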

Unsupervised learning by competing hidden units

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1820458116 ◽

2019 ◽

Vol 116 (16) ◽

pp. 7723-7731 ◽

Cited By ~ 16

Author(s):

Dmitry Krotov ◽

John J. Hopfield

Keyword(s):

Learning Algorithm ◽

Lower Layer ◽

Learning Rule ◽

Backpropagation Algorithm ◽

Feedforward Networks ◽

Feature Detectors ◽

End To End ◽

Hidden Layer ◽

Full Network ◽

Global Inhibition

It is widely believed that end-to-end training with the backpropagation algorithm is essential for learning good feature detectors in the early layers of artificial neural networks, so that these detectors are useful for the task performed by the higher layers of the network. At the same time, the traditional form of backpropagation is biologically implausible. In the present paper we propose an unusual learning rule, which has a degree of biological plausibility and which is motivated by Hebb’s idea that a change in synapse strength should be local, i.e., should depend only on the activities of the pre- and postsynaptic neurons. We design a learning algorithm that uses global inhibition in the hidden layer and is capable of learning early feature detectors in a completely unsupervised way. These learned lower-layer feature detectors can be used to train higher-layer weights in the usual supervised way, so that on simple tasks the performance of the full network is comparable to that of standard feedforward networks trained end-to-end with backpropagation.
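
To make the notion of a local, competitive update concrete, here is a minimal sketch of classic winner-take-all competitive learning. It is not the rule proposed in the paper: hard winner-take-all is swapped in as a crude stand-in for global inhibition, and the function name `train_competitive` and all parameter values are assumptions made for illustration.

```python
import random


def train_competitive(data, n_hidden, epochs=20, lr=0.1, seed=0):
    """Unsupervised winner-take-all learning of hidden-unit weights.

    Only the most active hidden unit is updated for each input, and its
    update uses nothing but the input value and the weight on that synapse.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
               for _ in range(n_hidden)]
    for _ in range(epochs):
        for x in data:
            # Activation of each hidden unit is the dot product w . x.
            acts = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
            # Competition: the single most active unit silences the rest.
            winner = max(range(n_hidden), key=acts.__getitem__)
            w = weights[winner]
            for i in range(dim):
                # Local update: move the winner's weight toward the input.
                w[i] += lr * (x[i] - w[i])
    return weights


# Four toy inputs drawn from two loose groups; no labels are used.
data = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
weights = train_competitive(data, n_hidden=2)
print(weights)
```

Plain winner-take-all can leave some hidden units unused, so this sketch should not be read as reproducing the paper's results.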

Benchmarking of Feature Detectors and Matchers using OpenCV-Python Wrapper

10.1109/itnt52450.2021.9649278 ◽

2021 ◽

Author(s):

Oleg Golovnin ◽

Dmitry Rybnikov

Keyword(s):

Feature Detectors

A Proposed Pipelined-Architecture for FPGA-Based Affine-Invariant Feature Detectors

2006 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'06) ◽

10.1109/cvprw.2006.19 ◽

2006 ◽

Cited By ~ 6

Author(s):

C. Cabani ◽

W.J. MacLean

Keyword(s):

Affine Invariant ◽

Pipelined Architecture ◽

Feature Detectors ◽

Invariant Feature

An Edge-Sense Bidirectional Pyramid Network for Stereo Matching of VHR Remote Sensing Images

Remote Sensing ◽

10.3390/rs12244025 ◽

2020 ◽

Vol 12 (24) ◽

pp. 4025

Author(s):

Rongshu Tao ◽

Yuming Xiang ◽

Hongjian You

Keyword(s):

Remote Sensing ◽

Stereo Matching ◽

Tall Buildings ◽

Disparity Estimation ◽

Complex Structures ◽

Learning Networks ◽

Remote Sensing Images ◽

Essential Step ◽

Disparity Range ◽

The Cost

As an essential step in 3D reconstruction, stereo matching still faces significant problems due to the high resolution and complex structures of remote sensing images. Precise disparity estimation is especially difficult, yet important, in the occluded areas around tall buildings and in textureless areas such as water and woodland. In this paper, we develop a novel edge-sense bidirectional pyramid stereo matching network to address these problems. The cost volume is constructed from negative to positive disparities, since the disparity range in remote sensing images varies greatly and traditional deep learning networks only work well for positive disparities. Occlusion-aware maps based on the forward-backward consistency assumption are then applied to reduce the influence of occluded areas. Moreover, we design an edge-sense smoothness loss to improve performance in textureless areas while maintaining the main structure. The proposed network is compared with two baselines, DenseMapNet and PSMNet. The experimental results show that our method outperforms both in terms of average endpoint error (EPE) and the fraction of erroneous pixels (D1), and that the improvements in occluded and textureless areas are significant.
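
A forward-backward consistency check of the kind mentioned above can be sketched on a single scanline. This is a minimal illustration, not the network's occlusion-aware module: the sign convention (a left pixel at `x` with disparity `d` matches the right pixel at `x - d`), the tolerance `tol`, and the function name `occlusion_mask` are assumptions. Disparities may be negative or positive.

```python
def occlusion_mask(disp_left, disp_right, tol=1):
    """Flag left-image pixels whose disparity fails the left-right check.

    A pixel is flagged when its match falls outside the scanline or when
    the right image's disparity at the matched position disagrees by more
    than tol.
    """
    width = len(disp_left)
    mask = []
    for x, d in enumerate(disp_left):
        xr = x - d  # matched column in the right image (assumed convention)
        consistent = 0 <= xr < width and abs(d - disp_right[xr]) <= tol
        mask.append(not consistent)
    return mask


# Integer disparities on one scanline, including a negative value.
disp_left = [1, 1, 1, -1, 3]
disp_right = [1, 1, 0, 0, -1]
mask = occlusion_mask(disp_left, disp_right)
print(mask)
```

Here the first pixel is flagged because its match falls outside the image, and the last because the two views disagree by more than the tolerance.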

A Sparse Generative Model of V1 Simple Cells with Intrinsic Plasticity

Neural Computation ◽

10.1162/neco.2007.02-07-472 ◽

2008 ◽

Vol 20 (5) ◽

pp. 1261-1284 ◽

Cited By ~ 13

Author(s):

Cornelius Weber ◽

Jochen Triesch

Keyword(s):

Dynamic Environment ◽

Generative Model ◽

Intrinsic Excitability ◽

Natural Image ◽

Tilt Aftereffect ◽

Simple Cells ◽

Feature Detectors ◽

Edge Detectors ◽

Intrinsic Plasticity ◽

Slow Timescale

Current models for learning feature detectors work on two timescales: on a fast timescale, the internal neurons' activations adapt to the current stimulus; on a slow timescale, the weights adapt to the statistics of the set of stimuli. Here we explore the adaptation of a neuron's intrinsic excitability, termed intrinsic plasticity, which occurs on a separate timescale. Here, a neuron maintains homeostasis of an exponentially distributed firing rate in a dynamic environment. We exploit this in the context of a generative model to impose sparse coding. With natural image input, localized edge detectors emerge as models of V1 simple cells. An intermediate timescale for the intrinsic plasticity parameters allows modeling aftereffects. In the tilt aftereffect, after a viewer adapts to a grid of a certain orientation, grids of a nearby orientation will be perceived as tilted away from the adapted orientation. Our results show that adapting the neurons' gain-parameter but not the threshold-parameter accounts for this effect. It occurs because neurons coding for the adapting stimulus attenuate their gain, while others increase it. Despite its simplicity and low maintenance, the intrinsic plasticity model accounts for more experimental details than previous models without this mechanism.

Perturbing Line Pictures for Identification of Visual Features and Their Syntax

Perception ◽

10.1068/p130675 ◽

1984 ◽

Vol 13 (6) ◽

pp. 675-686 ◽

Cited By ~ 1

Author(s):

Eg G J Eijkman

Keyword(s):

Perturbation Method ◽

Contextual Factors ◽

Form Perception ◽

Visual Features ◽

Multivariate Methods ◽

Point Scale ◽

Visual Form ◽

Feature Detectors ◽

Visual Form Perception

Experiments are reported in which line pictures were perturbed by omission or displacement of a combination of single pixels, fragments of lines, contours, and whole figures. Different effects of perturbation were expected by selectively violating visual syntactic rules or by impeding the contribution of certain feature detectors. The deterioration of the perturbed picture was measured according to standard psychophysical methods by rating on a 5-point scale. Multivariate methods were used to single out the relative effects of perturbation by, respectively, a set of single pixels, line fragments, contours and whole figures. Lines, as opposed to loose pixels, are clearly powerful descriptors of the pictures; contours or whole figures do not add significantly to what lines already describe. Different effects were observed if perturbations were dislocations rather than removals. Then contours and whole figures showed a typical disrupting effect compared to line fragments. These results have consequences for the development of a syntax of visual form perception. The perturbation method seems appropriate for identifying features or syntactic rules, although the results are dependent on a number of environmental and contextual factors.
