Number detectors spontaneously emerge in a deep neural network designed for visual object recognition

2019 ◽  
Vol 5 (5) ◽  
pp. eaav7903 ◽  
Author(s):  
Khaled Nasr ◽  
Pooja Viswanathan ◽  
Andreas Nieder

Humans and animals have a “number sense,” an innate capability to intuitively assess the number of visual items in a set, its numerosity. This capability implies that mechanisms to extract numerosity indwell the brain’s visual system, which is primarily concerned with visual object recognition. Here, we show that network units tuned to abstract numerosity, and therefore reminiscent of real number neurons, spontaneously emerge in a biologically inspired deep neural network that was merely trained on visual object recognition. These numerosity-tuned units underlay the network’s number discrimination performance that showed all the characteristics of human and animal number discriminations as predicted by the Weber-Fechner law. These findings explain the spontaneous emergence of the number sense based on mechanisms inherent to the visual system.
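A minimal sketch of the analysis idea behind this result, assuming a pretrained torchvision AlexNet as a stand-in for the network used in the study: dot displays of varying numerosity are shown to a network trained only on object recognition, and units whose responses vary systematically with the number of dots are identified. The numerosity levels, dot sizes, sample counts, and ANOVA criterion below are illustrative assumptions, and the stimulus controls used in the actual study (e.g., equating total dot area) are omitted.

import numpy as np
import torch
import torchvision.models as models
from scipy.stats import f_oneway

def dot_display(n_dots, size=224, rng=None):
    """Render a grayscale image containing n_dots non-overlapping white dots."""
    rng = rng if rng is not None else np.random.default_rng()
    img = np.zeros((size, size), dtype=np.float32)
    ys, xs = np.mgrid[0:size, 0:size]
    placed = []
    while len(placed) < n_dots:
        cy, cx = rng.integers(20, size - 20, size=2)
        r = int(rng.integers(5, 12))
        if all((cy - py) ** 2 + (cx - px) ** 2 > (r + pr + 4) ** 2 for py, px, pr in placed):
            img[(ys - cy) ** 2 + (xs - cx) ** 2 <= r ** 2] = 1.0
            placed.append((cy, cx, r))
    return img

# Probe a network trained only for object recognition (pretrained AlexNet as a stand-in).
net = models.alexnet(weights=models.AlexNet_Weights.DEFAULT).eval()
numerosities, n_samples = [1, 2, 4, 8, 16], 20
responses = {n: [] for n in numerosities}
with torch.no_grad():
    for n in numerosities:
        for _ in range(n_samples):
            x = torch.from_numpy(dot_display(n)).expand(3, -1, -1).unsqueeze(0)
            responses[n].append(net.features(x).flatten(1).squeeze(0).numpy())

# A unit counts as numerosity-selective if its response differs across numerosities
# (one-way ANOVA); its preferred numerosity is the one giving the largest mean response.
groups = [np.stack(responses[n]) for n in numerosities]        # each: (n_samples, n_units)
_, p = f_oneway(*groups, axis=0)
mean_resp = np.stack([g.mean(axis=0) for g in groups])         # (n_numerosities, n_units)
selective = np.where(p < 0.01)[0]
preferred = np.array(numerosities)[mean_resp.argmax(axis=0)]
print(f"{selective.size} selective units; preferred numerosities of the first few: {preferred[selective[:10]]}")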


1997 ◽  
Vol 9 (4) ◽  
pp. 777-804 ◽  
Author(s):  
Bartlett W. Mel

Severe architectural and timing constraints within the primate visual system support the conjecture that the early phase of object recognition in the brain is based on a feedforward feature-extraction hierarchy. To assess the plausibility of this conjecture in an engineering context, a difficult three-dimensional object recognition domain was developed to challenge a purely feedforward, receptive-field-based recognition model called SEEMORE. SEEMORE is based on 102 viewpoint-invariant nonlinear filters that as a group are sensitive to contour, texture, and color cues. The visual domain consists of 100 real objects of many different types, including rigid (shovel), nonrigid (telephone cord), and statistical (maple leaf cluster) objects and photographs of complex scenes. Objects were individually presented in color video images under normal room lighting conditions. Based on 12 to 36 training views, SEEMORE was required to recognize unnormalized test views of objects that could vary in position, orientation in the image plane and in depth, and scale (factor of 2); for nonrigid objects, recognition was also tested under gross shape deformations. Correct classification performance on a test set consisting of 600 novel object views was 97 percent (chance was 1 percent) and was comparable for the subset of 15 nonrigid objects. Performance was also measured under a variety of image degradation conditions, including partial occlusion, limited clutter, color shift, and additive noise. Generalization behavior and classification errors illustrate the emergence of several striking natural shape categories that are not explicitly encoded in the dimensions of the feature space. It is concluded that, in the light of the vast hardware resources available in the ventral stream of the primate visual system relative to those exercised here, the appealingly simple feature-space conjecture remains worthy of serious consideration as a neurobiological model.
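The feature-space idea described above can be illustrated with a much smaller stand-in for SEEMORE's 102 filters: position-insensitive, histogram-like measurements of color and edge orientation, combined with nearest-neighbor classification over stored training views. This is a hedged sketch, not Mel's implementation; the particular features, bin counts, and distance metric are assumptions made only for illustration.

import numpy as np

def feature_vector(rgb):
    """rgb: (H, W, 3) float array in [0, 1]. Returns a small, translation-insensitive descriptor."""
    # Coarse color histogram: 4 bins per channel, pooled over the whole image (position is discarded).
    color_hist = np.concatenate(
        [np.histogram(rgb[..., c], bins=4, range=(0.0, 1.0), density=True)[0] for c in range(3)]
    )
    # Edge-orientation histogram from finite-difference gradients (a crude contour/texture cue).
    gray = rgb.mean(axis=2)
    gy, gx = np.gradient(gray)
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    ori_hist, _ = np.histogram(ang, bins=8, range=(-np.pi, np.pi), weights=mag, density=True)
    return np.concatenate([color_hist, ori_hist])

class NearestNeighborRecognizer:
    """Stores descriptors of labelled training views; classifies a test view by its closest stored view."""
    def __init__(self):
        self.feats, self.labels = [], []

    def add_training_view(self, rgb, label):
        self.feats.append(feature_vector(rgb))
        self.labels.append(label)

    def classify(self, rgb):
        f = feature_vector(rgb)
        dists = np.linalg.norm(np.stack(self.feats) - f, axis=1)
        return self.labels[int(np.argmin(dists))]

# Usage with random stand-in images (in the real task: 12 to 36 training views per object).
rng = np.random.default_rng(0)
rec = NearestNeighborRecognizer()
for label in ["shovel", "telephone_cord"]:
    for _ in range(3):
        rec.add_training_view(rng.random((64, 64, 3)), label)
print(rec.classify(rng.random((64, 64, 3))))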


2000 ◽  
Vol 12 (11) ◽  
pp. 2547-2572 ◽  
Author(s):  
Edmund T. Rolls ◽  
T. Milward

VisNet2 is a model to investigate some aspects of invariant visual object recognition in the primate visual system. It is a four-layer feedforward network with convergence to each part of a layer from a small region of the preceding layer, with competition between the neurons within a layer, and with a trace learning rule to help it learn transform invariance. The trace rule is a modified Hebbian rule that adjusts synaptic weights according to both the current firing rates and the firing rates to recently seen stimuli. This enables neurons to learn to respond similarly to the gradually transforming inputs they receive, which over the short term are likely to come from the same object, given the statistics of normal visual input. First, we introduce for VisNet2 both single-neuron and multiple-neuron information-theoretic measures of its ability to respond to transformed stimuli. Second, using these measures, we show quantitatively that resetting the trace between stimuli is not necessary for good performance. Third, it is shown that the sigmoid activation functions used in VisNet2, which allow the sparseness of the representation to be controlled, give good performance with sparse distributed representations. Fourth, it is shown that VisNet2 operates well with medium-range lateral inhibition, with a radius of the same order of size as the region of the preceding layer from which neurons receive inputs. Fifth, in an investigation of different learning rules for learning transform invariance, it is shown that VisNet2 operates better with a trace rule that incorporates in the trace only activity from preceding presentations of a given stimulus, with no contribution from the current presentation, and that this rule is related to temporal difference learning.
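A compact sketch of the trace learning rule described above, covering both the standard form (the current firing is folded into the trace before the update) and the variant reported to work better (the update uses only the trace from preceding presentations). The learning rate, trace constant, explicit weight normalization, and toy stimuli are illustrative assumptions, not VisNet2's actual parameters.

import numpy as np

def trace_rule_update(w, x, y, y_trace, alpha=0.05, eta=0.8, preceding_only=True):
    """One trace-rule weight update for a single neuron.

    w       : (n_inputs,) synaptic weight vector
    x       : (n_inputs,) presynaptic firing rates at the current time step
    y       : postsynaptic firing rate at the current time step
    y_trace : trace of postsynaptic firing from preceding time steps
    preceding_only : if True, the weight change uses only the trace from preceding
        presentations (the variant the abstract reports works better, related to
        temporal difference learning); if False, the standard trace rule is used.
    """
    if preceding_only:
        dw = alpha * y_trace * x                     # delta_w = alpha * trace(t-1) * x(t)
        new_trace = (1.0 - eta) * y + eta * y_trace
    else:
        new_trace = (1.0 - eta) * y + eta * y_trace  # fold in the current firing first
        dw = alpha * new_trace * x                   # delta_w = alpha * trace(t) * x(t)
    w = w + dw
    return w / np.linalg.norm(w), new_trace          # explicit normalization keeps weights bounded

# Toy run: successive, slightly perturbed "views" of the same input pattern.
rng = np.random.default_rng(1)
w, y_trace = rng.random(16), 0.0
base = rng.random(16)
for _ in range(5):
    x = np.clip(base + 0.05 * rng.standard_normal(16), 0.0, 1.0)   # a transformed view
    y = float(w @ x)                                               # simple linear firing rate
    w, y_trace = trace_rule_update(w, x, y, y_trace)
print(np.round(w, 3))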


Author(s):  
Albert L. Rothenstein

Most biologically-inspired models of object recognition rely on a feed-forward architecture in which abstract representations are gradually built from simple representations, but recognition performance in such systems drops when multiple objects are present in the input. This chapter puts forward the proposal that by using multiple passes of the visual processing hierarchy, both bottom-up and top-down, it is possible to address the limitations of feed-forward architectures and explain the different recognition behaviors that primate vision exhibits. The model relies on the reentrant connections that are ubiquitous in the primate brain to recover spatial information, and thus allow for the selective processing of stimuli. The chapter ends with a discussion of the implications of this work, its explanatory power, and a number of predictions for future experimental work.
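The multi-pass idea can be sketched with a toy recognizer: a feedforward pass proposes the dominant object identity, a top-down pass uses that identity to recover where the object is, the located region is suppressed, and the cycle repeats so several objects are processed selectively one at a time. This is an interpretation for illustration only; the template-matching recognizer, the suppression step, and the synthetic scene are assumptions, not components of Rothenstein's model.

import numpy as np

def correlate_max(image, template):
    """Best match score and location of a template slid over the image (valid positions only)."""
    th, tw = template.shape
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            s = float((image[r:r + th, c:c + tw] * template).sum())
            if s > best:
                best, best_pos = s, (r, c)
    return best, best_pos

def feedforward_identity(image, templates):
    """Bottom-up pass: score each known template anywhere in the image (position is discarded)."""
    scores = {name: correlate_max(image, t)[0] for name, t in templates.items()}
    return max(scores, key=scores.get)

def recognize_multiple(image, templates, n_objects):
    """Alternate bottom-up identification and top-down localization, suppressing each found object."""
    image = image.copy()
    found = []
    for _ in range(n_objects):
        name = feedforward_identity(image, templates)      # bottom-up: what is most likely present?
        _, (r, c) = correlate_max(image, templates[name])  # top-down/reentrant: where is it?
        th, tw = templates[name].shape
        image[r:r + th, c:c + tw] = 0.0                    # selective suppression before the next pass
        found.append((name, (r, c)))
    return found

# Toy scene: two distinct binary patterns pasted into one image.
templates = {"cross": np.eye(5) + np.eye(5)[::-1], "bar": np.ones((1, 5))}
scene = np.zeros((20, 20))
scene[2:7, 2:7] += templates["cross"]
scene[12:13, 10:15] += templates["bar"]
print(recognize_multiple(scene, templates, n_objects=2))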


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Zahra Sadat Shariatmadar ◽  
Karim Faez

Autonomous object recognition in images is one of the most critical topics in security and commercial applications. Owing to recent advances in visual neuroscience, researchers increasingly adopt biologically plausible schemes to improve the accuracy of object recognition. Preprocessing is one part of the visual recognition pipeline that has received much less attention. In this paper, we propose a new, simple, and biologically inspired preprocessing technique that uses the data-driven mechanism of visual attention. First, the responses of retinal ganglion cells (RGCs) are simulated. From these responses, an efficient threshold is selected, and the points of the raw image carrying the most information are extracted according to it. New images containing only these points are then created, and by combining them with entropy coefficients, the most salient object is located. After appropriate features are extracted, the classifier assigns the initial image to one of the predefined object categories. Our system was evaluated on the Caltech-101 dataset. Experimental results demonstrate the effectiveness of this preprocessing method.
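One plausible reading of the preprocessing stage described above, sketched below: retinal ganglion cell responses are approximated with a difference-of-Gaussians (center-surround) filter, a threshold selects the most informative points, and an image containing only those points is passed on to feature extraction. The filter scales and the percentile threshold are assumptions, and the entropy-based combination and the final classification stages are omitted.

import numpy as np
from scipy.ndimage import gaussian_filter

def rgc_responses(gray, sigma_center=1.0, sigma_surround=3.0):
    """Center-surround (ON-type) responses via a difference of Gaussians."""
    return gaussian_filter(gray, sigma_center) - gaussian_filter(gray, sigma_surround)

def salient_points(gray, keep_percentile=95):
    """Coordinates of pixels whose RGC response magnitude exceeds the chosen threshold."""
    resp = np.abs(rgc_responses(gray))
    threshold = np.percentile(resp, keep_percentile)
    ys, xs = np.nonzero(resp >= threshold)
    return np.stack([ys, xs], axis=1), resp

def masked_image(gray, points):
    """Rebuild an image that keeps only the most informative points (the preprocessed input)."""
    out = np.zeros_like(gray)
    out[points[:, 0], points[:, 1]] = gray[points[:, 0], points[:, 1]]
    return out

# Toy run on a synthetic image with one bright blob (stands in for a Caltech-101 image).
img = np.zeros((64, 64))
img[28:36, 28:36] = 1.0
pts, _ = salient_points(img)
pre = masked_image(img, pts)
print(f"kept {len(pts)} of {img.size} pixels; preprocessed image max = {pre.max():.2f}")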

