Biological modeling of human visual system for object recognition using GLoP filters and sparse coding on multi-manifolds

2018 ◽  
Vol 29 (6) ◽  
pp. 965-977 ◽  
Author(s):  
Limiao Deng ◽  
Yanjiang Wang ◽  
Baodi Liu ◽  
Weifeng Liu ◽  
Yujuan Qi

Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 167 ◽  
Author(s):  
Dan Malowany ◽  
Hugo Guterman

Computer vision is currently one of the most exciting and rapidly evolving fields of science, and it affects numerous industries. Research and development breakthroughs, mainly in the field of convolutional neural networks (CNNs), have opened the way to unprecedented sensitivity and precision in object detection and recognition tasks. Nevertheless, recent findings on the sensitivity of neural networks to additive noise, lighting conditions, and the completeness of the training dataset indicate that this technology still lacks the robustness needed for the autonomous robotics industry. In an attempt to bring computer vision algorithms closer to the capabilities of a human operator, the mechanisms of the human visual system were analyzed in this work. Recent studies show that the mechanisms behind the recognition process in the human brain include continuous generation of predictions based on prior knowledge of the world. These predictions enable rapid generation of contextual hypotheses that bias the outcome of the recognition process. This mechanism is especially advantageous in situations of uncertainty, when the visual input is ambiguous. In addition, the human visual system continuously updates its knowledge about the world based on the gaps between its predictions and the visual feedback. CNNs are feed-forward in nature and lack such top-down contextual attenuation mechanisms. As a result, although they process massive amounts of visual information during their operation, the information is not transformed into knowledge that can be used to generate contextual predictions and improve their performance. In this work, an architecture was designed that aims to integrate the concepts behind the top-down prediction and learning processes of the human visual system with state-of-the-art bottom-up object recognition models, e.g., deep CNNs. The work focuses on two mechanisms of the human visual system: anticipation-driven perception and reinforcement-driven learning. Imitating these top-down mechanisms, together with state-of-the-art bottom-up feed-forward algorithms, resulted in an accurate, robust, and continuously improving target recognition model.


Author(s):  
Mohammadesmaeil Akbarpour ◽  
Nasser Mehrshad ◽  
Seyyed-Mohammad Razavi

Humans recognize objects in complex natural images within a fraction of a second. Many computational object recognition models are inspired by this powerful human ability. The human visual system (HVS) recognizes objects through several processing layers, which are described as a hierarchical model. Because of the remarkable complexity of the HVS and of the connections in the visual pathway, computational modeling of the HVS directly from its physiology is not possible. It is therefore treated as a set of blocks, and each block is modeled separately. One HVS-inspired model is HMAX, whose main problem is that it selects patches at random. Because HMAX is a hierarchical model, it can be improved by improving each layer separately. In this paper, instead of random patch extraction, Desirable Patches for HMAX (DPHMAX) are extracted. When extracting patches, the HVS first selects the patches that carry more information; to simulate this block, patches with higher variance are selected. The HVS then chooses the patches with greater similarity within a class; to simulate this block, an algorithm is used. To evaluate the proposed method, the Caltech 5 and Caltech101 datasets are used. Results show that the proposed method (DPHMAX) provides a significant performance improvement over HMAX and other models with the same framework.
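The variance-based selection step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_high_variance_patches`, the sliding-window layout, and the parameters `patch_size`, `stride`, and `keep` are all assumptions made for illustration, and the image is taken to be a plain list of rows of grayscale values.

```python
from statistics import pvariance


def select_high_variance_patches(image, patch_size=8, stride=4, keep=10):
    """Return up to `keep` (variance, (row, col), patch) tuples, highest variance first.

    `image` is a list of equal-length rows of grayscale values.
    All names and defaults are hypothetical, chosen for this sketch only.
    """
    height, width = len(image), len(image[0])
    scored = []
    # Slide a square window over the image and score each patch by pixel variance.
    for r in range(0, height - patch_size + 1, stride):
        for c in range(0, width - patch_size + 1, stride):
            patch = [row[c:c + patch_size] for row in image[r:r + patch_size]]
            pixels = [value for row in patch for value in row]
            scored.append((pvariance(pixels), (r, c), patch))
    # Keep the patches with the largest variance, used here as a proxy for information content.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:keep]
```

Variance is only a rough stand-in for "more information"; the second step in the abstract (within-class similarity) is not specified there and is not sketched here.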


Author(s):  
Yaghoub Pourasad

Identifying objects by modeling the human visual system is an effective approach to intelligent identification and has attracted the attention of many researchers. Although machines have high computational speed, they are very weak compared with humans at recognition. Experience has shown that in many areas of image processing, algorithms with a biological basis are simpler and perform better. The human visual system first selects the main parts of the image, which is captured by the visual feature model, and then performs object recognition, a hierarchical operation for which the HMAX model has been proposed. HMAX is a hierarchical object recognition model without feedback whose structure and parameters are selected based on biological characteristics of the visual cortex. It is a hierarchical neural network model with four layers, composed of alternating simple and complex layers. Because of the high complexity of the human visual system, replicating it is virtually impossible. Separate models have been proposed for each of the above operations, but in the human visual system they are performed seamlessly; combining the principles of these models is therefore expected to come closer to the human visual system and to obtain a higher recognition rate. In this paper, we introduce an architecture for classifying images that combines previous work and is based on the basic operation of the visual cortex. According to the results presented, the proposed model has a much higher recognition rate than the original HMAX model. Simulations were performed on the Caltech101 database.


1998 ◽  
Vol 21 (1) ◽  
pp. 36-37 ◽  
Author(s):  
Manish Singh ◽  
Barbara Landau

Converging psychophysical evidence suggests that the human visual system parses shapes into component parts for the purposes of object recognition. We examine the Schyns et al. claim of “creation” of features in light of recent work on part-based representations of visual shape, particularly the perceptual rules that human vision uses to parse shapes.


2020 ◽  
Vol 10 (12) ◽  
pp. 4395
Author(s):  
Jongsu Yoon ◽  
Yoonsik Choe

Retinex theory represents the human visual system by showing the relative reflectance of an object under various illumination conditions. A feature of the human visual system is color constancy, and Retinex theory is designed with this feature in mind. Retinex algorithms have been widely used to decompose an image of an object into its illumination and reflectance. The main aim of this paper is to study image enhancement using convolutional sparse coding and sparse representations of the reflectance component in the Retinex model over a learned dictionary. To realize this, we use the convolutional sparse coding model to represent the reflectance component in detail. In addition, we propose that the reflectance component can be reconstructed using a general dictionary trained by convolutional sparse coding on a large dataset. We use singular value decomposition in limited memory to construct the best reflectance dictionary. This allows the reflectance component to provide improved visual quality over conventional methods, as shown in the experimental results. Consequently, we can reduce the difference in perception between humans and machines through the proposed Retinex-based image enhancement.
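For orientation, the decomposition this abstract refers to is commonly written in the following generic form. This is the textbook Retinex and convolutional sparse coding notation, assumed here for illustration; the symbols $S$, $R$, $L$, $d_k$, $x_k$, and $\lambda$ are not taken from the paper, whose exact objective is not given in the abstract.

```latex
% Standard multiplicative Retinex model: observed image = reflectance times illumination
S(x, y) = R(x, y) \, L(x, y)
\quad\Longleftrightarrow\quad
\log S = \log R + \log L

% Generic convolutional sparse coding of the reflectance over K dictionary filters d_k,
% with sparse coefficient maps x_k and sparsity weight \lambda (all assumed notation)
\min_{\{x_k\}} \; \frac{1}{2} \Big\| R - \sum_{k=1}^{K} d_k * x_k \Big\|_2^2
  + \lambda \sum_{k=1}^{K} \| x_k \|_1
```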


Perception ◽  
1994 ◽  
Vol 23 (5) ◽  
pp. 547-561 ◽  
Author(s):  
Luc J Van Gool ◽  
Theo Moons ◽  
Eric Pauwels ◽  
Johan Wagemans

It is remarkable how well the human visual system can cope with changing viewpoints when it comes to recognising shapes. The state of the art in machine vision is still quite remote from solving such tasks. Nevertheless, a surge in invariance-based research has led to the development of methods for solving recognition problems still considered hard until recently. A nonmathematical account explains the basic philosophy and trade-offs underlying this strand of research. The principles are explained for the relatively simple case of planar-object recognition under arbitrary viewpoints. Well-known Euclidean concepts form the basis of invariance in this case. Introducing constraints in addition to that of planarity may further simplify the invariants. On the other hand, there are problems for which no invariants exist.


2014 ◽  
Vol 111 (1) ◽  
pp. 91-102 ◽  
Author(s):  
Leyla Isik ◽  
Ethan M. Meyers ◽  
Joel Z. Leibo ◽  
Tomaso Poggio

The human visual system can rapidly recognize objects despite transformations that alter their appearance. The precise timing of when the brain computes neural representations that are invariant to particular transformations, however, has not been mapped in humans. Here we employ magnetoencephalography decoding analysis to measure the dynamics of size- and position-invariant visual information development in the ventral visual stream. With this method we can read out the identity of objects beginning as early as 60 ms. Size- and position-invariant visual information appear around 125 ms and 150 ms, respectively, and both develop in stages, with invariance to smaller transformations arising before invariance to larger transformations. Additionally, the magnetoencephalography sensor activity localizes to neural sources that are in the most posterior occipital regions at the early decoding times and then move temporally as invariant information develops. These results provide previously unknown latencies for key stages of human-invariant object recognition, as well as new and compelling evidence for a feed-forward hierarchical model of invariant object recognition where invariance increases at each successive visual area along the ventral stream.

