2D and 3D Visual Attention for Computer Vision

3D Printing ◽  
2017 ◽  
pp. 75-118
Author(s):  
Vincent Ricordel ◽  
Junle Wang ◽  
Matthieu Perreira Da Silva ◽  
Patrick Le Callet

Visual attention is one of the most important mechanisms deployed in the human visual system (HVS) to reduce the amount of information that our brain needs to process. Increasing effort has been dedicated to the study of visual attention, and this chapter clarifies the advances achieved in computational modeling of visual attention. First, the concepts of visual attention, including the links between visual salience and visual importance, are detailed. The main characteristics of the HVS involved in the process of visual perception are also explained. Next we focus on eye-tracking, because of its role in evaluating the performance of the models. A complete state of the art in computational modeling of visual attention is then presented. Finally, the research works that extend some visual attention models to 3D by taking into account the impact of depth perception are explained and compared.
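
The bottom-up salience computation surveyed in this chapter can be illustrated with a minimal sketch. The example below is not any particular model from the literature: it approximates the classical center-surround contrast idea (as in Itti-Koch-style models) with simple box filters on a single intensity channel, and the window sizes are illustrative assumptions.

```python
import numpy as np

def center_surround_saliency(image, center=1, surround=4):
    """Toy bottom-up salience map: local (center) mean minus broader
    (surround) mean of intensity, rectified. Box filters stand in for
    the Gaussian pyramids used by classical center-surround models."""
    def box_mean(img, r):
        # mean over a (2r+1)x(2r+1) window via edge padding + summation
        padded = np.pad(img, r, mode="edge")
        out = np.zeros_like(img, dtype=float)
        h, w = img.shape
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                out += padded[r + dy:r + dy + h, r + dx:r + dx + w]
        return out / (2 * r + 1) ** 2

    c = box_mean(image.astype(float), center)
    s = box_mean(image.astype(float), surround)
    sal = np.maximum(c - s, 0.0)  # keep on-center responses only
    return sal / sal.max() if sal.max() > 0 else sal

# A bright blob on a dark background should be the most salient region.
img = np.zeros((32, 32))
img[14:18, 14:18] = 1.0
sal = center_surround_saliency(img)
peak = np.unravel_index(np.argmax(sal), sal.shape)
```

Real models add multiple feature channels (color, orientation), multiple scales, and normalization before combining the resulting conspicuity maps.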


Author(s):  
Wen-Han Zhu ◽  
Wei Sun ◽  
Xiong-Kuo Min ◽  
Guang-Tao Zhai ◽  
Xiao-Kang Yang

Objective image quality assessment (IQA) plays an important role in various visual communication systems, as it can automatically and efficiently predict the perceived quality of images. The human eye is the ultimate evaluator of visual experience, so modeling the human visual system (HVS) is a core issue for objective IQA and visual experience optimization. Traditional models based on black-box fitting have low interpretability and are difficult to use to guide experience optimization, while models based on physiological simulation are hard to integrate into practical visual communication services because of their high computational complexity. To bridge the gap between signal distortion and visual experience, in this paper we propose a novel perceptual no-reference (NR) IQA algorithm based on structural computational modeling of the HVS. Following the mechanisms of the human brain, we divide visual signal processing into a low-level visual layer, a middle-level visual layer and a high-level visual layer, which conduct pixel information processing, primitive information processing and global image information processing, respectively. Natural scene statistics (NSS) based features, deep features and free-energy based features are extracted from these three layers. Support vector regression (SVR) is employed to aggregate the features into the final quality prediction. Extensive experimental comparisons on three widely used benchmark IQA databases (LIVE, CSIQ and TID2013) demonstrate that the proposed metric is highly competitive with or outperforms state-of-the-art NR IQA measures.
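
The three-layer feature pipeline described above can be sketched in simplified form. Everything in the example below is a stand-in: the low-, middle- and high-level features are toy statistics loosely inspired by NSS, gradient primitives and entropy (not the paper's actual features), ridge regression replaces the SVR used in the paper, and the training labels are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

def low_level_features(img):
    # stand-in for NSS-style features: statistics of mean-subtracted,
    # contrast-normalized coefficients (hypothetical simplification)
    mu, sigma = img.mean(), img.std() + 1e-6
    mscn = (img - mu) / sigma
    return np.array([mscn.mean(), mscn.std(), np.abs(mscn).mean()])

def mid_level_features(img):
    # stand-in for primitive/deep features: gradient-magnitude statistics
    gy, gx = np.gradient(img)
    mag = np.hypot(gx, gy)
    return np.array([mag.mean(), mag.std()])

def high_level_features(img):
    # stand-in for free-energy features: global histogram entropy
    hist, _ = np.histogram(img, bins=16, range=(0, 1))
    p = hist / hist.sum()
    p = p[p > 0]
    return np.array([-(p * np.log(p)).sum()])

def extract(img):
    # concatenate the three layers' features, as the paper's pipeline does
    return np.concatenate([low_level_features(img),
                           mid_level_features(img),
                           high_level_features(img)])

# Ridge regression stands in for the SVR that maps features to a quality
# score; images and MOS labels here are synthetic placeholders.
X = np.stack([extract(rng.random((16, 16))) for _ in range(50)])
y = rng.random(50)
lam = 1e-3
w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
pred = X @ w
```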


2016 ◽  
Vol 24 (1) ◽  
pp. 143-182 ◽  
Author(s):  
Harith Al-Sahaf ◽  
Mengjie Zhang ◽  
Mark Johnston

In the computer vision and pattern recognition fields, image classification represents an important yet difficult task. It is a challenge to build effective computer models to replicate the remarkable ability of the human visual system, which relies on only one or a few instances to learn a completely new class or an object of a class. Recently we proposed two genetic programming (GP) methods, one-shot GP and compound-GP, that aim to evolve a program for the task of binary classification in images. The two methods are designed to use only one or a few instances per class to evolve the model. In this study, we investigate these two methods in terms of performance, robustness, and complexity of the evolved programs. We use ten data sets that vary in difficulty to evaluate these two methods. We also compare them with two other GP and six non-GP methods. The results show that one-shot GP and compound-GP outperform or achieve results comparable to competitor methods. Moreover, the features extracted by these two methods improve the performance of other classifiers with handcrafted features and those extracted by a recently developed GP-based method in most cases.


Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 132-132
Author(s):  
S Edelman ◽  
S Duvdevani-Bar

To recognise a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. It is possible to counter the influence of these factors by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Routine visual tasks, however, typically require not so much recognition as categorisation, that is, making sense of objects not seen before. Despite persistent practical difficulties, theorists in computer vision and visual perception traditionally favour the structural route to categorisation, according to which forming a description of a novel shape in terms of its parts and their spatial relationships is a prerequisite to the ability to categorise it. In contrast, we demonstrate that knowledge of instances of each of several representative categories can provide the necessary computational substrate for the categorisation of their new instances, as well as for the representation and processing of radically novel shapes not belonging to any of the familiar categories. The representational scheme underlying this approach, according to which objects are encoded by their similarities to entire reference shapes (S Edelman, 1997, Behavioral and Brain Sciences, in press), is computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.
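
The idea of encoding objects by their similarities to entire reference shapes can be sketched as follows. The feature vectors, the Gaussian similarity kernel, and the nearest-reference decision rule below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def shape_space_embedding(shape, references, bandwidth=1.0):
    """Encode a shape by its similarities to a set of reference shapes:
    a Gaussian kernel on feature-space distance, in the spirit of
    representation by similarity to entire reference objects."""
    d = np.linalg.norm(references - shape, axis=1)
    return np.exp(-(d ** 2) / (2 * bandwidth ** 2))

def categorize(shape, references, labels):
    # assign the category of the most similar reference instance
    sims = shape_space_embedding(shape, references)
    return labels[int(np.argmax(sims))]

# Toy "shapes" as 2-D feature vectors; two reference categories.
refs = np.array([[0.0, 0.0], [0.1, 0.1],   # category A instances
                 [1.0, 1.0], [0.9, 1.1]])  # category B instances
labels = ["A", "A", "B", "B"]
novel = np.array([0.05, 0.0])  # a new instance, never seen before
```

A radically novel shape would simply receive low similarities to all references, which is itself informative: the embedding still locates it relative to the familiar categories.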


Author(s):  
N. B. Behosh ◽  
I. B. Chornomydz ◽  
O. Ya. Zyatkovska

The article considers various aspects of the impact of computer monitors on the functioning of the human visual system. The substantial flow of information that a person's visual apparatus receives daily from computer screens is accompanied not only by asthenopia but also by objective changes in the visual system. The article also analyses the visual features and the factors that determine the occurrence of refractive changes in computer users.


Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 167 ◽  
Author(s):  
Dan Malowany ◽  
Hugo Guterman

Computer vision is currently one of the most exciting and rapidly evolving fields of science, and it affects numerous industries. Research and development breakthroughs, mainly in the field of convolutional neural networks (CNNs), have opened the way to unprecedented sensitivity and precision in object detection and recognition tasks. Nevertheless, findings in recent years on the sensitivity of neural networks to additive noise, lighting conditions, and the completeness of the training dataset indicate that this technology still lacks the robustness needed for the autonomous robotic industry. In an attempt to bring computer vision algorithms closer to the capabilities of a human operator, the mechanisms of the human visual system were analyzed in this work. Recent studies show that the mechanisms behind the recognition process in the human brain include the continuous generation of predictions based on prior knowledge of the world. These predictions enable rapid generation of contextual hypotheses that bias the outcome of the recognition process. This mechanism is especially advantageous in situations of uncertainty, when the visual input is ambiguous. In addition, the human visual system continuously updates its knowledge about the world based on the gaps between its predictions and the visual feedback. CNNs are feed-forward in nature and lack such top-down contextual attenuation mechanisms. As a result, although they process massive amounts of visual information during their operation, this information is not transformed into knowledge that can be used to generate contextual predictions and improve their performance. In this work, an architecture was designed that aims to integrate the concepts behind the top-down prediction and learning processes of the human visual system with state-of-the-art bottom-up object recognition models, e.g., deep CNNs. The work focuses on two mechanisms of the human visual system: anticipation-driven perception and reinforcement-driven learning. Imitating these top-down mechanisms, together with state-of-the-art bottom-up feed-forward algorithms, resulted in an accurate, robust, and continuously improving target recognition model.
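
A minimal sketch of how a top-down contextual prior can bias ambiguous bottom-up evidence, in the spirit of the architecture described above: class likelihoods derived from (stand-in) network logits are multiplied by a scene-context prior and renormalized. The class names, logit values, and prior values are invented for illustration and are not from the paper.

```python
import numpy as np

def softmax(z):
    # numerically stable softmax over class logits
    e = np.exp(z - z.max())
    return e / e.sum()

def contextual_recognition(bottom_up_logits, context_prior):
    """Bias ambiguous bottom-up evidence with a top-down contextual
    prior: multiply the class likelihoods by the prior and renormalize
    (a minimal Bayesian stand-in for the described mechanism)."""
    likelihood = softmax(bottom_up_logits)
    posterior = likelihood * context_prior
    return posterior / posterior.sum()

classes = ["car", "boat", "truck"]
# Ambiguous bottom-up evidence: "car" and "boat" are nearly tied.
logits = np.array([2.0, 1.9, 0.1])
# A hypothetical scene context (say, a harbor) favors boats.
prior = np.array([0.2, 0.7, 0.1])
posterior = contextual_recognition(logits, prior)
```

With this prior, the near-tie between the first two classes is resolved in favor of the contextually plausible one, which is exactly the kind of disambiguation the abstract attributes to prediction-driven perception.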


Author(s):  
Rajarshi Pal

Even the enormous processing capacity of the human brain is not enough to handle all the visual sensory information that falls upon the retina. Still, human beings can efficiently respond to external stimuli. Selective attention plays an important role here: it helps select only the pertinent portions of the viewed scene for further processing deeper in the brain. Computational modeling of this neuro-psychological phenomenon has the potential to enrich many computer vision tasks. A considerable body of research involving psychovisual experiments and computational models of attention has been carried out over the past few decades. This article compiles a good volume of these research efforts. It also discusses various aspects related to computational modeling of attention, such as the choice of features and the evaluation of these models.


2011 ◽  
Vol 82 (3) ◽  
pp. 299-309 ◽  
Author(s):  
Javier Silvestre-Blanes ◽  
Joaquin Berenguer-Sebastiá ◽  
Rubén Pérez-Lloréns ◽  
Ignacio Miralles ◽  
Jorge Moreno

The appearance of wrinkling in textile products after domestic washing and drying is currently measured and evaluated by comparing the fabric with standard replicas. This kind of evaluation has certain drawbacks, the most significant of which are its subjectivity and its limitations when used with garments. In this paper, we present an automated wrinkling evaluation system. The system can process fabrics as well as any type of garment, independent of size or pattern on the material. It allows us to label different parts of the garment; since different garment parts have different influence on human perception, this labeling enables the use of weighting to improve the correlation with the human visual system. The system has been tested with different garments, showing good performance and correlation with human perception.
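
The part-dependent weighting mentioned above can be sketched as a weighted aggregate of per-part wrinkling scores. The part names, weights, and scores below are hypothetical choices for illustration, not values from the paper.

```python
import numpy as np

def garment_wrinkle_score(part_scores, part_weights):
    """Weighted aggregate of per-part wrinkling scores; the weights
    reflect how strongly each labeled garment part is assumed to
    influence perceived wrinkling (all values illustrative)."""
    w = np.asarray([part_weights[p] for p in part_scores])
    s = np.asarray(list(part_scores.values()), dtype=float)
    return float((w * s).sum() / w.sum())

# Hypothetical labeling: the front panel is assumed to dominate perception.
scores = {"front": 3.0, "back": 4.5, "sleeves": 4.0}   # replica-grade scores
weights = {"front": 0.6, "back": 0.25, "sleeves": 0.15}
overall = garment_wrinkle_score(scores, weights)
```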


Perception ◽  
1982 ◽  
Vol 11 (3) ◽  
pp. 337-346 ◽  
Author(s):  
Leon N Piotrowski ◽  
Fergus W Campbell

To establish how little information the human visual system requires for recognition, common objects were digitally manipulated in the Fourier domain. The results demonstrate that it is not only possible, but also quite efficient, for a (biological) visual system to exist with very few phase relationships among the component spatial frequencies of the (retinal) image. A visual example is then presented which illustrates how certain phase relationships can hinder, or completely eliminate, the recognition of visual scenes.
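
The phase-reduction manipulation described in this abstract can be reproduced in a few lines: keep the amplitude spectrum of an image but quantize each Fourier phase to a small number of values, then invert the transform. The test image here is random noise and the quantization levels are arbitrary choices for illustration.

```python
import numpy as np

def quantize_phase(image, levels):
    """Keep the amplitude spectrum but quantize each Fourier phase to
    one of `levels` values, then invert - a digital analogue of the
    phase-reduction manipulation described in the abstract."""
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    step = 2 * np.pi / levels
    quantized = np.round(phase / step) * step
    out = np.fft.ifft2(amplitude * np.exp(1j * quantized))
    return np.real(out)  # imaginary residue is numerical noise

rng = np.random.default_rng(1)
img = rng.random((32, 32))
coarse = quantize_phase(img, levels=4)   # very few distinct phase values
fine = quantize_phase(img, levels=4096)  # near-original phases
```

With many levels the reconstruction is nearly identical to the original; with very few levels the image degrades, which is the regime the authors use to probe how little phase information recognition requires.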

