visual tasks
Recently Published Documents

TOTAL DOCUMENTS: 450 (FIVE YEARS: 153)
H-INDEX: 32 (FIVE YEARS: 7)

Author(s): Cristina Romero-González, Ismael García-Varea, Jesus Martínez-Gómez

Abstract: Many of the research problems in robot vision involve the detection of keypoints (areas with salient information in the input images) and the generation of local descriptors that encode relevant information for such keypoints. Computer vision solutions have recently relied on deep learning techniques, which make extensive use of the available computational capabilities. In autonomous robots, these capabilities are usually limited and, consequently, images cannot be processed adequately. For this reason, some robot vision tasks still benefit from a more classic approach based on keypoint detectors and local descriptors. In 2D images, binary representations for visual tasks have shown that, with lower computational requirements, they can achieve performance comparable to classic real-valued techniques. However, these achievements have not been fully translated to 3D images, where research mainly focuses on real-valued approaches. Thus, in this paper, we propose a keypoint detector and local descriptor based on 3D binary patterns. The experimentation demonstrates that our proposal is competitive with state-of-the-art techniques, while its processing can be performed more efficiently.
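As an illustration of why binary descriptors are cheap to match (this is not the paper's 3D detector or descriptor, which the abstract does not specify), the sketch below matches packed binary descriptors by Hamming distance, computed with XOR and a popcount. All descriptor values are invented for the example:

```python
def hamming_distance(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors packed as integers."""
    return bin(a ^ b).count("1")

def match_descriptor(query: int, database: list) -> int:
    """Index of the database descriptor with the smallest Hamming distance to the query."""
    return min(range(len(database)), key=lambda i: hamming_distance(query, database[i]))

# Three toy 8-bit binary descriptors and a query.
db = [0b10110010, 0b01001101, 0b10110011]
q = 0b10110110
best = match_descriptor(q, db)  # db[0] differs from q in only one bit
```

A single XOR plus popcount replaces the floating-point subtractions and multiplications of a Euclidean match, which is the source of the efficiency gain the abstract refers to.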


2022, Vol 12 (1), pp. 87
Author(s): Conrad Perry, Heidi Long

This critical review examined current issues concerning the role of visual attention in reading. To do this, we searched for and reviewed 18 recent articles, including all that we found published after 2019 and using a Latin alphabet. Inspection of these articles showed that the Visual Attention Span task was run a number of times in well-controlled studies and was typically a small but significant predictor of reading ability, even after potential covariation with phonological effects was accounted for. A number of other tasks were used to examine different aspects of visual attention, and differences between dyslexic readers and controls were typically found. However, most of these studies did not adequately control for phonological effects, and of those that did, only very weak and non-significant results were found. Furthermore, in the smaller studies, separate within-group correlations between the tasks and reading performance were generally not provided, making causal effects of the manipulations difficult to ascertain. Overall, it seems reasonable to suggest that understanding how and why different types of visual tasks affect particular aspects of reading performance is an important area for future research.
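The covariate control described here (accounting for covariation with phonological effects) is commonly implemented as a partial correlation. The sketch below illustrates the idea on synthetic data with invented variable names and effect sizes; it is not the reviewed studies' analysis:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after regressing the covariate z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    rx = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    ry = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(0)
n = 1000
phono = rng.normal(size=n)                      # phonological skill (covariate)
vas = 0.4 * phono + rng.normal(size=n)          # visual attention span score
reading = 1.0 * phono + 0.3 * vas + rng.normal(size=n)

r_raw = np.corrcoef(vas, reading)[0, 1]         # inflated by the shared covariate
r_partial = partial_corr(vas, reading, phono)   # smaller, but still positive
```

The pattern the review reports, a small but significant predictor surviving the control, corresponds to `r_partial` remaining above zero after the covariate is removed.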


Author(s): Vlad Atanasiu, Isabelle Marthot-Santaniello

Abstract: This article develops theoretical, algorithmic, perceptual, and interaction aspects of script legibility enhancement in the visible light spectrum for the purpose of scholarly editing of papyrus texts. Novel legibility enhancement algorithms based on color processing and visual illusions are compared to classic methods in a user experience experiment. (1) The proposed methods outperformed the comparison methods. (2) Users exhibited a broad behavioral spectrum, under the influence of factors such as personality and social conditioning, tasks and application domains, expertise level and image quality, and the affordances of software, hardware, and interfaces. No single enhancement method satisfied all factor configurations. Therefore, we suggest offering users a broad choice of methods to facilitate personalization, contextualization, and complementarity. (3) A distinction is made between casual and critical vision on the basis of signal ambiguity and error consequences. The criteria of a paradigm for enhancing images for critical applications comprise: interpreting images skeptically; approaching enhancement as a system problem; considering all image structures as potential information; and making uncertainty and alternative interpretations explicit, both visually and numerically.


2021, Vol 15
Author(s): Yajun Zhou, Li Hu, Tianyou Yu, Yuanqing Li

Covert attention aids us in monitoring the environment and optimizing performance in visual tasks. Past behavioral studies have shown that covert attention can enhance spatial resolution. However, the electroencephalography (EEG) activity related to neural processing between central and peripheral vision has not been systematically investigated. Here, we conducted an EEG study with 25 subjects who performed covert attentional tasks at different retinal eccentricities ranging from 0.75° to 13.90°, as well as tasks involving overt attention and no attention. EEG signals were recorded with a single stimulus frequency to evoke steady-state visual evoked potentials (SSVEPs) for attention evaluation. We found that the SSVEP response at the attended location was generally negatively correlated with stimulus eccentricity, whether characterized by Euclidean distance or by horizontal and vertical distance. Moreover, SSVEP characteristics were more pronounced for overt attention than for covert attention. Furthermore, offline classification of overt attention, covert attention, and no attention yielded an average accuracy of 91.42%. This work contributes to our understanding of the SSVEP representation of attention in humans and may also lead to brain-computer interfaces (BCIs) that allow people to communicate choices simply by shifting their attention to them.
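The core SSVEP measurement, the response amplitude at the single stimulus frequency, can be sketched as reading one bin of the amplitude spectrum. The example below runs on synthetic data and is an illustration, not the authors' processing pipeline; sampling rate and stimulus frequency are invented:

```python
import numpy as np

def ssvep_amplitude(signal, fs, f_stim):
    """Amplitude spectrum value at the FFT bin nearest the stimulus frequency."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal) * 2  # amplitude normalization
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - f_stim))]

fs, f_stim, dur = 250.0, 12.0, 4.0              # Hz, Hz, seconds (all assumed)
t = np.arange(int(fs * dur)) / fs
rng = np.random.default_rng(1)
# Synthetic "EEG": a 12 Hz SSVEP of amplitude 2 buried in Gaussian noise.
eeg = 2.0 * np.sin(2 * np.pi * f_stim * t) + 0.5 * rng.normal(size=t.size)

amp_at_stim = ssvep_amplitude(eeg, fs, f_stim)   # close to the true amplitude 2.0
amp_off = ssvep_amplitude(eeg, fs, 30.0)         # noise floor at a non-stimulus bin
```

An attention score built this way, larger when the evoked response at the tagged frequency is stronger, is the kind of feature an offline classifier of overt/covert/no attention could use.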


2021
Author(s): Georgy Boos, Vladimir Budak, Ekaterina Ilyina, Tatyana Meshkova

Currently, lighting-calculation programs based on computer graphics (CG) allow us to move to a fundamentally new approach to assessing the quality of lighting. Design based on illuminance can be complemented with design based on synthetic images, that is, on lighting design. Modern CG programs can calculate the spatial-angular distribution of luminance (SADL). However, to assess the quality of lighting using SADL, a new criterion is needed. This paper considers constructing a physiological model of the visual perception scale, based on experimental data and a neural network, for a simple scene: a light source viewed on a uniform background at different luminance levels. A scale based on the threshold luminance contrasts for each sensation can be the foundation of the new criterion. The article offers a method for constructing a map for each sensation, using the example of «discomfort» and «unpleasant» maps, that can easily be applied in programs.
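A sensation scale built on threshold luminance contrasts can be sketched as below. The threshold values here are invented for illustration; in the paper they come from the experiments and the neural-network fit, and the contrast measure itself may differ:

```python
def weber_contrast(l_source, l_background):
    """Weber contrast of a light source against a uniform background (luminances in cd/m^2)."""
    return (l_source - l_background) / l_background

# Hypothetical threshold contrasts separating sensation bands (highest first).
THRESHOLDS = [(50.0, "discomfort"), (10.0, "unpleasant"), (0.0, "acceptable")]

def sensation(l_source, l_background):
    """Classify the source/background pair into a sensation band by its contrast."""
    c = weber_contrast(l_source, l_background)
    for threshold, label in THRESHOLDS:
        if c >= threshold:
            return label
    return "imperceptible"

label = sensation(l_source=6000.0, l_background=100.0)  # high contrast -> "discomfort"
```

Evaluating such a classifier per direction of view would give exactly the kind of per-sensation map («discomfort», «unpleasant») the paper describes.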


2021, Vol 12 (1)
Author(s): Marije ter Wal, Juan Linde-Domingo, Julia Lifanov, Frédéric Roux, Luca D. Kolibius, ...

Abstract: Memory formation and reinstatement are thought to lock to the hippocampal theta rhythm, predicting that encoding and retrieval processes appear rhythmic themselves. Here, we show that rhythmicity can be observed in behavioral responses from memory tasks, where participants indicate, using button presses, the timing of encoding and recall of cue-object associative memories. We find no evidence for rhythmicity in button presses for visual tasks using the same stimuli, or for questions about already retrieved objects. The oscillations for correctly remembered trials center in the slow theta frequency range (1–5 Hz). Using intracranial EEG recordings, we show that the memory task induces temporally extended phase consistency in hippocampal local field potentials at slow theta frequencies, significantly more so for remembered than for forgotten trials, providing a potential mechanistic underpinning for the theta oscillations found in behavioral responses.
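Phase consistency across trials is commonly quantified as inter-trial phase coherence: the length of the mean unit phasor over trials, 0 for random phases and 1 for perfect locking. The sketch below illustrates the measure on synthetic phase samples, not the study's data:

```python
import numpy as np

def phase_consistency(phases):
    """Inter-trial phase consistency: length of the mean unit phasor (0 = random, 1 = locked)."""
    return np.abs(np.mean(np.exp(1j * np.asarray(phases))))

rng = np.random.default_rng(2)
locked_phases = rng.normal(loc=0.3, scale=0.2, size=100)   # clustered: "remembered"-like trials
random_phases = rng.uniform(0.0, 2 * np.pi, size=100)      # uniform: "forgotten"-like trials

c_locked = phase_consistency(locked_phases)   # near 1
c_random = phase_consistency(random_phases)   # near 0
```

The contrast reported in the abstract, greater phase consistency for remembered than forgotten trials, corresponds to `c_locked > c_random` when phases are extracted from the hippocampal field potential at slow theta frequencies.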


2021, Vol 4
Author(s): Alessandro Betti, Giuseppe Boccignone, Lapo Faggi, Marco Gori, Stefano Melacci

Symmetries, invariances, and conservation equations have always been an invaluable guide in science for modeling natural phenomena through simple yet effective relations. For instance, in computer vision, translation equivariance is typically a built-in property of the neural architectures used to solve visual tasks; networks with computational layers implementing this property are known as Convolutional Neural Networks (CNNs). This kind of mathematical symmetry, like many others studied recently, is typically generated by some underlying group of transformations (translations in the case of CNNs, rotations, etc.) and is particularly suitable for processing highly structured data, such as molecules or chemical compounds, which are known to possess those specific symmetries. When dealing with video streams, common built-in equivariances can handle only a small fraction of the broad spectrum of transformations encoded in the visual stimulus; therefore, the corresponding neural architectures have to resort to a huge amount of supervision to achieve good generalization. In this paper, we formulate a theory of the development of visual features based on the idea that movement itself provides trajectories on which to impose consistency. We introduce the principle of Material Point Invariance, which states that each visual feature is invariant with respect to the associated optical flow, so that features and their corresponding velocities form an indissoluble pair. We then discuss the interaction of features and velocities and show that certain motion invariance traits can be regarded as a generalization of the classical concept of affordance. These analyses of feature-velocity interactions and their invariance properties lead to a visual field theory that expresses the dynamical constraints of motion coherence and may lead to the discovery of the joint evolution of visual features along with the associated optical flows.
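Invariance of a feature along the optical flow amounts to a transport equation: the material derivative ∂f/∂t + v·∇f vanishes for a feature carried by the flow. The sketch below (my own illustration, not code from the paper) verifies this numerically for a 1-D Gaussian feature translating at constant velocity:

```python
import numpy as np

def material_derivative(frames, v, dx, dt):
    """df/dt + v * df/dx on a 1-D feature field; frames is (3, N) for times t-dt, t, t+dt."""
    dfdt = (frames[2] - frames[0]) / (2 * dt)   # central difference in time
    dfdx = np.gradient(frames[1], dx)           # central difference in space
    return dfdt + v * dfdx

# A Gaussian feature blob translating at constant velocity v: f(x, t) = g(x - v*t).
v, dx, dt = 1.5, 0.01, 0.001
x = np.arange(0.0, 10.0, dx)
frames = np.stack([np.exp(-((x - 5.0 - v * t) ** 2)) for t in (-dt, 0.0, dt)])

residual = material_derivative(frames, v, dx, dt)  # ~0 everywhere: the feature is flow-invariant
```

The residual stays at discretization-error level even though the time derivative alone is large, which is the numerical content of the invariance: the feature changes in time exactly as the flow transports it.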


Micromachines, 2021, Vol 12 (12), pp. 1458
Author(s): Yanpeng Sun, Zhanyou Chang, Yong Zhao, Zhengxu Hua, Sirui Li

At night, visual quality is reduced due to insufficient illumination, making it difficult to conduct high-level visual tasks effectively. Existing image enhancement methods focus only on brightness improvement, yet improving image quality in low-light environments remains a challenging task. To overcome the limitations of existing algorithms with insufficient enhancement, a progressive two-stage image enhancement network is proposed in this paper. The low-light image enhancement problem is divided into two stages: the first stage of the network extracts the multi-scale features of the image through an encoder-decoder structure, and the second stage refines the result of the first to further improve output brightness. Experimental results and data analysis show that our method achieves state-of-the-art performance on synthetic and real data sets, and is superior to other approaches both subjectively and objectively.
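The network itself is not reproduced in the abstract. As a loose, hypothetical analogue of the two-stage idea, the sketch below brightens globally in a first stage (gamma correction) and refines contrast in a second; every function and parameter value here is illustrative, not the paper's architecture:

```python
import numpy as np

def stage1_brighten(img, gamma=0.5):
    """Stage 1: global gamma correction lifts dark regions (img values in [0, 1])."""
    return np.clip(img, 0.0, 1.0) ** gamma

def stage2_refine(img, strength=0.8):
    """Stage 2: simple contrast stretch around the mean to refine the coarse result."""
    mean = img.mean()
    return np.clip(mean + (img - mean) * (1.0 + strength), 0.0, 1.0)

def enhance(img):
    """Progressive two-stage enhancement: brighten, then refine."""
    return stage2_refine(stage1_brighten(img))

low_light = np.array([[0.02, 0.05], [0.10, 0.20]])  # toy 2x2 "night" image
out = enhance(low_light)
```

The staging mirrors the paper's structure, a coarse enhancement followed by a refinement pass, while the actual method replaces both hand-written stages with learned encoder-decoder and refinement networks.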


eLife, 2021, Vol 10
Author(s): Georgin Jacob, Harish Katti, Thomas Cherian, Jhilik Das, KA Zhivago, ...

Macaque monkeys are widely used to study vision. In the traditional approach, monkeys are brought into a lab to perform visual tasks while they are restrained to obtain stable eye tracking and neural recordings. Here, we describe a novel environment to study visual cognition in a more natural setting as well as other natural and social behaviors. We designed a naturalistic environment with an integrated touchscreen workstation that enables high-quality eye tracking in unrestrained monkeys. We used this environment to train monkeys on a challenging same-different task. We also show that this environment can reveal interesting novel social behaviors. As proof of concept, we show that two naïve monkeys were able to learn this complex task through a combination of socially observing trained monkeys and through solo trial-and-error. We propose that such naturalistic environments can be used to rigorously study visual cognition as well as other natural and social behaviors in freely moving monkeys.


2021
Author(s): Ibrahim Mohammad Hussain Rahman

<p>The human visual attention system (HVA) encompasses a set of interconnected neurological modules that are responsible for analyzing visual stimuli by attending to those regions that are salient. Two contrasting biological mechanisms exist in HVA systems: bottom-up, data-driven attention and top-down, task-driven attention. The former is mostly responsible for low-level instinctive behaviors, while the latter is responsible for performing complex visual tasks such as target object detection.

Very few computational models have been proposed to model top-down attention, mainly for three reasons. First, the functionality of the top-down process involves many influential factors. Second, top-down responses differ from task to task. Finally, many biological aspects of the top-down process are not yet well understood. For these reasons, it is difficult to come up with a generalized top-down model that could be applied to all high-level visual tasks. Instead, this thesis addresses some outstanding issues in modelling top-down attention for one particular task: target object detection. Target object detection is an essential step in analyzing images before performing more complex visual tasks. It has not been investigated thoroughly when modelling top-down saliency and hence constitutes the main application domain for this thesis.

The thesis investigates methods to model top-down attention through various kinds of high-level data acquired from images. Furthermore, it investigates different strategies to dynamically combine bottom-up and top-down processes to improve detection accuracy, as well as the computational efficiency of existing and new visual attention models. The following techniques and approaches are proposed to address the outstanding issues in modelling top-down saliency:

1. A top-down saliency model that weights low-level attentional features through contextual knowledge of a scene. The proposed model assigns weights to the features of a novel image by extracting a contextual descriptor of the image. The contextual descriptor tunes the weighting of low-level features to maximize detection accuracy. By incorporating context into the feature weighting mechanism, we improve the quality of the weights assigned to these features.

2. Two modules of target features combined with contextual weighting to improve detection accuracy for the target object. In this proposed model, two sets of attentional feature weights are learned, one through context and the other through target features. When both sources of knowledge are used to model top-down attention, a drastic increase in detection accuracy is achieved in images with complex backgrounds and a variety of target objects.

3. A model for combining top-down and bottom-up attention based on feature interaction. This model provides a dynamic way of combining both processes by formulating the problem as feature selection. The feature selection exploits the interaction between features, yielding a robust set that maximizes both the detection accuracy and the overall efficiency of the system.

4. A feature map quality score estimation model that accurately predicts the detection accuracy score of any novel feature map without the need for ground-truth data. The model extracts various local, global, geometrical, and statistical characteristics from a feature map. These characteristics guide a regression model that estimates the quality of a novel map.

5. A dynamic feature integration framework for combining bottom-up and top-down saliencies at runtime. If the estimation model can accurately predict the quality score of any novel feature map, then dynamic feature map integration can be performed based on the estimated value. We propose two frameworks for feature map integration using the estimation model. The proposed integration framework achieves higher human fixation prediction accuracy with fewer feature maps than combining all feature maps.

The work proposed in this thesis provides new directions in modelling top-down saliency for target object detection. In addition, the dynamic approaches for combining top-down and bottom-up attention show considerable improvements over existing approaches in both efficiency and accuracy.</p>
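The dynamic integration idea can be sketched as quality-weighted map combination: each feature map contributes in proportion to its estimated quality score. The quality scores below are hard-coded for illustration; in the thesis they come from the learned regression model:

```python
import numpy as np

def integrate_maps(feature_maps, quality_scores):
    """Combine saliency feature maps, weighting each by its estimated quality score."""
    w = np.asarray(quality_scores, dtype=float)
    w = w / w.sum()  # normalize so the weights sum to 1
    return sum(wi * m for wi, m in zip(w, feature_maps))

# Two toy 2x2 feature maps that disagree on where the salient location is.
maps = [np.array([[1.0, 0.0], [0.0, 0.0]]),
        np.array([[0.0, 1.0], [0.0, 0.0]])]
scores = [0.9, 0.1]  # the estimator judges the first map far more reliable

combined = integrate_maps(maps, scores)  # dominated by the higher-quality map
```

Because a low-scoring map is down-weighted rather than included at full strength, the combined map can beat the all-maps-equal baseline while using fewer maps, which is the behavior the abstract reports.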

