Transfer of Perceptual Learning From Local Stereopsis to Global Stereopsis in Adults With Amblyopia: A Preliminary Study

2021 ◽  
Vol 15 ◽  
Author(s):  
Adrien Chopin ◽  
Michael A. Silver ◽  
Yasha Sheynin ◽  
Jian Ding ◽  
Dennis Michael Levi

It has long been debated whether the analysis of global and local stereoscopic depth is performed by a single system or by separate systems. Global stereopsis requires the visual system to solve a complex binocular matching problem to obtain a coherent percept of depth. In contrast, local stereopsis requires only a simple matching of similar image features. In this preliminary study, we recruited five adults with amblyopia who lacked global stereopsis and trained them on a computerized local stereopsis depth task for an average of 12 h. Three out of five (60%) participants recovered fine global stereoscopic vision through training. Those who recovered global stereopsis reached a learning plateau more quickly on the local stereopsis task, and they tended to start the training with better initial local stereopsis performance, to improve more on local stereopsis with training, and to have less severe amblyopia. The transfer of learning from local stereopsis to global stereopsis is compatible with an interacting two-stage model.

Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5136
Author(s):  
Xiaoxin Fang ◽  
Qiwu Luo ◽  
Bingxing Zhou ◽  
Congcong Li ◽  
Lu Tian

Computer-vision-based surface defect detection for metal planar materials is a research hotspot in the metallurgical industry. The high standard of planar surface quality in metal manufacturing requires that the performance of automated visual inspection systems and their algorithms be constantly improved. This paper presents a comprehensive survey of both two-dimensional and three-dimensional surface defect detection technologies, based on a review of over 160 publications covering typical metal planar products: steel, aluminum, and copper plates and strips. According to their algorithmic properties and the image features they exploit, the existing two-dimensional methodologies are categorized into four groups: statistical, spectral, model-based, and machine-learning-based methods. According to the three-dimensional data acquisition technique, the three-dimensional technologies are divided into stereoscopic vision, photometric stereo, laser scanning, and structured light measurement methods. These classical algorithms and emerging methods are introduced, analyzed, and compared in this review. Finally, the remaining challenges and future research trends in visual defect detection are discussed and forecast.
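Of the four two-dimensional groups above, the spectral family lends itself to a compact illustration. Below is a minimal sketch (not drawn from any surveyed paper; the cutoff and threshold values are illustrative assumptions) of Fourier-based defect detection: suppress the low-frequency background texture and flag residual outliers as candidate defects.

```python
import numpy as np

def spectral_defect_map(image: np.ndarray, cutoff: int = 8, thresh: float = 3.0) -> np.ndarray:
    """Flag candidate defects in a grayscale surface image (2-D array)."""
    f = np.fft.fftshift(np.fft.fft2(image.astype(float)))
    h, w = image.shape
    cy, cx = h // 2, w // 2
    # Suppress the low frequencies that encode the regular surface background.
    f[cy - cutoff:cy + cutoff, cx - cutoff:cx + cutoff] = 0
    residual = np.abs(np.fft.ifft2(np.fft.ifftshift(f)))
    # Defects stand out as statistical outliers in the residual energy.
    return residual > residual.mean() + thresh * residual.std()
```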


2009 ◽  
Vol 26 (1) ◽  
pp. 81-92 ◽  
Author(s):  
CONSTANTIN A. ROTHKOPF ◽  
DANA H. BALLARD

Theories of efficient sensory processing have considered the regularities of image properties that arise from the structure of the environment in order to explain properties of neuronal representations of the visual world. The regularities imposed on the input to the visual system by the active selection process mediated by voluntary eye movements have been considered to a much lesser degree. This is surprising, given that the active nature of vision is well established. The present article investigates the statistics of image features at the center of gaze of human subjects navigating through a virtual environment while avoiding and approaching different objects. The analysis shows that contrast can be significantly higher or lower at the fixation location than at random locations, depending on whether subjects avoid or approach targets. Similarly, significant differences between horizontal and vertical orientations in the distribution of responses of model simple and complex cells are found over timescales of tens of seconds. By clustering the model simple cell responses, it is established that gaze was directed toward three distinct features of intermediate complexity for the vast majority of the time. Thus, this study demonstrates and quantifies how the visuomotor tasks of approaching and avoiding objects during navigation determine the feature statistics of the input to the visual system through their combined influence on body and eye movements.
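A minimal sketch of the core comparison described above, under assumptions of my own about the data (grayscale frames plus (x, y) gaze records; the patch size and the RMS-contrast definition are illustrative): contrast at fixated patches versus randomly sampled control patches.

```python
import numpy as np

def rms_contrast(patch: np.ndarray) -> float:
    """Root-mean-square contrast: std of intensity over mean intensity."""
    patch = patch.astype(float)
    return patch.std() / (patch.mean() + 1e-8)

def contrast_at_locations(frame: np.ndarray, points, half: int = 16) -> np.ndarray:
    """Sample RMS contrast in square patches centred on (x, y) points."""
    values = []
    for x, y in points:
        patch = frame[y - half:y + half, x - half:x + half]
        if patch.shape == (2 * half, 2 * half):  # skip points too near the border
            values.append(rms_contrast(patch))
    return np.array(values)

# fixated = contrast_at_locations(frame, fixation_points)
# control = contrast_at_locations(frame, random_points)
# Comparing the two distributions reproduces the contrast analysis above.
```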


2018 ◽  
Author(s):  
Yueyang Xu ◽  
Ashish Raj ◽  
Jonathan Victor

An important heuristic in developing image processing technologies is to mimic the computational strategies used by humans. Relevant to this, recent studies have shown that the human brain's processing strategy is closely matched to the characteristics of natural scenes, in terms of both global and local image statistics. However, structural MRI images and natural scenes have fundamental differences: the former are two-dimensional sections through a volume, the latter are projections. MRI image formation is also radically different from natural image formation, involving acquisition in Fourier space followed by several filtering and processing steps that all have the potential to alter image statistics. As a consequence, aspects of the human visual system that are finely tuned to processing natural scenes may not be equally well suited to MRI images, and identification of the differences between MRI images and natural scenes may lead to improved machine analysis of MRI.

With these considerations in mind, we analyzed the spectra and local image statistics of MRI images in several databases, including T1 and FLAIR sequence types, and of simulated MRI images,[1]–[6] and compared this analysis to a parallel analysis of natural images[7] and visual sensitivity[7][8]. We found substantial differences between the statistical features of MRI images and natural images. Power spectra of MRI images had a steeper slope than those of natural images, indicating a lack of scale invariance. Independent of this, local image statistics of MRI and natural images differed: compared to natural images, MRI images had smaller variations in their local two-point statistics and larger variations in their local three-point statistics, to which the human visual system is relatively insensitive. Our findings were consistent across MRI databases and simulated MRI images, suggesting that they result from brain geometry at the scale of MRI resolution rather than from the characteristics of specific imaging and reconstruction methods.
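A minimal sketch of the power-spectrum slope measurement described above (my own implementation, not the authors' code): radially average the 2-D power spectrum and fit a line in log-log coordinates. Natural images typically yield slopes near -2; steeper slopes, as reported here for MRI, indicate a departure from scale invariance.

```python
import numpy as np

def spectrum_slope(image: np.ndarray) -> float:
    """Slope of the radially averaged power spectrum in log-log space."""
    f = np.fft.fftshift(np.fft.fft2(image.astype(float)))
    power = np.abs(f) ** 2
    h, w = power.shape
    y, x = np.indices(power.shape)
    r = np.hypot(x - w // 2, y - h // 2).astype(int)  # integer radial frequency
    # Average power over annuli of constant radius.
    radial = np.bincount(r.ravel(), power.ravel()) / np.bincount(r.ravel())
    freqs = np.arange(1, min(h, w) // 2)  # skip DC, stay below Nyquist
    slope, _intercept = np.polyfit(np.log(freqs), np.log(radial[freqs]), 1)
    return slope
```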


Perception ◽  
1996 ◽  
Vol 25 (1_suppl) ◽  
pp. 61-61
Author(s):  
A Grigo ◽  
M Lappe

We investigated the influence of stereoscopic vision on the perception of optic flow fields in psychophysical experiments based on an illusory transformation reported by Duffy and Wurtz (1993, Vision Research, 33, 1481–1490). Human subjects are unable to correctly determine the centre of an expanding optic flow field if the expansion is transparently superimposed on a unidirectional motion pattern. Instead, its location is perceived as shifted in the direction of the translational movement. Duffy and Wurtz proposed that this illusory shift arises because the visual system interprets the presented pattern as a flow field composed of linear self-motion and an eye rotation. As a consequence, the centre of the expanding motion is determined by compensating for the simulated eye rotation, as in determining one's direction of heading (Lappe and Rauschecker, 1994, Vision Research, 35, 1619–1631). In our experiments we examined how the illusory transformation depends on differences in depth between the superimposed movements. We presented the expansional and translational stimuli at different relative binocular disparities. In the case of zero disparity, we confirmed the results of Duffy and Wurtz. For uncrossed disparities (ie translation behind expansion) we found a small, nonsignificant decrease of the illusory shift. In contrast, there was a strong decrease, of up to 80%, in the case of crossed disparity (ie translation in front of expansion). These findings support the interpretation of the motion pattern as a self-motion flow field: only in the unrealistic case of a large rotational component presented in front of an expansion are the superimposed movements interpreted separately by the visual system.
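A minimal sketch of how such a transparent stimulus can be constructed (all parameter values are illustrative assumptions, not the authors' settings): two independent dot populations, one carrying radial expansion and one carrying uniform translation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 150
foe = np.array([0.0, 0.0])                       # focus of expansion (deg)
expansion_dots = rng.uniform(-10, 10, (n, 2))    # positions of expanding dots
translation_dots = rng.uniform(-10, 10, (n, 2))  # positions of translating dots

k = 0.4                           # expansion rate (1/s), illustrative
t = np.array([3.0, 0.0])          # uniform rightward translation (deg/s)

v_expansion = k * (expansion_dots - foe)  # radial outward velocities
v_translation = np.tile(t, (n, 1))        # identical velocity for every dot
# Rendering the two populations at different binocular disparities
# (crossed vs uncrossed) is the depth manipulation tested above.
```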


Perception ◽  
1972 ◽  
Vol 1 (4) ◽  
pp. 483-490 ◽  
Author(s):  
J A Movshon ◽  
B E I Chambers ◽  
C Blakemore

Interocular transfer of the tilt aftereffect was investigated in normal humans with good stereopsis and in subjects without stereoscopic vision. These latter subjects were divided into two groups: those with and those without a history of strabismus. Strabismic subjects showed grossly reduced interocular transfer of the effect (12% mean transfer). Nonstrabismic subjects had moderate transfer (49%), and normal subjects showed approximately 70% mean transfer. All normal subjects showed greater transfer from the dominant eye to the nondominant eye than vice versa. The results are discussed with respect to developmental effects in the visual system of cats and humans, and the nature of the tilt aftereffect.


Author(s):  
Marinella Cadoni ◽  
Andrea Lagorio ◽  
Souad Khellat-Kihel ◽  
Enrico Grosso

Traditional local image descriptors such as SIFT and SURF are based on processing similar to that which takes place in the early visual cortex. Nowadays, convolutional neural networks still draw inspiration from the human visual system, integrating computational elements typical of higher visual cortical areas. Deep CNN architectures are intrinsically hard to interpret, so much effort has been made to dissect them in order to understand which types of features they learn. However, considering the resemblance to the human visual system, not enough attention has been devoted to understanding whether the image features learned by deep CNNs and used for classification correlate with the features that humans select when viewing images, the so-called human fixations, or whether they correlate with earlier handcrafted features such as SIFT and SURF. Exploring these correlations is highly meaningful, since what we require from CNNs, and from features in general, is to recognize and correctly classify objects or subjects relevant to humans. In this paper, we establish the correlation between three families of image interest points: human fixations, handcrafted features, and CNN features. We extract features from the feature maps of selected layers of several deep CNN architectures, from the shallowest to the deepest. All features and fixations are then compared with two types of measures, global and local, which unveil the degree of similarity between the areas of interest of the three families. Experiments carried out on the ETD human fixations database show that human fixations are positively correlated with handcrafted features, and even more so with deep layers of CNNs, and that handcrafted features correlate highly with one another, as do some CNNs.
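As a rough illustration of the global comparison described here (a sketch of my own, with an assumed data layout; scene.png and the fixation list are placeholders), one can correlate a smoothed human-fixation density map with a SIFT keypoint density map over the same image.

```python
import cv2
import numpy as np
from scipy.ndimage import gaussian_filter
from scipy.stats import pearsonr

def density_map(points, shape, sigma: float = 15.0) -> np.ndarray:
    """Accumulate (x, y) points into an image-sized map and smooth it."""
    m = np.zeros(shape, dtype=float)
    for x, y in points:
        m[int(y), int(x)] += 1.0
    return gaussian_filter(m, sigma)

image = cv2.imread("scene.png", cv2.IMREAD_GRAYSCALE)  # placeholder image
keypoints = cv2.SIFT_create().detect(image, None)
sift_map = density_map([kp.pt for kp in keypoints], image.shape)
# fixations: list of (x, y) gaze points from the eye-tracking database
# fixation_map = density_map(fixations, image.shape)
# r, _ = pearsonr(sift_map.ravel(), fixation_map.ravel())
```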


Sensors ◽  
2021 ◽  
Vol 22 (1) ◽  
pp. 285
Author(s):  
Jianqiang Xu ◽  
Haoyu Zhao ◽  
Weidong Min

An important area in a gathering place is a region that attracts the constant attention of people and has evident visual features, such as a flexible stage or an open-air show. Finding such areas can help security supervisors locate abnormal regions automatically. Existing related methods lack an efficient means of finding important area candidates in a scene and fail to judge whether or not a candidate attracts people's attention. To realize the detection of an important area, this study proposes a two-stage method with a novel multi-input attention network (MAN). The first stage, called important area candidate generation, aims to generate candidate important areas with an image-processing pipeline (i.e., K-means++, image dilation, median filtering, and the RLSA algorithm). The candidate areas can be selected automatically for further analysis. The second stage, called important area candidate classification, aims to detect important areas among the candidates with MAN. In particular, MAN is designed as a multi-input network structure that fuses global and local image features to judge whether or not an area attracts people's attention. To enhance the representation of candidate areas, two modules (i.e., channel attention and spatial attention modules) are proposed on the basis of the attention mechanism. These modules are mainly based on multi-layer perceptrons and pooling operations to reconstruct the image features and provide a considerably more efficient representation. This study also contributes a new dataset, called gathering place important area detection, for testing the proposed two-stage method. Lastly, experimental results show that the proposed method performs well and can correctly detect important areas.
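The attention modules are described above only at this level of detail, so the following is a hedged PyTorch sketch of a generic channel-attention block of that kind (pooling followed by a shared MLP that reweights channels), not the authors' MAN implementation.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Generic channel attention: pool each channel, score with a shared MLP."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, height, width) feature map.
        avg_score = self.mlp(x.mean(dim=(2, 3)))   # global average pooling
        max_score = self.mlp(x.amax(dim=(2, 3)))   # global max pooling
        weights = torch.sigmoid(avg_score + max_score).unsqueeze(-1).unsqueeze(-1)
        return x * weights  # reweight channels by learned attention
```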


2021 ◽  
Author(s):  
Ying Bi ◽  
Mengjie Zhang ◽  
Bing Xue

Feature extraction is an essential process in image classification. Existing feature extraction methods can extract important and discriminative image features but often require domain expertise and human intervention. Genetic Programming (GP) can automatically extract features that are more adaptive to different image classification tasks. However, the majority of GP-based methods extract only relatively simple features of one type (i.e., local or global), which are neither effective nor efficient for complex image classification. In this paper, a new GP method (GP-GLF) is proposed to achieve automatic and simultaneous global and local feature extraction for image classification. To extract discriminative image features, several effective and well-known feature extraction methods, such as HOG, SIFT, and LBP, are employed as GP functions in both global and local scenarios. A novel program structure is developed to allow GP-GLF to evolve descriptors that can synthesise feature vectors from the input image and from automatically detected regions using these functions. The performance of the proposed method is evaluated on four image classification data sets of varying difficulty and compared with seven GP-based methods and a set of non-GP methods. Experimental results show that the proposed method achieves significantly better or similar performance compared with almost all the peer methods. Further analysis of the evolved programs shows the good interpretability of the GP-GLF method.
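For a concrete sense of the feature functions named above, here is a minimal sketch (my own, not the GP-GLF implementation) that pairs a global HOG descriptor with a local uniform-LBP histogram from a detected region; the region format is an assumption.

```python
import numpy as np
from skimage.feature import hog, local_binary_pattern

def global_local_features(image: np.ndarray, region) -> np.ndarray:
    """image: 2-D grayscale array; region: (top, left, height, width)."""
    top, left, height, width = region
    patch = image[top:top + height, left:left + width]
    # Global descriptor: HOG over the whole image.
    global_vec = hog(image, pixels_per_cell=(16, 16), cells_per_block=(2, 2))
    # Local descriptor: uniform LBP histogram over the detected region.
    lbp = local_binary_pattern(patch, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)
    return np.concatenate([global_vec, hist])
```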


2008 ◽  
Vol 276 (1658) ◽  
pp. 861-869 ◽  
Author(s):  
Peter Neri

The human visual system is remarkably sensitive to stimuli conveying actions, for example, a fighting action between two agents. A central unresolved question is whether each agent is processed as a whole in one stage, or as subparts (e.g., limbs) that are assembled into an agent at a later stage. We measured the perceptual impact of perturbing an agent either by scrambling individual limbs while leaving the relationship between limbs unaffected or, conversely, by scrambling the relationship between limbs while leaving individual limbs unaffected. Our measurements differed for the two conditions, providing conclusive evidence against a one-stage model. The results were instead consistent with a two-stage processing pathway: an early bottom-up stage where local motion signals are integrated to reconstruct individual limbs (arms and legs), and a subsequent top-down stage where limbs are combined to represent whole agents.


1998 ◽  
Vol 21 (4) ◽  
pp. 467-468
Author(s):  
David R. Andresen ◽  
Chad J. Marsolek

The human visual system is capable of learning both abstract and specific mappings to underlie shape recognition. How could dissimilar shapes be mapped to the same location in visual representation space, yet similar shapes be mapped to different locations? Without fundamental changes, Chorus, like other single-system models, could not accomplish both mappings in a manner that accounts for recent evidence.

