When scenes speak louder than words: Verbal encoding does not mediate the relationship between scene meaning and visual attention

2019 ◽  
Author(s):  
Gwendolyn L Rehrig ◽  
Taylor Hayes ◽  
John M. Henderson ◽  
Fernanda Ferreira

The complexity of the visual world requires that we constrain visual attention and prioritize some regions of the scene for attention over others. The current study investigated whether verbal encoding processes influence how attention is allocated in scenes. Specifically, we asked whether the advantage of scene meaning over image salience in attentional guidance is modulated by verbal encoding, given that we often use language to process information. Sixty subjects studied 30 scenes for 12 seconds each in preparation for a scene recall task. Thirty of the subjects engaged in a secondary articulatory suppression task (digit repetition) concurrent with scene viewing. Meaning and saliency maps were quantified for each of the 30 scenes. In both conditions we found that meaning explained more of the variance in visual attention than image salience did, particularly when we controlled for the overlap between meaning and salience. Based on these results, verbal encoding processes do not appear to modulate the relationship between scene meaning and visual attention, or to play a role in encoding scenes for later recall. Our findings suggest that semantic information in the scene steers the attentional ship, consistent with cognitive guidance theory.
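To make the map-based comparison concrete, the following is a minimal sketch (not the authors' code) of how the variance explained by a meaning map versus a saliency map can be computed against a fixation-density map, including a semipartial term that controls for the overlap between the two predictors. The array names, sizes, and random data are illustrative assumptions.

```python
# Minimal sketch of the map-based analysis described above: how much variance
# in an attention (fixation-density) map is explained by a meaning map versus
# a saliency map, with and without controlling for their overlap.
# Map names and shapes are assumptions, not the authors' data.
import numpy as np

def squared_correlation(x, y):
    """Variance in y explained by x (simple linear correlation, squared)."""
    r = np.corrcoef(x.ravel(), y.ravel())[0, 1]
    return r ** 2

def semipartial_r2(x, y, control):
    """Variance in y uniquely explained by x after removing the part of x
    shared with the control map (squared semipartial correlation)."""
    x, y, control = x.ravel(), y.ravel(), control.ravel()
    beta = np.polyfit(control, x, deg=1)      # residualize x on the control map
    x_resid = x - np.polyval(beta, control)
    r = np.corrcoef(x_resid, y)[0, 1]
    return r ** 2

# Hypothetical maps for one scene (all three at the same spatial resolution).
rng = np.random.default_rng(0)
meaning_map = rng.random((600, 800))
saliency_map = rng.random((600, 800))
attention_map = rng.random((600, 800))      # fixation density from eye tracking

print("meaning  R^2:", squared_correlation(meaning_map, attention_map))
print("saliency R^2:", squared_correlation(saliency_map, attention_map))
print("unique meaning  R^2:", semipartial_r2(meaning_map, attention_map, saliency_map))
print("unique saliency R^2:", semipartial_r2(saliency_map, attention_map, meaning_map))
```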

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2170 ◽  
Author(s):  
Yuya Moroto ◽  
Keisuke Maeda ◽  
Takahiro Ogawa ◽  
Miki Haseyama

A few-shot personalized saliency prediction method based on adaptive image selection, considering objects and visual attention, is presented in this paper. Because general methods for predicting personalized saliency maps (PSMs) require a large number of training images, an approach that works with only a small number of training images is needed. Finding persons whose visual attention is similar to that of a target person is an effective way to tackle this problem, but it requires all persons to gaze at many images in common, which is impractical given the burden it places on them. This paper therefore introduces a novel adaptive image selection (AIS) scheme that focuses on the relationship between human visual attention and the objects in images. AIS considers both the diversity of objects across images and the variance of the PSMs for those objects. Specifically, AIS selects images containing various kinds of objects so as to maintain diversity, and it favors images whose PSMs have high variance across persons, since such images reveal the regions that many persons commonly gaze at or ignore. By selecting images with high diversity and variance, the proposed method makes it possible to identify similar users from a small number of images; this is the technical contribution of the paper. Experimental results show the effectiveness of our personalized saliency prediction, including the new image selection scheme.
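As a rough illustration of the AIS idea, the sketch below greedily selects images that add new object categories (diversity) and whose PSMs vary strongly across persons (variance). The data structure, scoring weights, and greedy strategy are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch of adaptive image selection (AIS): choose a small set of
# images whose objects are diverse and whose personalized saliency maps (PSMs)
# differ strongly across persons. All fields and weights are hypothetical.
import numpy as np

def select_images(candidates, num_to_select, diversity_weight=1.0, variance_weight=1.0):
    """Greedily pick images that add new object categories (diversity) and
    whose PSMs vary most across persons (variance)."""
    selected, covered_objects = [], set()
    remaining = list(candidates)
    for _ in range(num_to_select):
        def score(img):
            new_objects = len(set(img["objects"]) - covered_objects)
            # Per-pixel variance of the PSMs across persons, averaged over pixels.
            psm_variance = np.var(np.stack(img["psms"]), axis=0).mean()
            return diversity_weight * new_objects + variance_weight * psm_variance
        best = max(remaining, key=score)
        selected.append(best)
        covered_objects.update(best["objects"])
        remaining.remove(best)
    return selected

# Hypothetical candidate pool: each image has detected object labels and one
# PSM (2D array) per person who viewed it.
rng = np.random.default_rng(1)
pool = [
    {"name": f"img_{i}",
     "objects": rng.choice(["person", "car", "dog", "sign", "tree"], size=3).tolist(),
     "psms": [rng.random((32, 32)) for _ in range(5)]}
    for i in range(20)
]
chosen = select_images(pool, num_to_select=4)
print([img["name"] for img in chosen])
```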


2021 ◽  
Author(s):  
Gwendolyn L Rehrig ◽  
Madison Barker ◽  
Candace Elise Peacock ◽  
Taylor R. Hayes ◽  
John M. Henderson ◽  
...  

As we act on the world around us, our eyes seek out objects we plan to interact with. A growing body of evidence suggests that overt visual attention selects objects in the environment that could be interacted with, even when the task precludes physical interaction. Our previous work showed that objects affording grasping interactions influenced attention when static scenes depicted reachable spaces, and that attention was otherwise better explained by general meaning (Rehrig, Peacock, et al., 2021). Because grasping is but one of many object interactions, our previous work may have downplayed the influence of object affordances on attention. The current study investigated the relationship between overt visual attention and object affordances versus broadly construed semantic information in scenes as speakers describe possible actions. In addition to meaning and grasp maps—which capture informativeness and grasping object affordances in scenes, respectively—we introduce interact maps, which capture affordances more broadly. In a mixed-effects analysis of three eyetracking experiments, interact map values predicted fixated regions in all experiments, whereas there was no main effect of meaning, and grasp maps marginally predicted fixated locations only for scenes that depicted reachable spaces. Our findings suggest that speakers consistently allocate attention to scene regions that could be readily interacted with when describing the possible actions in a scene, while the other variants of semantic information tested (graspability and general meaning) have a compensatory or additive influence on attention. The current study clarifies the importance of object affordances in guiding visual attention in scenes.
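A simplified sketch of the kind of mixed-effects analysis described above is given below, using a linear mixed model with a random intercept per subject as a stand-in; the published analysis may use a different model family (e.g., logistic mixed effects), and all column names and simulated data are hypothetical.

```python
# Minimal sketch (not the authors' analysis code): predict whether a scene
# region was fixated from interact-, grasp-, and meaning-map values, with
# subjects as a grouping factor. Data and column names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 500
data = pd.DataFrame({
    "subject":  rng.integers(0, 30, size=n),   # participant ID
    "interact": rng.random(n),                 # interact-map value at region
    "grasp":    rng.random(n),                 # grasp-map value at region
    "meaning":  rng.random(n),                 # meaning-map value at region
})
# Simulated outcome: fixation driven mostly by the interact map, plus noise.
data["fixated"] = (data["interact"] + 0.1 * rng.standard_normal(n) > 0.5).astype(float)

# Linear mixed model with a random intercept per subject (simplified stand-in).
model = smf.mixedlm("fixated ~ interact + grasp + meaning", data, groups=data["subject"])
result = model.fit()
print(result.summary())
```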


Author(s):  
Steven Todd ◽  
Arthur F. Kramer

Earlier research has shown that a task-irrelevant sudden onset of an object will capture or draw an observer's visual attention to that object's location (e.g., Yantis & Jonides, 1984). In the four experiments reported here, we explore the question of whether task-irrelevant properties other than sudden onset may capture attention. Our results suggest that a uniquely colored or luminous object, as well as an irrelevant boundary, may indeed capture or guide attention, though apparently to a lesser degree than a sudden onset: it appears that the degree of attentional capture depends on the relative salience of the varied, irrelevant dimension. Whereas a sudden onset is very salient, a uniquely colored object, for example, is salient only relative to the other objects within view, both to the degree that it differs in hue from its neighbors and to the number of neighbors from which it differs. The relationship of these findings to work in the fields of visual momentum and visual scanning is noted.


1976 ◽  
Vol 43 (2) ◽  
pp. 555-561 ◽  
Author(s):  
Richard A. Wyrick ◽  
Vincent J. Tempone ◽  
Jack Capehart

The relationship between attention and incidental learning during discrimination training was studied in 30 children, aged 10 to 11. A polymetric eye-movement recorder measured direct visual attention. Consistent with previous findings, recall of incidental stimuli was greatest during the initial and terminal stages of intentional learning. Contrary to previous explanations, however, visual attention to incidental stimuli was not related to training. While individual differences in attention to incidental stimuli were predictive of recall, attention to incidental stimuli was not related to level of training. Results suggested that changes in higher order information processing rather than direct visual attention were responsible for the curvilinear learning of incidental stimuli during intentional training.


AI ◽  
2020 ◽  
Vol 1 (1) ◽  
pp. 117-140 ◽  
Author(s):  
Wu Hao ◽  
Jiao Menglin ◽  
Tian Guohui ◽  
Ma Qing ◽  
Liu Guoliang

To address the difficulty of characterizing environmental information when intelligent services are performed, knowledge graphs are used here to express that information. We design a kind of knowledge graph for environment representation, referred to as a robot knowledge graph (R-KG). The main purpose of an R-KG is to integrate the diverse semantic information in the environment while attending to relationships at the instance level; through this efficient knowledge organization, robots can fully understand their environment. The R-KG first integrates knowledge from different sources into a unified, standardized knowledge-graph representation. The deep logical relationships hidden in the knowledge graph are then explored: a knowledge reasoning model based on a Markov logic network is proposed to give the knowledge graph a self-developmental ability and to further enrich it. Finally, because the quality of the environment representation directly affects how efficiently robots perform services, the R-KG is verified by using it as a semantic map that a robot can use directly when performing intelligent services. The final results show that the R-KG can effectively express environmental information.
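To illustrate the instance-level organization and reasoning the abstract describes, the following is a conceptual sketch in which environment knowledge is stored as triples and weighted rules infer new facts, loosely standing in for the Markov logic network reasoning; all entities, relations, and weights are invented for illustration and are not the authors' system.

```python
# Conceptual sketch of the R-KG idea: instance-level triples describing the
# environment, plus weighted rules that infer new triples (a simple stand-in
# for Markov-logic-network reasoning). Entities, relations, and weights are
# illustrative assumptions.
from itertools import product

# Instance-level facts integrated from different sources.
triples = {
    ("cup_1", "is_a", "cup"),
    ("cup_1", "on", "table_2"),
    ("table_2", "is_a", "table"),
    ("table_2", "in", "kitchen"),
}

# Weighted chain rules: if X --first_rel--> Y and Y --second_rel--> Z,
# then infer X --new_rel--> Z with the given confidence weight.
rules = [
    (("on", "in"), "in", 0.9),   # on(X, Y) & in(Y, Z) -> in(X, Z)
]

def infer(triples, rules):
    inferred = set()
    for (first_rel, second_rel), new_rel, weight in rules:
        for (a, r1, b), (c, r2, d) in product(triples, triples):
            if r1 == first_rel and r2 == second_rel and b == c:
                inferred.add((a, new_rel, d, weight))
    return inferred

for fact in infer(triples, rules):
    print("inferred:", fact)   # e.g. ('cup_1', 'in', 'kitchen', 0.9)
```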


2009 ◽  
Vol 10 (1) ◽  
pp. 146-151 ◽  
Author(s):  
Daniel Memmert ◽  
Daniel J. Simons ◽  
Thorsten Grimme
