When scenes speak louder than words: Verbal encoding does not mediate the relationship between scene meaning and visual attention
The complexity of the visual world requires that we constrain visual attention and prioritize some regions of a scene over others. The current study investigated whether verbal encoding processes influence how attention is allocated in scenes. Specifically, we asked whether the advantage of scene meaning over image salience in attentional guidance is modulated by verbal encoding, given that we often use language to encode visual information. Sixty subjects studied 30 scenes for 12 seconds each in preparation for a scene recall task. Thirty of the subjects engaged in a secondary articulatory suppression task (digit repetition) concurrently with scene viewing. Meaning maps and saliency maps were quantified for each of the 30 scenes. In both conditions, meaning explained more of the variance in visual attention than image salience did, particularly when the overlap between meaning and salience was controlled for. Based on these results, verbal encoding processes do not appear to modulate the relationship between scene meaning and visual attention, or to play a role in encoding scenes for later recall. Our findings suggest that semantic information in the scene steers the attentional ship, consistent with cognitive guidance theory.