Neural Correlates of Divided Attention in Natural Scenes

2016 ◽  
Vol 28 (9) ◽  
pp. 1392-1405 ◽  
Author(s):  
Sabrina Fagioli ◽  
Emiliano Macaluso

Individuals are able to split attention between separate locations, but divided spatial attention incurs the additional requirement of monitoring multiple streams of information. Here, we investigated divided attention using photos of natural scenes, where the rapid categorization of familiar objects and prior knowledge about the likely positions of objects in the real world might affect the interplay between spatial and nonspatial factors. Sixteen participants underwent fMRI during an object detection task. They were presented with scenes containing either a person or a car, located on the left or right side of the photo. Participants monitored either one or both object categories, in one or both visual hemifields. First, we investigated the interplay between spatial and nonspatial attention by comparing conditions of divided attention between categories and/or locations. We then assessed the contribution of top–down processes versus stimulus-driven signals by separately testing the effects of divided attention in target and nontarget trials. The results revealed activation of a bilateral frontoparietal network when dividing attention between the two object categories versus attending to a single category, but no main effect of dividing attention between spatial locations. Within this network, the left dorsal premotor cortex and the left intraparietal sulcus were found to combine task- and stimulus-related signals. These regions showed maximal activation when participants monitored two categories at spatially separate locations and the scene included a nontarget object. We conclude that the dorsal frontoparietal cortex integrates top–down and bottom–up signals in the presence of distractors during divided attention in real-world scenes.
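To make the factorial logic of this design concrete, the following is a minimal sketch of the 2 × 2 contrasts the abstract describes (categories: one vs. two; locations: one vs. two), computed on hypothetical condition means; the condition labels and values are illustrative, not the authors' data.

```python
# Illustrative sketch only: 2x2 "divided attention" contrasts computed on
# hypothetical condition means (arbitrary units). Condition labels
# (cat1/cat2 = one/two monitored categories, loc1/loc2 = one/two monitored
# hemifields) are ours, not the authors'.

beta = {
    ("cat1", "loc1"): 0.10, ("cat1", "loc2"): 0.12,
    ("cat2", "loc1"): 0.35, ("cat2", "loc2"): 0.52,
}

# Main effect of dividing attention between categories (two vs. one)
me_category = (beta[("cat2", "loc1")] + beta[("cat2", "loc2")]) / 2 \
            - (beta[("cat1", "loc1")] + beta[("cat1", "loc2")]) / 2

# Main effect of dividing attention between locations (two vs. one)
me_location = (beta[("cat1", "loc2")] + beta[("cat2", "loc2")]) / 2 \
            - (beta[("cat1", "loc1")] + beta[("cat2", "loc1")]) / 2

# Interaction: is the cost of adding a location larger when two categories
# are monitored? A positive value mirrors the pattern reported above.
interaction = (beta[("cat2", "loc2")] - beta[("cat2", "loc1")]) \
            - (beta[("cat1", "loc2")] - beta[("cat1", "loc1")])

print(f"category main effect: {me_category:+.3f}")
print(f"location main effect: {me_location:+.3f}")
print(f"interaction:          {interaction:+.3f}")
```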

2005 ◽  
Vol 58 (5) ◽  
pp. 931-960 ◽  
Author(s):  
Benjamin W. Tatler ◽  
Iain D. Gilchrist ◽  
Michael F. Land

Object descriptions are extracted and retained across saccades when observers view natural scenes. We investigated whether particular object properties are encoded and how stable the resulting memories are. We tested immediate recall of multiple types of information from real-world scenes and from computer-presented images of the same scenes. The relationship between fixations and properties of object memory was investigated. Position information was encoded and accumulated from multiple fixations. In contrast, identity and colour were encoded but did not require direct fixation and did not accumulate. In the current experiments, participants were unable to recall any information about shape or relative distances between objects. In addition, where information was encoded we found differential patterns of stability. Data from viewing real scenes and images were highly consistent, with stronger effects in the real-world conditions. Our findings imply that object files are not dependent upon the encoding of any particular object property and so are robust to dynamic visual environments.


2015 ◽  
Vol 15 (12) ◽  
pp. 8 ◽
Author(s):  
Marius Catalin Iordan ◽  
Michelle Greene ◽  
Diane Beck ◽  
Li Fei-Fei

2021 ◽  
Author(s):  
Mo Shahdloo ◽  
Emin Çelik ◽  
Burcu A. Urgen ◽
Jack L. Gallant ◽  
Tolga Çukur

Object and action perception in cluttered dynamic natural scenes relies on efficient allocation of limited brain resources to prioritize the attended targets over distractors. It has been suggested that during visual search for objects, distributed semantic representation of hundreds of object categories is warped to expand the representation of targets. Yet, little is known about whether and where in the brain visual search for action categories modulates semantic representations. To address this fundamental question, we studied human brain activity recorded via functional magnetic resonance imaging while subjects viewed natural movies and searched for either communication or locomotion actions. We find that attention directed to action categories elicits tuning shifts that warp semantic representations broadly across neocortex, and that these shifts interact with intrinsic selectivity of cortical voxels for target actions. These results suggest that attention serves to facilitate task performance during social interactions by dynamically shifting semantic selectivity towards target actions, and that tuning shifts are a general feature of conceptual representations in the brain.
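As a rough illustration of what a "tuning shift" measurement can look like, here is a minimal sketch on simulated data: a single voxel's category-tuning vector, estimated under two attention conditions, is compared against a target template. The variables and the 0.6 shift strength are invented; this is not the authors' voxelwise modeling pipeline.

```python
# Minimal sketch (simulated data, not the authors' pipeline): quantify an
# attentional "tuning shift" for one voxel as the change in correlation
# between its category-tuning vector and a target template across tasks.
import numpy as np

rng = np.random.default_rng(0)
n_categories = 100                    # hypothetical action/object categories

target_template = rng.standard_normal(n_categories)  # target selectivity
baseline = rng.standard_normal(n_categories)         # intrinsic voxel tuning

# Invented assumption: attending to the target pulls tuning toward the
# template with strength 0.6; attending elsewhere leaves it at baseline.
tuning_attend_away = baseline
tuning_attend_target = baseline + 0.6 * target_template

def template_corr(tuning, template):
    """Pearson correlation between a tuning vector and a template."""
    t = tuning - tuning.mean()
    m = template - template.mean()
    return (t @ m) / (np.linalg.norm(t) * np.linalg.norm(m))

shift = template_corr(tuning_attend_target, target_template) \
      - template_corr(tuning_attend_away, target_template)
print(f"tuning shift toward target: {shift:+.3f}")  # positive = toward target
```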


2004 ◽  
Vol 01 (04) ◽  
pp. 345-356 ◽
Author(s):  
HYUNG-MIN PARK ◽  
JONG-HWAN LEE ◽  
TAESU KIM ◽  
UN-MIN BAE ◽  
BYUNG TAEK KIM ◽  
...  

An auditory model has been developed for an intelligent speech information acquisition system in real-world noisy environments. The mathematical model of the human auditory pathway consists of three components: nonlinear feature extraction from the cochlea to the auditory cortex, binaural processing at the superior olivary complex, and top-down attention from the higher brain to the cochlea. The feature extraction is based on information-theoretic sparse coding throughout the auditory pathway. Time-frequency masking is also incorporated as a model of lateral inhibition in both the time and frequency domains. The binaural processing is modeled as blind signal separation and adaptive noise canceling based on independent component analysis with hundreds of time delays for noisy reverberated signals. The top-down (TD) attention derives from the familiarity and/or importance of the sensory information, i.e. the sound, and a simple but efficient TD attention model has been developed based on the error backpropagation algorithm. The binaural processing and top-down attention are also combined for speech signals with heavy noise. This auditory model requires extensive computing, and special hardware has been developed for real-time applications. Experimental results demonstrate much better recognition performance in real-world noisy environments.
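As a toy illustration of the blind-source-separation idea behind the binaural stage, here is a minimal sketch using instantaneous ICA on a simulated two-microphone mixture; unlike the model above, it handles no time delays or reverberation, and the signals and mixing matrix are invented.

```python
# Toy sketch of the binaural blind-source-separation stage: instantaneous
# ICA on a simulated two-microphone mixture. The model described above
# additionally handles hundreds of time delays and reverberation.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)

# Two independent sources: a tone sweep standing in for speech, plus noise
s1 = np.sin(2 * np.pi * (200 + 300 * t) * t)
s2 = 0.5 * rng.standard_normal(t.size)
S = np.c_[s1, s2]

# Instantaneous mixing into two "ear" channels (invented mixing matrix)
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T

# Recover the sources up to permutation and scaling
ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)
print("recovered sources:", S_hat.shape)  # (8000, 2)
```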


2018 ◽  
Author(s):  
Guanghan Ning

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] The task of human pose estimation in natural scenes is to determine the precise pixel locations of body keypoints. It is very important for many high-level computer vision tasks, including action and activity recognition, human-computer interaction, motion capture, and animation. We cover two different approaches to this task: the top-down approach and the bottom-up approach. In the top-down approach, we propose a human tracking method called ROLO that localizes each person. We then propose a state-of-the-art single-person human pose estimator that predicts the body keypoints of each individual. In the bottom-up approach, we propose an efficient multi-person pose estimator with which we participated in a PoseTrack challenge [11]. On top of these, we propose to employ adversarial training to further boost the performance of the single-person human pose estimator while generating synthetic images. We also propose a novel PoSeg network that jointly estimates multi-person human poses and semantically segments the portraits of these persons at the pixel level. Lastly, we extend some of the proposed methods on human pose estimation and portrait segmentation to the task of human parsing, a more fine-grained computer vision perception of humans.
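To illustrate the structure of the top-down approach described here, the following is a generic sketch: localize each person first, then run a single-person keypoint estimator on each crop and map the keypoints back to image coordinates. The detector and estimator are hypothetical stand-ins, not ROLO or the author's networks.

```python
# Structural sketch of a generic top-down pipeline: person detection first,
# then per-person keypoint estimation. `detect_persons` and
# `estimate_keypoints` are hypothetical stand-ins, not ROLO or the
# author's networks.
from typing import List, Tuple
import numpy as np

Box = Tuple[int, int, int, int]          # (x, y, width, height)

def detect_persons(image: np.ndarray) -> List[Box]:
    """Hypothetical person detector/tracker (stage 1)."""
    h, w = image.shape[:2]
    return [(0, 0, w // 2, h)]           # dummy: one box over the left half

def estimate_keypoints(crop: np.ndarray) -> np.ndarray:
    """Hypothetical single-person estimator returning (K, 2) pixel coords."""
    k = 17                               # e.g., COCO-style keypoint count
    h, w = crop.shape[:2]
    return np.column_stack([np.full(k, w / 2), np.linspace(0, h - 1, k)])

def top_down_pose(image: np.ndarray) -> List[np.ndarray]:
    poses = []
    for x, y, w, h in detect_persons(image):
        kpts = estimate_keypoints(image[y:y + h, x:x + w])
        poses.append(kpts + np.array([x, y]))  # crop coords -> image coords
    return poses

print(top_down_pose(np.zeros((256, 192, 3)))[0].shape)  # (17, 2)
```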


2019 ◽  
Vol 31 (10) ◽  
pp. 1563-1572 ◽  
Author(s):  
Clayton Hickey ◽  
Daniele Pollicino ◽  
Giacomo Bertazzoli ◽  
Ludwig Barbaro

People are quicker to detect examples of real-world object categories in natural scenes than is predicted by classic attention theories. One explanation for this puzzle suggests that experience renders the visual system sensitive to midlevel features diagnosing target presence. These are detected without the need for spatial attention, much as occurs for targets defined by low-level features like color or orientation. The alternative is that naturalistic search relies on spatial attention but is highly efficient because global scene information can be used to quickly reject nontarget objects and locations. Here, we use ERPs to differentiate between these possibilities. Results show that hallmark evidence of ultrafast target detection in frontal brain activity is preceded by an index of spatially specific distractor suppression in visual cortex. Naturalistic search for heterogeneous targets therefore appears to rely on spatial operations that act on neural object representations, as predicted by classic attention theory. People appear able to rapidly reject nontarget objects and locations, consistent with the idea that global scene information is used to constrain naturalistic search and increase search efficiency.
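For readers unfamiliar with lateralized ERP indices, here is a minimal sketch on simulated data of the contralateral-minus-ipsilateral difference wave that such suppression indices are built from; the channels, timing, and amplitudes are invented, and this is not the authors' analysis.

```python
# Minimal sketch (simulated data, invented parameters): the contralateral-
# minus-ipsilateral difference wave underlying lateralized ERP indices of
# distractor suppression. Not the authors' analysis.
import numpy as np

rng = np.random.default_rng(0)
fs = 500                                  # sampling rate (Hz)
t = np.arange(-0.1, 0.5, 1 / fs)          # epoch: -100 to 500 ms
n_trials = 200

noise = lambda: 2.0 * rng.standard_normal((n_trials, t.size))

# Toy effect: a positive deflection ~200 ms only at electrodes
# contralateral to the distractor
bump = np.exp(-0.5 * ((t - 0.2) / 0.03) ** 2)
contra = noise() + 1.5 * bump
ipsi = noise()

erp_diff = contra.mean(axis=0) - ipsi.mean(axis=0)   # contra - ipsi ERP
window = (t >= 0.18) & (t <= 0.23)
print(f"mean lateralized amplitude, 180-230 ms: {erp_diff[window].mean():.2f} uV")
```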


2010 ◽  
Vol 22 (6) ◽  
pp. 1224-1234 ◽  
Author(s):  
Aaron M. Rutman ◽  
Wesley C. Clapp ◽  
James Z. Chadick ◽  
Adam Gazzaley

Selective attention confers a behavioral benefit on both perceptual and working memory (WM) performance, often attributed to top–down modulation of sensory neural processing. However, the direct relationship between early activity modulation in sensory cortices during selective encoding and subsequent WM performance has not been established. To explore the influence of selective attention on WM recognition, we used electroencephalography to study the temporal dynamics of top–down modulation in a selective, delayed-recognition paradigm. Participants were presented with overlapped, “double-exposed” images of faces and natural scenes, and were instructed to either remember the face or the scene while simultaneously ignoring the other stimulus. Here, we present evidence that the degree to which participants modulate the early P100 (97–129 msec) event-related potential during selective stimulus encoding significantly correlates with their subsequent WM recognition. These results contribute to our evolving understanding of the mechanistic overlap between attention and memory.
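As a sketch of the correlational logic (not the authors' pipeline), the following simulates per-participant P100 modulation indices and working-memory recognition scores and tests their association; all numbers are invented.

```python
# Illustrative sketch only (all numbers invented): correlate a per-subject
# P100 modulation index (attend minus ignore amplitude, 97-129 msec) with
# working-memory recognition accuracy.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_subjects = 20

p100_modulation = rng.normal(1.0, 0.4, n_subjects)       # microvolts
recognition_acc = 0.70 + 0.08 * p100_modulation \
                + rng.normal(0, 0.03, n_subjects)         # proportion correct

r, p = pearsonr(p100_modulation, recognition_acc)
print(f"r = {r:.2f}, p = {p:.3f}")
```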


2012 ◽  
Vol 50 (10) ◽  
pp. 2415-2425 ◽  
Author(s):  
Björn Machner ◽  
Michael Dorr ◽  
Andreas Sprenger ◽  
Janina von der Gablentz ◽  
Wolfgang Heide ◽  
...  

2017 ◽  
Vol 372 (1711) ◽  
pp. 20160055 ◽  
Author(s):  
Elizabeth M. Clerkin ◽  
Elizabeth Hart ◽  
James M. Rehg ◽  
Chen Yu ◽  
Linda B. Smith

We offer a new solution to the unsolved problem of how infants break into word learning based on the visual statistics of everyday infant-perspective scenes. Images from head-camera video captured by 8½- to 10½-month-old infants at 147 at-home mealtime events were analysed for the objects in view. The images were found to be highly cluttered, with many different objects in view. However, the frequency distribution of object categories was extremely right-skewed, such that a very small set of objects was pervasively present—a fact that may substantially reduce the problem of referential ambiguity. The statistical structure of objects in these infant egocentric scenes differs markedly from that in the training sets used in computational models and in experiments on statistical word-referent learning. Therefore, the results also indicate a need to re-examine current explanations of how infants break into word learning. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’.
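To illustrate the distributional point, here is a toy sketch in which object-category frequencies across simulated scenes follow a heavy-tailed (Zipf-like) distribution, so a handful of categories dominate; the sampling scheme and per-scene object count are invented, not the paper's data.

```python
# Toy sketch (invented data): a Zipf-like category distribution makes a
# handful of object categories pervasive across scenes, as described above.
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)
n_scenes, n_categories, objs_per_scene = 147, 50, 10

# Zipf-like sampling: probability falls off as 1/rank
probs = 1.0 / np.arange(1, n_categories + 1)
probs /= probs.sum()
labels = rng.choice(n_categories, size=(n_scenes, objs_per_scene), p=probs)

counts = Counter(labels.ravel().tolist())
top5_share = sum(c for _, c in counts.most_common(5)) / labels.size
print(f"top 5 of {n_categories} categories = {top5_share:.0%} of object tokens")
```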

