Neural Correlates of Divided Attention in Natural Scenes

2016 ◽  
Vol 28 (9) ◽  
pp. 1392-1405 ◽  
Author(s):  
Sabrina Fagioli ◽  
Emiliano Macaluso

Individuals are able to split attention between separate locations, but divided spatial attention incurs the additional requirement of monitoring multiple streams of information. Here, we investigated divided attention using photos of natural scenes, where the rapid categorization of familiar objects and prior knowledge about the likely positions of objects in the real world might affect the interplay between spatial and nonspatial factors. Sixteen participants underwent fMRI during an object detection task. They were presented with scenes containing either a person or a car, located on the left or right side of the photo. Participants monitored either one or both object categories, in one or both visual hemifields. First, we investigated the interplay between spatial and nonspatial attention by comparing conditions of divided attention between categories and/or locations. We then assessed the contribution of top–down processes versus stimulus-driven signals by separately testing the effects of divided attention in target and nontarget trials. The results revealed activation of a bilateral frontoparietal network when dividing attention between the two object categories versus attending to a single category, but no main effect of dividing attention between spatial locations. Within this network, the left dorsal premotor cortex and the left intraparietal sulcus were found to combine task- and stimulus-related signals. These regions showed maximal activation when participants monitored two categories at spatially separate locations and the scene included a nontarget object. We conclude that the dorsal frontoparietal cortex integrates top–down and bottom–up signals in the presence of distractors during divided attention in real-world scenes.
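To make the factorial logic of this design concrete, the following is a minimal sketch of the 2 × 2 contrasts the abstract describes (categories: one vs. two; locations: one vs. two), computed on hypothetical condition means; the condition labels and values are illustrative, not the authors' data.

```python
# Illustrative sketch only: 2x2 "divided attention" contrasts computed on
# hypothetical condition means (arbitrary units). Condition labels
# (cat1/cat2 = one/two monitored categories, loc1/loc2 = one/two monitored
# hemifields) are ours, not the authors'.

beta = {
    ("cat1", "loc1"): 0.10, ("cat1", "loc2"): 0.12,
    ("cat2", "loc1"): 0.35, ("cat2", "loc2"): 0.52,
}

# Main effect of dividing attention between categories (two vs. one)
me_category = (beta[("cat2", "loc1")] + beta[("cat2", "loc2")]) / 2 \
            - (beta[("cat1", "loc1")] + beta[("cat1", "loc2")]) / 2

# Main effect of dividing attention between locations (two vs. one)
me_location = (beta[("cat1", "loc2")] + beta[("cat2", "loc2")]) / 2 \
            - (beta[("cat1", "loc1")] + beta[("cat2", "loc1")]) / 2

# Interaction: is the cost of adding a location larger when two categories
# are monitored? A positive value mirrors the pattern reported above.
interaction = (beta[("cat2", "loc2")] - beta[("cat2", "loc1")]) \
            - (beta[("cat1", "loc2")] - beta[("cat1", "loc1")])

print(f"category main effect: {me_category:+.3f}")
print(f"location main effect: {me_location:+.3f}")
print(f"interaction:          {interaction:+.3f}")
```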

2005 ◽  
Vol 58 (5) ◽  
pp. 931-960 ◽  
Author(s):  
Benjamin W. Tatler ◽  
Iain D. Gilchrist ◽  
Michael F. Land

Object descriptions are extracted and retained across saccades when observers view natural scenes. We investigated whether particular object properties are encoded and how stable the resulting memories are. We tested immediate recall of multiple types of information from real-world scenes and from computer-presented images of the same scenes. The relationship between fixations and properties of object memory was investigated. Position information was encoded and accumulated from multiple fixations. In contrast, identity and colour were encoded but did not require direct fixation and did not accumulate. In the current experiments, participants were unable to recall any information about shape or relative distances between objects. In addition, where information was encoded we found differential patterns of stability. Data from viewing real scenes and images were highly consistent, with stronger effects in the real-world conditions. Our findings imply that object files are not dependent upon the encoding of any particular object property and so are robust to dynamic visual environments.


2015 ◽  
Vol 15 (12) ◽  
pp. 8 ◽
Author(s):  
Marius Catalin Iordan ◽  
Michelle Greene ◽  
Diane Beck ◽  
Li Fei-Fei

2021 ◽  
Author(s):  
Mo Shahdloo ◽  
Emin Çelik ◽  
Burcu A. Urgen ◽
Jack L. Gallant ◽  
Tolga Çukur

Object and action perception in cluttered dynamic natural scenes relies on efficient allocation of limited brain resources to prioritize the attended targets over distractors. It has been suggested that during visual search for objects, distributed semantic representation of hundreds of object categories is warped to expand the representation of targets. Yet, little is known about whether and where in the brain visual search for action categories modulates semantic representations. To address this fundamental question, we studied human brain activity recorded via functional magnetic resonance imaging while subjects viewed natural movies and searched for either communication or locomotion actions. We find that attention directed to action categories elicits tuning shifts that warp semantic representations broadly across neocortex, and that these shifts interact with intrinsic selectivity of cortical voxels for target actions. These results suggest that attention serves to facilitate task performance during social interactions by dynamically shifting semantic selectivity towards target actions, and that tuning shifts are a general feature of conceptual representations in the brain.
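As a rough illustration of what a "tuning shift" measurement can look like, here is a minimal sketch on simulated data: a single voxel's category-tuning vector, estimated under two attention conditions, is compared against a target template. The variables and the 0.6 shift strength are invented; this is not the authors' voxelwise modeling pipeline.

```python
# Minimal sketch (simulated data, not the authors' pipeline): quantify an
# attentional "tuning shift" for one voxel as the change in correlation
# between its category-tuning vector and a target template across tasks.
import numpy as np

rng = np.random.default_rng(0)
n_categories = 100                    # hypothetical action/object categories

target_template = rng.standard_normal(n_categories)  # target selectivity
baseline = rng.standard_normal(n_categories)         # intrinsic voxel tuning

# Invented assumption: attending to the target pulls tuning toward the
# template with strength 0.6; attending elsewhere leaves it at baseline.
tuning_attend_away = baseline
tuning_attend_target = baseline + 0.6 * target_template

def template_corr(tuning, template):
    """Pearson correlation between a tuning vector and a template."""
    t = tuning - tuning.mean()
    m = template - template.mean()
    return (t @ m) / (np.linalg.norm(t) * np.linalg.norm(m))

shift = template_corr(tuning_attend_target, target_template) \
      - template_corr(tuning_attend_away, target_template)
print(f"tuning shift toward target: {shift:+.3f}")  # positive = toward target
```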


2004 ◽  
Vol 01 (04) ◽  
pp. 345-356 ◽
Author(s):  
HYUNG-MIN PARK ◽  
JONG-HWAN LEE ◽  
TAESU KIM ◽  
UN-MIN BAE ◽  
BYUNG TAEK KIM ◽  
...  

An auditory model has been developed for an intelligent speech information acquisition system in real-world noisy environments. The mathematical model of the human auditory pathway consists of three components: nonlinear feature extraction from the cochlea to the auditory cortex, binaural processing at the superior olivary complex, and top-down attention from the higher brain to the cochlea. The feature extraction is based on information-theoretic sparse coding throughout the auditory pathway. Time-frequency masking is also incorporated as a model of lateral inhibition in both the time and frequency domains. The binaural processing is modeled as blind signal separation and adaptive noise canceling based on independent component analysis with hundreds of time delays for noisy reverberated signals. The top-down (TD) attention derives from the familiarity and/or importance of the sensory information, i.e. the sound, and a simple but efficient TD attention model has been developed based on the error backpropagation algorithm. The binaural processing and top-down attention are also combined for speech signals with heavy noise. This auditory model requires extensive computing, and special hardware has been developed for real-time applications. Experimental results demonstrate much better recognition performance in real-world noisy environments.
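As a toy illustration of the blind-source-separation idea behind the binaural stage, here is a minimal sketch using instantaneous ICA on a simulated two-microphone mixture; unlike the model above, it handles no time delays or reverberation, and the signals and mixing matrix are invented.

```python
# Toy sketch of the binaural blind-source-separation stage: instantaneous
# ICA on a simulated two-microphone mixture. The model described above
# additionally handles hundreds of time delays and reverberation.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)

# Two independent sources: a tone sweep standing in for speech, plus noise
s1 = np.sin(2 * np.pi * (200 + 300 * t) * t)
s2 = 0.5 * rng.standard_normal(t.size)
S = np.c_[s1, s2]

# Instantaneous mixing into two "ear" channels (invented mixing matrix)
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T

# Recover the sources up to permutation and scaling
ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)
print("recovered sources:", S_hat.shape)  # (8000, 2)
```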


2018 ◽  
Author(s):  
Guanghan Ning

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] The task of human pose estimation in natural scenes is to determine the precise pixel locations of body keypoints. It is very important for many high-level computer vision tasks, including action and activity recognition, human-computer interaction, motion capture, and animation. We cover two different approaches to this task: the top-down approach and the bottom-up approach. In the top-down approach, we propose a human tracking method called ROLO that localizes each person. We then propose a state-of-the-art single-person human pose estimator that predicts the body keypoints of each individual. In the bottom-up approach, we propose an efficient multi-person pose estimator with which we participated in a PoseTrack challenge [11]. On top of these, we propose to employ adversarial training to further boost the performance of the single-person human pose estimator while generating synthetic images. We also propose a novel PoSeg network that jointly estimates multi-person human poses and semantically segments the portraits of these persons at the pixel level. Lastly, we extend some of the proposed methods on human pose estimation and portrait segmentation to the task of human parsing, a more fine-grained computer vision perception of humans.
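To illustrate the structure of the top-down approach described here, the following is a generic sketch: localize each person first, then run a single-person keypoint estimator on each crop and map the keypoints back to image coordinates. The detector and estimator are hypothetical stand-ins, not ROLO or the author's networks.

```python
# Structural sketch of a generic top-down pipeline: person detection first,
# then per-person keypoint estimation. `detect_persons` and
# `estimate_keypoints` are hypothetical stand-ins, not ROLO or the
# author's networks.
from typing import List, Tuple
import numpy as np

Box = Tuple[int, int, int, int]          # (x, y, width, height)

def detect_persons(image: np.ndarray) -> List[Box]:
    """Hypothetical person detector/tracker (stage 1)."""
    h, w = image.shape[:2]
    return [(0, 0, w // 2, h)]           # dummy: one box over the left half

def estimate_keypoints(crop: np.ndarray) -> np.ndarray:
    """Hypothetical single-person estimator returning (K, 2) pixel coords."""
    k = 17                               # e.g., COCO-style keypoint count
    h, w = crop.shape[:2]
    return np.column_stack([np.full(k, w / 2), np.linspace(0, h - 1, k)])

def top_down_pose(image: np.ndarray) -> List[np.ndarray]:
    poses = []
    for x, y, w, h in detect_persons(image):
        kpts = estimate_keypoints(image[y:y + h, x:x + w])
        poses.append(kpts + np.array([x, y]))  # crop coords -> image coords
    return poses

print(top_down_pose(np.zeros((256, 192, 3)))[0].shape)  # (17, 2)
```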


2019 ◽  
Vol 31 (10) ◽  
pp. 1563-1572 ◽  
Author(s):  
Clayton Hickey ◽  
Daniele Pollicino ◽  
Giacomo Bertazzoli ◽  
Ludwig Barbaro

People are quicker to detect examples of real-world object categories in natural scenes than is predicted by classic attention theories. One explanation for this puzzle suggests that experience renders the visual system sensitive to midlevel features diagnosing target presence. These are detected without the need for spatial attention, much as occurs for targets defined by low-level features like color or orientation. The alternative is that naturalistic search relies on spatial attention but is highly efficient because global scene information can be used to quickly reject nontarget objects and locations. Here, we use ERPs to differentiate between these possibilities. Results show that hallmark evidence of ultrafast target detection in frontal brain activity is preceded by an index of spatially specific distractor suppression in visual cortex. Naturalistic search for heterogeneous targets therefore appears to rely on spatial operations that act on neural object representations, as predicted by classic attention theory. People appear able to rapidly reject nontarget objects and locations, consistent with the idea that global scene information is used to constrain naturalistic search and increase search efficiency.
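For readers unfamiliar with lateralized ERP indices, here is a minimal sketch on simulated data of the contralateral-minus-ipsilateral difference wave that such suppression indices are built from; the channels, timing, and amplitudes are invented, and this is not the authors' analysis.

```python
# Minimal sketch (simulated data, invented parameters): the contralateral-
# minus-ipsilateral difference wave underlying lateralized ERP indices of
# distractor suppression. Not the authors' analysis.
import numpy as np

rng = np.random.default_rng(0)
fs = 500                                  # sampling rate (Hz)
t = np.arange(-0.1, 0.5, 1 / fs)          # epoch: -100 to 500 ms
n_trials = 200

noise = lambda: 2.0 * rng.standard_normal((n_trials, t.size))

# Toy effect: a positive deflection ~200 ms only at electrodes
# contralateral to the distractor
bump = np.exp(-0.5 * ((t - 0.2) / 0.03) ** 2)
contra = noise() + 1.5 * bump
ipsi = noise()

erp_diff = contra.mean(axis=0) - ipsi.mean(axis=0)   # contra - ipsi ERP
window = (t >= 0.18) & (t <= 0.23)
print(f"mean lateralized amplitude, 180-230 ms: {erp_diff[window].mean():.2f} uV")
```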


2010 ◽  
Vol 22 (6) ◽  
pp. 1224-1234 ◽  
Author(s):  
Aaron M. Rutman ◽  
Wesley C. Clapp ◽  
James Z. Chadick ◽  
Adam Gazzaley

Selective attention confers a behavioral benefit on both perceptual and working memory (WM) performance, often attributed to top–down modulation of sensory neural processing. However, the direct relationship between early activity modulation in sensory cortices during selective encoding and subsequent WM performance has not been established. To explore the influence of selective attention on WM recognition, we used electroencephalography to study the temporal dynamics of top–down modulation in a selective, delayed-recognition paradigm. Participants were presented with overlapped, “double-exposed” images of faces and natural scenes, and were instructed to either remember the face or the scene while simultaneously ignoring the other stimulus. Here, we present evidence that the degree to which participants modulate the early P100 (97–129 msec) event-related potential during selective stimulus encoding significantly correlates with their subsequent WM recognition. These results contribute to our evolving understanding of the mechanistic overlap between attention and memory.
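As a sketch of the correlational logic (not the authors' pipeline), the following simulates per-participant P100 modulation indices and working-memory recognition scores and tests their association; all numbers are invented.

```python
# Illustrative sketch only (all numbers invented): correlate a per-subject
# P100 modulation index (attend minus ignore amplitude, 97-129 msec) with
# working-memory recognition accuracy.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_subjects = 20

p100_modulation = rng.normal(1.0, 0.4, n_subjects)       # microvolts
recognition_acc = 0.70 + 0.08 * p100_modulation \
                + rng.normal(0, 0.03, n_subjects)         # proportion correct

r, p = pearsonr(p100_modulation, recognition_acc)
print(f"r = {r:.2f}, p = {p:.3f}")
```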


2012 ◽  
Vol 50 (10) ◽  
pp. 2415-2425 ◽  
Author(s):  
Björn Machner ◽  
Michael Dorr ◽  
Andreas Sprenger ◽  
Janina von der Gablentz ◽  
Wolfgang Heide ◽  
...  

2017 ◽  
Vol 372 (1711) ◽  
pp. 20160055 ◽  
Author(s):  
Elizabeth M. Clerkin ◽  
Elizabeth Hart ◽  
James M. Rehg ◽  
Chen Yu ◽  
Linda B. Smith

We offer a new solution to the unsolved problem of how infants break into word learning based on the visual statistics of everyday infant-perspective scenes. Images from head-camera video captured by 8½- to 10½-month-old infants at 147 at-home mealtime events were analysed for the objects in view. The images were found to be highly cluttered, with many different objects in view. However, the frequency distribution of object categories was extremely right-skewed, such that a very small set of objects was pervasively present—a fact that may substantially reduce the problem of referential ambiguity. The statistical structure of objects in these infant egocentric scenes differs markedly from that in the training sets used in computational models and in experiments on statistical word-referent learning. Therefore, the results also indicate a need to re-examine current explanations of how infants break into word learning. This article is part of the themed issue ‘New frontiers for statistical learning in the cognitive sciences’.
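To illustrate the distributional point, here is a toy sketch in which object-category frequencies across simulated scenes follow a heavy-tailed (Zipf-like) distribution, so a handful of categories dominate; the sampling scheme and per-scene object count are invented, not the paper's data.

```python
# Toy sketch (invented data): a Zipf-like category distribution makes a
# handful of object categories pervasive across scenes, as described above.
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)
n_scenes, n_categories, objs_per_scene = 147, 50, 10

# Zipf-like sampling: probability falls off as 1/rank
probs = 1.0 / np.arange(1, n_categories + 1)
probs /= probs.sum()
labels = rng.choice(n_categories, size=(n_scenes, objs_per_scene), p=probs)

counts = Counter(labels.ravel().tolist())
top5_share = sum(c for _, c in counts.most_common(5)) / labels.size
print(f"top 5 of {n_categories} categories = {top5_share:.0%} of object tokens")
```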

