Causal evidence for a double dissociation between object- and scene-selective regions of visual cortex: A pre-registered TMS replication study

2020 ◽  
Author(s):  
Miles Wischnewski ◽  
Marius V. Peelen

Abstract
Natural scenes are characterized by individual objects as well as by global scene properties such as spatial layout. Functional neuroimaging research has shown that this distinction between object and scene processing is one of the main organizing principles of human high-level visual cortex. For example, object-selective regions, including the lateral occipital complex (LOC), were shown to represent object content (but not scene layout), while scene-selective regions, including the occipital place area (OPA), were shown to represent scene layout (but not object content). Causal evidence for a double dissociation between LOC and OPA in representing objects and scenes is currently limited, however. One TMS experiment, conducted in a relatively small sample (N=13), reported an interaction between LOC and OPA stimulation and object and scene recognition performance (Dilks et al., 2013). Here, we present a high-powered pre-registered replication of this study (N=72, including male and female human participants), using group-average fMRI coordinates to target LOC and OPA. Results revealed unambiguous evidence for a double dissociation between LOC and OPA: Relative to vertex stimulation, TMS over LOC selectively impaired the recognition of objects, while TMS over OPA selectively impaired the recognition of scenes. Furthermore, we found that these effects were stable over time and consistent across individual objects and scenes. These results show that LOC and OPA can be reliably and selectively targeted with TMS, even when defined based on group-average fMRI coordinates. More generally, they support the distinction between object and scene processing as an organizing principle of human high-level visual cortex.

Significance Statement
Our daily-life environments are characterized both by individual objects and by global scene properties. The distinction between object and scene processing features prominently in visual cognitive neuroscience, with fMRI studies showing that this distinction is one of the main organizing principles of human high-level visual cortex. However, causal evidence for the selective involvement of object- and scene-selective regions in processing their preferred category is less conclusive. Here, testing a large sample (N=72) using an established paradigm and a pre-registered protocol, we found that TMS over object-selective cortex (LOC) selectively impaired object recognition while TMS over scene-selective cortex (OPA) selectively impaired scene recognition. These results provide conclusive causal evidence for the distinction between object and scene processing in human visual cortex.

2021 ◽  
Vol 11 (9) ◽  
pp. 3730
Author(s):  
Aniqa Dilawari ◽  
Muhammad Usman Ghani Khan ◽  
Yasser D. Al-Otaibi ◽  
Zahoor-ur Rehman ◽  
Atta-ur Rahman ◽  
...  

After the September 11 attacks, security and surveillance measures have changed across the globe. Surveillance cameras are now installed almost everywhere to monitor video footage. Though quite handy, these cameras produce video data in massive volumes. The major challenge faced by security agencies is analyzing the surveillance video data collected and generated daily. Problems related to these videos are twofold: (1) understanding the contents of video streams, and (2) converting the video contents to condensed formats, such as textual interpretations and summaries, to save storage space. In this paper, we propose a video description framework for a surveillance dataset. The framework is based on multitask learning of high-level features (HLFs) using a convolutional neural network (CNN) and natural language generation (NLG) through bidirectional recurrent networks. For each specific task, a parallel pipeline is derived from the base visual geometry group (VGG)-16 model. Tasks include scene recognition, action recognition, object recognition, and recognition of human-face-specific features. Experimental results on the TRECViD, UET Video Surveillance (UETVS), and AGRIINTRUSION datasets show that the model outperforms state-of-the-art methods, with METEOR (Metric for Evaluation of Translation with Explicit ORdering) scores of 33.9%, 34.3%, and 31.2%, respectively. Our results show that the framework has distinct advantages over traditional rule-based models for the recognition and generation of natural language descriptions.


2015 ◽  
Vol 35 (36) ◽  
pp. 12412-12424 ◽  
Author(s):  
A. Stigliani ◽  
K. S. Weiner ◽  
K. Grill-Spector

2017 ◽  
Vol 8 (1) ◽  
Author(s):  
Ben Deen ◽  
Hilary Richardson ◽  
Daniel D. Dilks ◽  
Atsushi Takahashi ◽  
Boris Keil ◽  
...  

2012 ◽  
Vol 24 (2) ◽  
pp. 521-529 ◽  
Author(s):  
Frank Oppermann ◽  
Uwe Hassler ◽  
Jörg D. Jescheniak ◽  
Thomas Gruber

The human cognitive system is highly efficient in extracting information from our visual environment. This efficiency is based on acquired knowledge that guides our attention toward relevant events and promotes the recognition of individual objects as they appear in visual scenes. The experience-based representation of such knowledge contains not only information about the individual objects but also about relations between them, such as the typical context in which individual objects co-occur. The present EEG study aimed at exploring the availability of such relational knowledge in the time course of visual scene processing, using oscillatory evoked gamma-band responses as a neural correlate for a currently activated cortical stimulus representation. Participants decided whether two simultaneously presented objects were conceptually coherent (e.g., mouse–cheese) or not (e.g., crown–mushroom). We obtained increased evoked gamma-band responses for coherent scenes compared with incoherent scenes beginning as early as 70 msec after stimulus onset within a distributed cortical network, including the right temporal, the right frontal, and the bilateral occipital cortex. This finding provides empirical evidence for the functional importance of evoked oscillatory activity in high-level vision beyond the visual cortex and, thus, gives new insights into the functional relevance of neuronal interactions. It also indicates the very early availability of experience-based knowledge that might be regarded as a fundamental mechanism for the rapid extraction of the gist of a scene.
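The "evoked" gamma-band response described in this abstract is the phase-locked part of the oscillatory activity: averaging the EEG signal across trials first cancels activity whose phase varies from trial to trial, so spectral power of the trial average isolates the phase-locked component. A minimal numerical sketch of that logic, with invented signals and parameters (sampling rate, 30–80 Hz band) that are not taken from the study:

```python
import math
import random

def evoked_gamma_power(trials, fs, band=(30.0, 80.0)):
    """Power of the trial-averaged (phase-locked) signal in the gamma band.

    trials: list of equal-length EEG epochs (one per trial)
    fs:     sampling rate in Hz
    """
    n = len(trials[0])
    # Averaging across trials first keeps only phase-locked ("evoked") activity.
    avg = [sum(t[i] for t in trials) / len(trials) for i in range(n)]
    power = 0.0
    for k in range(1, n // 2):  # naive DFT, positive frequencies only
        freq = k * fs / n
        if band[0] <= freq <= band[1]:
            re = sum(avg[i] * math.cos(-2 * math.pi * k * i / n) for i in range(n))
            im = sum(avg[i] * math.sin(-2 * math.pi * k * i / n) for i in range(n))
            power += (re * re + im * im) / n**2
    return power

# Toy demo: a 40 Hz component that is phase-locked across trials survives
# averaging; a random-phase ("induced") 40 Hz component largely cancels out.
random.seed(0)
fs, n = 500, 500
locked = [[math.sin(2 * math.pi * 40 * i / fs) for i in range(n)]
          for _ in range(20)]
induced = [[math.sin(2 * math.pi * 40 * i / fs + random.uniform(0, 2 * math.pi))
            for i in range(n)] for _ in range(20)]
print(evoked_gamma_power(locked, fs) > evoked_gamma_power(induced, fs))  # True
```

In practice this is done with windowed FFTs or wavelets on real EEG epochs; the sketch only illustrates why trial-averaging before the spectral transform isolates the evoked (as opposed to induced) response.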


2017 ◽  
Vol 117 (1) ◽  
pp. 388-402 ◽  
Author(s):  
Michael A. Cohen ◽  
George A. Alvarez ◽  
Ken Nakayama ◽  
Talia Konkle

Visual search is a ubiquitous visual behavior, and efficient search is essential for survival. Different cognitive models have explained the speed and accuracy of search based either on the dynamics of attention or on similarity of item representations. Here, we examined the extent to which performance on a visual search task can be predicted from the stable representational architecture of the visual system, independent of attentional dynamics. Participants performed a visual search task with 28 conditions reflecting different pairs of categories (e.g., searching for a face among cars, body among hammers, etc.). The time it took participants to find the target item varied as a function of category combination. In a separate group of participants, we measured the neural responses to these object categories when items were presented in isolation. Using representational similarity analysis, we then examined whether the similarity of neural responses across different subdivisions of the visual system had the requisite structure needed to predict visual search performance. Overall, we found strong brain/behavior correlations across most of the higher-level visual system, including both the ventral and dorsal pathways when considering both macroscale sectors as well as smaller mesoscale regions. These results suggest that visual search for real-world object categories is well predicted by the stable, task-independent architecture of the visual system.

NEW & NOTEWORTHY
Here, we ask which neural regions have neural response patterns that correlate with behavioral performance in a visual processing task. We found that the representational structure across all of high-level visual cortex has the requisite structure to predict behavior. Furthermore, when directly comparing different neural regions, we found that they all had highly similar category-level representational structures. These results point to a ubiquitous and uniform representational structure in high-level visual cortex underlying visual object processing.
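The representational similarity analysis (RSA) logic used here can be sketched in a few lines: build a representational dissimilarity matrix (RDM) from per-category neural response patterns, collect the behavioral search times for the same category pairs, and correlate the two across pairs. All patterns and reaction times below are invented for illustration; they are not data from the study:

```python
def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - r between condition patterns."""
    k = len(patterns)
    return [[1 - pearson(patterns[i], patterns[j]) for j in range(k)]
            for i in range(k)]

def upper(m):
    """Flatten the upper triangle of a square matrix (excluding the diagonal)."""
    return [m[i][j] for i in range(len(m)) for j in range(i + 1, len(m))]

# Hypothetical voxel patterns for four categories (faces, bodies, cars, hammers):
# faces/bodies respond similarly, as do cars/hammers.
neural = [[1.0, 0.2, 0.1], [0.9, 0.3, 0.2], [0.1, 1.0, 0.8], [0.2, 0.9, 0.9]]

# Hypothetical search times per category pair (same pair ordering as upper()):
# similar categories are hard to find among each other, so their RTs are long.
search_rt = upper([[0.0, 1.2, 0.5, 0.6],
                   [1.2, 0.0, 0.6, 0.5],
                   [0.5, 0.6, 0.0, 1.1],
                   [0.6, 0.5, 1.1, 0.0]])

# Brain/behavior correlation: more dissimilar target/distractor patterns
# predict faster search, i.e., a negative correlation with search time.
print(round(pearson(upper(rdm(neural)), search_rt), 2))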


2020 ◽  
Vol 34 (07) ◽  
pp. 12862-12869
Author(s):  
Shiwen Zhang ◽  
Sheng Guo ◽  
Limin Wang ◽  
Weilin Huang ◽  
Matthew Scott

In this work, we propose Knowledge Integration Networks (referred to as KINet) for video action recognition. KINet is capable of aggregating meaningful context features which are of great importance for identifying an action, such as human information and scene context. We design a three-branch architecture consisting of a main branch for action recognition and two auxiliary branches for human parsing and scene recognition, which allow the model to encode knowledge of humans and scenes for action recognition. We explore two pre-trained models as teacher networks to distill human and scene knowledge for training the auxiliary tasks of KINet. Furthermore, we propose a two-level knowledge encoding mechanism, which contains a Cross Branch Integration (CBI) module for encoding the auxiliary knowledge into medium-level convolutional features and an Action Knowledge Graph (AKG) for effectively fusing high-level context information. This results in an end-to-end trainable framework where the three tasks can be trained collaboratively, allowing the model to compute strong context knowledge efficiently. The proposed KINet achieves state-of-the-art performance on the large-scale action recognition benchmark Kinetics-400, with a top-1 accuracy of 77.8%. We further demonstrate the transferability of KINet by applying the Kinetics-trained model to UCF-101, where it obtains 97.8% top-1 accuracy.


2019 ◽  
Vol 19 (10) ◽  
pp. 34a
Author(s):  
Emily Kubota ◽  
Jason D Yeatman

2018 ◽  
Vol 18 (10) ◽  
pp. 1149
Author(s):  
Jesse Gomez ◽  
Michael Barnett ◽  
Kalanit Grill-Spector

2021 ◽  
Vol 15 ◽  
Author(s):  
Justin L. Balsor ◽  
Keon Arbabi ◽  
Desmond Singh ◽  
Rachel Kwan ◽  
Jonathan Zaslavsky ◽  
...  

Studying the molecular development of the human brain presents unique challenges for selecting a data analysis approach. The rare and valuable nature of human postmortem brain tissue, especially for developmental studies, means that sample sizes are small (n), while high-throughput genomic and proteomic methods measure expression levels for hundreds or thousands of variables [e.g., genes or proteins (p)] per sample. This leads to a high-dimensional data structure (p ≫ n) and introduces the curse of dimensionality, which poses a challenge for traditional statistical approaches. In contrast, high-dimensional analyses, especially cluster analyses developed for sparse data, have worked well for analyzing genomic datasets where p ≫ n. Here we explore applying a lasso-based clustering method developed for high-dimensional genomic data with small sample sizes. Using protein and gene data from the developing human visual cortex, we compared clustering methods. We identified an application of sparse k-means clustering [robust sparse k-means clustering (RSKC)] that partitioned samples into age-related clusters reflecting lifespan stages from birth to aging. RSKC adaptively selects the subset of genes or proteins that contributes to partitioning samples into age-related clusters progressing across the lifespan. This approach addresses a limitation of current studies, which could not identify multiple postnatal clusters. Moreover, the clusters encompassed overlapping age ranges, like a series of waves, illustrating that chronological age and brain age have a complex relationship. In addition, a recently developed workflow to create plasticity phenotypes (Balsor et al., 2020) was applied to the clusters and revealed neurobiologically relevant features that show how the human visual cortex changes across the lifespan. These methods can help address the growing demand for multimodal integration, from molecular machinery to brain imaging signals, to understand the human brain's development.
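The core move in sparse k-means (and RSKC, its robust variant) is to weight each feature by how strongly it separates the clusters, with an L1 (lasso) penalty driving uninformative features' weights to zero. A single weight-update step of that idea on toy data might look like the sketch below; it is a simplification, since the full RSKC algorithm also trims outliers and alternates cluster assignment with weight updates, and the data and threshold are invented:

```python
def feature_weights(data, labels, threshold=0.0):
    """One sparse k-means weight update: weight each feature by its
    between-cluster sum of squares (BCSS), soft-thresholded (the lasso step)
    and scaled to unit L2 norm. Features that do not separate the
    clusters end up with weight 0."""
    n_feat = len(data[0])
    weights = []
    for j in range(n_feat):
        col = [row[j] for row in data]
        mean = sum(col) / len(col)
        total_ss = sum((x - mean) ** 2 for x in col)
        within_ss = 0.0
        for c in set(labels):
            grp = [col[i] for i, lab in enumerate(labels) if lab == c]
            gm = sum(grp) / len(grp)
            within_ss += sum((x - gm) ** 2 for x in grp)
        bcss = total_ss - within_ss
        weights.append(max(bcss - threshold, 0.0))  # soft-threshold (lasso)
    norm = sum(w * w for w in weights) ** 0.5 or 1.0
    return [w / norm for w in weights]

# Toy "expression" data: 6 samples, 3 features; only feature 0 tracks age group.
data = [[0.10, 5.0, 2.1], [0.20, 4.8, 2.0], [0.15, 5.1, 1.9],  # younger
        [2.00, 5.0, 2.0], [2.10, 4.9, 2.1], [1.90, 5.2, 2.0]]  # older
labels = [0, 0, 0, 1, 1, 1]
print(feature_weights(data, labels, threshold=0.05))
```

Here the two noise features fall below the threshold and receive zero weight, which is how the method sidesteps the p ≫ n problem: only features that actually distinguish the age-related clusters influence the partitioning.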

