Large-scale hyperparameter search for predicting human brain responses in the Algonauts challenge

2019 ◽  
Author(s):  
Kamila M. Jozwik ◽  
Michael Lee ◽  
Tiago Marques ◽  
Martin Schrimpf ◽  
Pouya Bashivan

Image features computed by specific convolutional artificial neural networks (ANNs) can be used to make state-of-the-art predictions of primate ventral stream responses to visual stimuli. However, in addition to selecting the specific ANN and layer, the modeler makes other choices in preprocessing the stimulus image and in generating brain predictions from ANN features. The effect of these choices on brain predictivity is currently underexplored. Here, we directly evaluated many of these choices by performing a grid search over network architectures, layers, image preprocessing strategies, feature pooling mechanisms, and the use of dimensionality reduction. Our goal was to identify model configurations that produce responses to visual stimuli that are most similar to human neural representations, as measured by human fMRI and MEG responses. In total, we evaluated 140,338 model configurations. We found that specific configurations of CORnet-S best predicted fMRI responses in early visual cortex, while CORnet-R and SqueezeNet models best predicted fMRI responses in inferior temporal cortex. Specific configurations of VGG-16 and CORnet-S models best predicted the MEG responses. We also observed that downsizing input images to ~50-75% of the input tensor size led to better-performing models than no downsizing (the default choice in most brain models for vision). Taken together, we present evidence that brain predictivity is sensitive not only to which ANN architecture and layer is used, but also to choices in image preprocessing and feature postprocessing, and that these choices should be explored further.
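A minimal, runnable sketch of the kind of configuration grid search the abstract describes, assuming ridge regression as the feature-to-fMRI mapping and scoring each configuration by cross-validated prediction correlation. All names, grid values, and data below are illustrative placeholders (random stand-in data), not the authors' pipeline.

# Hypothetical sketch of a model-configuration grid search; random data
# stands in for real ANN features and measured fMRI responses.
from itertools import product

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_stimuli, n_voxels = 120, 50
fmri_responses = rng.standard_normal((n_stimuli, n_voxels))

def extract_features(arch, layer, resize, pooling):
    # Stand-in for real ANN feature extraction (e.g., a CORnet-S layer
    # activation after image downsizing and pooling); returns random features.
    return rng.standard_normal((n_stimuli, 256))

def score_configuration(features, fmri, n_components=None):
    # Median cross-validated correlation between predicted and measured voxels.
    if n_components is not None:  # optional dimensionality reduction
        features = PCA(n_components=n_components).fit_transform(features)
    preds = cross_val_predict(Ridge(alpha=1.0), features, fmri, cv=5)
    return np.median([np.corrcoef(preds[:, v], fmri[:, v])[0, 1]
                      for v in range(fmri.shape[1])])

grid = product(["cornet_s", "cornet_r", "vgg16", "squeezenet"],  # architecture
               ["early", "middle", "late"],                      # layer
               [0.5, 0.75, 1.0],                                 # image downsizing factor
               ["avg_pool", "max_pool"],                         # feature pooling
               [None, 20])                                       # PCA components

scores = {cfg: score_configuration(extract_features(*cfg[:4]), fmri_responses,
                                   n_components=cfg[4])
          for cfg in grid}
print(max(scores, key=scores.get))

At the scale reported in the abstract, the same loop would simply be distributed over many workers, with each configuration's score logged for later comparison.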

2017 ◽  
Vol 17 (10) ◽  
pp. 1237
Author(s):  
Xiaomin Yue ◽  
Marissa Yetter ◽  
Leslie Ungerleider

1994 ◽  
Vol 71 (6) ◽  
pp. 2325-2337 ◽  
Author(s):  
P. M. Gochin ◽  
M. Colombo ◽  
G. A. Dorfman ◽  
G. L. Gerstein ◽  
C. G. Gross

1. Isolated, single-neuron extracellular potentials were recorded sequentially in area TE of the inferior temporal cortex (IT) of two macaque monkeys (n = 58 and n = 41 neurons). Data were obtained while the animals were performing a paired-associate task. The task utilized five stimuli and eight stimulus pairings (4 correct and 4 incorrect). Data were evaluated as average spike rate during experimental epochs of 100 or 400 ms. Single-unit and population characteristics were measured using a form of linear discriminant analysis and information-theoretic measures. To evaluate the significance of covariance on population code measures, additional data consisting of simultaneous recordings from ≤8 isolated neurons (n = 37) were obtained from a third macaque monkey that was passively viewing visual stimuli. 2. On average, 43% of IT neurons were activated by any of the stimuli used (60% if those inhibited are also included). Yet the neurons were rather unique in the relative magnitude of their responses to each stimulus in the test set. These results suggest that information may be represented in IT by the pattern of activity across neurons and that the representation is not sparsely coded. It is further suggested that the representation scheme may have similarities to DNA or computer codes, wherein a coding element is not a local parametric descriptor. This is a departure from the V1 representation, which appears to be both local and parametric. It is also different from theories of IT representation that suggest a constructive basis set or “alphabet”. From this view, determination of stimulus discrimination capacity in IT should be evaluated by measures of population activity patterns. 3. Evaluation of small groups of simultaneously recorded neurons obtained during a fixation task suggests that little information about visual stimuli is conveyed by covariance of activity in IT when a 100-ms time scale is used, as in this study. This finding is consistent with a prior report by Gochin et al., which used a 1-ms time scale and failed to find neural activity coherence or oscillations dependent on stimuli. 4. Population stimulus-discrimination capacity measures were influenced by the number of neurons and, to some extent, the number and type of stimuli. 5. Information conveyed by individual neurons (mutual information) averaged 0.26 bits. The distribution of information values was unimodal and is therefore more consistent with a distributed than a local coding scheme. (ABSTRACT TRUNCATED AT 400 WORDS)
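As a rough illustration of the per-neuron mutual-information measure reported above, the sketch below estimates I(S;R) between stimulus identity and binned single-neuron firing rates on simulated data. The quantile binning scheme and the simulated rates are assumptions for illustration, not the authors' exact procedure.

# Estimate mutual information I(S;R) in bits between stimulus identity
# and a single neuron's (binned) firing rate; simulated data only.
import numpy as np

def mutual_information(stimuli, rates, n_bins=4):
    # Discretize rates into quantile bins, then compute I(S;R) from the
    # joint stimulus-response histogram.
    cuts = np.quantile(rates, np.linspace(0, 1, n_bins + 1)[1:-1])
    responses = np.digitize(rates, cuts)
    joint = np.zeros((len(set(stimuli)), n_bins))
    for s, r in zip(stimuli, responses):
        joint[s, r] += 1
    joint /= joint.sum()
    ps = joint.sum(axis=1, keepdims=True)   # P(S)
    pr = joint.sum(axis=0, keepdims=True)   # P(R)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (ps @ pr)[nz])))

rng = np.random.default_rng(1)
stimuli = rng.integers(0, 5, size=200)               # five stimuli, as in the task
rates = stimuli * 2.0 + rng.normal(0, 3, size=200)   # simulated 100-ms spike rates
print(f"I(S;R) = {mutual_information(stimuli, rates):.2f} bits")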


2020 ◽  
Author(s):  
Ke Bo ◽  
Siyang Yin ◽  
Yuelu Liu ◽  
Zhenhong Hu ◽  
Sreenivasan Meyyapan ◽  
...  

The perception of opportunities and threats in complex scenes represents one of the main functions of the human visual system. In the laboratory, its neurophysiological basis is often studied by having observers view pictures varying in affective content. This body of work has consistently shown that viewing emotionally engaging, compared to neutral, pictures (1) heightens blood flow in limbic structures and frontoparietal cortex, as well as in anterior ventral and dorsal visual cortex, and (2) prompts an increase in the late positive event-related potential (LPP), a scalp-recorded and time-sensitive index of engagement within the network of aforementioned neural structures. The role of retinotopic visual cortex in this process has, however, been contentious, with competing theoretical notions predicting the presence versus absence of emotion-specific signals in retinotopic visual areas. The present study used multimodal neuroimaging and machine learning to address this question by examining the large-scale neural representations of affective pictures. Recording EEG and fMRI simultaneously while observers viewed pleasant, unpleasant, and neutral affective pictures, and applying multivariate pattern analysis to single-trial BOLD activity in retinotopic visual cortex, we identified three robust findings: First, unpleasant-versus-neutral decoding accuracy, as well as pleasant-versus-neutral decoding accuracy, was well above chance level in all retinotopic visual areas, including primary visual cortex. Second, the decoding accuracy in ventral visual cortex, but not in early visual cortex or dorsal visual cortex, was significantly correlated with LPP amplitude. Third, effective connectivity from amygdala to ventral visual cortex predicted unpleasant-versus-neutral decoding accuracy, and effective connectivity from ventral frontal cortex to ventral visual cortex predicted pleasant-versus-neutral decoding accuracy. These results suggest that affective pictures evoked valence-specific multivoxel neural representations in retinotopic visual cortex and that these multivoxel representations were influenced by reentry signals from limbic and frontal brain regions.
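A minimal sketch of this style of multivariate pattern analysis, assuming a cross-validated linear SVM as the decoder and simulated single-trial patterns standing in for ROI-restricted BOLD data; the effect size and trial counts below are illustrative, not the study's.

# Cross-validated decoding of picture valence (unpleasant vs. neutral)
# from single-trial multivoxel patterns; simulated data only.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_trials, n_voxels = 160, 300
labels = np.repeat([0, 1], n_trials // 2)        # 0 = neutral, 1 = unpleasant
patterns = rng.standard_normal((n_trials, n_voxels))
patterns[labels == 1, :20] += 0.3                # weak valence signal in a few voxels

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
acc = cross_val_score(clf, patterns, labels, cv=10).mean()
print(f"unpleasant-vs-neutral decoding accuracy: {acc:.2f} (chance = 0.50)")

In the study's design, the same decoder would be fit separately per retinotopic ROI, with accuracy compared against the 0.50 chance level and related to LPP amplitude across participants.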


2022 ◽  
Author(s):  
Jeongho Park ◽  
Emilie Josephs ◽  
Talia Konkle

We can easily perceive the spatial scale depicted in a picture, regardless of whether it is a small space (e.g., a close-up view of a chair) or a much larger space (e.g., an entire classroom). How does the human visual system encode this continuous dimension? Here, we investigated the underlying neural coding of depicted spatial scale by examining the voxel tuning and topographic organization of brain responses. We created naturalistic yet carefully controlled stimuli by constructing virtual indoor environments and rendered a series of snapshots to smoothly sample between a close-up view of the central object and a far-scale view of the full environment (the object-to-scene continuum). Human brain responses to each view position were measured using functional magnetic resonance imaging. We did not find evidence for a smooth topographic mapping of the object-to-scene continuum on the cortex. Instead, we observed large swaths of cortex with opposing ramp-shaped profiles, with highest responses at one end of the object-to-scene continuum or the other, and a small region showing weak tuning to intermediate-scale views. Importantly, when we considered the multi-voxel patterns of the entire ventral occipito-temporal cortex, we found a smooth and linear representation of the object-to-scene continuum. Together, our results suggest that depicted spatial scale is coded parametrically in large-scale population codes across the entire ventral occipito-temporal cortex.
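The sketch below, on simulated voxels, illustrates the two analyses the abstract describes: characterizing ramp-shaped single-voxel tuning along the object-to-scene continuum, and a linear multivoxel readout of continuum position. The ridge decoder and all data here are illustrative assumptions, not the authors' analysis code.

# (1) single-voxel tuning and (2) multivoxel readout along an
# object-to-scene continuum; simulated voxels with opposing ramps.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
positions = np.linspace(0.0, 1.0, 30)            # object (0) -> scene (1) views
n_voxels = 200
slopes = rng.choice([-1.0, 1.0], n_voxels)       # opposing ramp-shaped profiles
voxels = positions[:, None] * slopes[None, :] + rng.normal(0, 0.5, (30, n_voxels))

# (1) voxel tuning: slope of each voxel's response vs. continuum position
voxel_slopes = np.polyfit(positions, voxels, deg=1)[0]
print("voxels ramping toward scene views:", int((voxel_slopes > 0).sum()))

# (2) multivoxel readout: linearly decode continuum position from the pattern
pred = cross_val_predict(Ridge(alpha=10.0), voxels, positions, cv=5)
print(f"decoded-vs-true position correlation: {np.corrcoef(pred, positions)[0, 1]:.2f}")

A smooth, linear multivoxel representation of the continuum, as reported above, corresponds to the decoded positions tracking the true positions even when individual voxels show only ramp-shaped (end-preferring) tuning.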


PLoS ONE ◽  
2011 ◽  
Vol 6 (4) ◽  
pp. e18913 ◽  
Author(s):  
Satoshi Eifuku ◽  
Wania C. De Souza ◽  
Ryuzaburo Nakata ◽  
Taketoshi Ono ◽  
Ryoi Tamura
