Large-scale hyperparameter search for predicting human brain responses in the Algonauts challenge

2019 ◽  
Author(s):  
Kamila M. Jozwik ◽  
Michael Lee ◽  
Tiago Marques ◽  
Martin Schrimpf ◽  
Pouya Bashivan

Image features computed by specific convolutional artificial neural networks (ANNs) can be used to make state-of-the-art predictions of primate ventral stream responses to visual stimuli. However, in addition to selecting the specific ANN and layer, the modeler makes other choices in preprocessing the stimulus image and in generating brain predictions from ANN features. The effect of these choices on brain predictivity is currently underexplored. Here, we directly evaluated many of these choices by performing a grid search over network architectures, layers, image preprocessing strategies, feature pooling mechanisms, and the use of dimensionality reduction. Our goal was to identify model configurations that produce responses to visual stimuli that are most similar to human neural representations, as measured by human fMRI and MEG responses. In total, we evaluated 140,338 model configurations. We found that specific configurations of CORnet-S best predicted fMRI responses in early visual cortex, while CORnet-R and SqueezeNet models best predicted fMRI responses in inferior temporal cortex. Specific configurations of VGG-16 and CORnet-S models best predicted the MEG responses. We also observed that downsizing input images to ~50-75% of the input tensor size led to better-performing models than no downsizing (the default choice in most brain models for vision). Taken together, we present evidence that brain predictivity is sensitive not only to which ANN architecture and layer is used, but also to choices in image preprocessing and feature postprocessing, and that these choices should be explored further.
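A minimal, runnable sketch of the kind of configuration grid search the abstract describes, assuming ridge regression as the feature-to-fMRI mapping and scoring each configuration by cross-validated prediction correlation. All names, grid values, and data below are illustrative placeholders (random stand-in data), not the authors' pipeline.

# Hypothetical sketch of a model-configuration grid search; random data
# stands in for real ANN features and measured fMRI responses.
from itertools import product

import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_stimuli, n_voxels = 120, 50
fmri_responses = rng.standard_normal((n_stimuli, n_voxels))

def extract_features(arch, layer, resize, pooling):
    # Stand-in for real ANN feature extraction (e.g., a CORnet-S layer
    # activation after image downsizing and pooling); returns random features.
    return rng.standard_normal((n_stimuli, 256))

def score_configuration(features, fmri, n_components=None):
    # Median cross-validated correlation between predicted and measured voxels.
    if n_components is not None:  # optional dimensionality reduction
        features = PCA(n_components=n_components).fit_transform(features)
    preds = cross_val_predict(Ridge(alpha=1.0), features, fmri, cv=5)
    return np.median([np.corrcoef(preds[:, v], fmri[:, v])[0, 1]
                      for v in range(fmri.shape[1])])

grid = product(["cornet_s", "cornet_r", "vgg16", "squeezenet"],  # architecture
               ["early", "middle", "late"],                      # layer
               [0.5, 0.75, 1.0],                                 # image downsizing factor
               ["avg_pool", "max_pool"],                         # feature pooling
               [None, 20])                                       # PCA components

scores = {cfg: score_configuration(extract_features(*cfg[:4]), fmri_responses,
                                   n_components=cfg[4])
          for cfg in grid}
print(max(scores, key=scores.get))

At the scale reported in the abstract, the same loop would simply be distributed over many workers, with each configuration's score logged for later comparison.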

2017 ◽  
Vol 17 (10) ◽  
pp. 1237
Author(s):  
Xiaomin Yue ◽  
Marissa Yetter ◽  
Leslie Ungerleider

1994 ◽  
Vol 71 (6) ◽  
pp. 2325-2337 ◽  
Author(s):  
P. M. Gochin ◽  
M. Colombo ◽  
G. A. Dorfman ◽  
G. L. Gerstein ◽  
C. G. Gross

1. Isolated, single-neuron extracellular potentials were recorded sequentially in area TE of the inferior temporal cortex (IT) of two macaque monkeys (n = 58 and n = 41 neurons). Data were obtained while the animals were performing a paired-associate task. The task utilized five stimuli and eight stimulus pairings (4 correct and 4 incorrect). Data were evaluated as average spike rate during experimental epochs of 100 or 400 ms. Single-unit and population characteristics were measured using a form of linear discriminant analysis and information-theoretic measures. To evaluate the significance of covariance on population code measures, additional data consisting of simultaneous recordings from ≤8 isolated neurons (n = 37) were obtained from a third macaque monkey that was passively viewing visual stimuli. 2. On average, 43% of IT neurons were activated by any of the stimuli used (60% if those inhibited are also included). Yet the neurons were rather unique in the relative magnitude of their responses to each stimulus in the test set. These results suggest that information may be represented in IT by the pattern of activity across neurons and that the representation is not sparsely coded. It is further suggested that the representation scheme may have similarities to DNA or computer codes, wherein a coding element is not a local parametric descriptor. This is a departure from the V1 representation, which appears to be both local and parametric. It is also different from theories of IT representation that suggest a constructive basis set or “alphabet”. From this view, determination of stimulus discrimination capacity in IT should be evaluated by measures of population activity patterns. 3. Evaluation of small groups of simultaneously recorded neurons obtained during a fixation task suggests that little information about visual stimuli is conveyed by covariance of activity in IT when a 100-ms time scale is used, as in this study. This finding is consistent with a prior report by Gochin et al., which used a 1-ms time scale and failed to find neural activity coherence or oscillations dependent on stimuli. 4. Population stimulus-discrimination capacity measures were influenced by the number of neurons and, to some extent, the number and type of stimuli. 5. Information conveyed by individual neurons (mutual information) averaged 0.26 bits. The distribution of information values was unimodal and is therefore more consistent with a distributed than a local coding scheme. (ABSTRACT TRUNCATED AT 400 WORDS)
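As a rough illustration of the per-neuron mutual-information measure reported above, the sketch below estimates I(S;R) between stimulus identity and binned single-neuron firing rates on simulated data. The quantile binning scheme and the simulated rates are assumptions for illustration, not the authors' exact procedure.

# Estimate mutual information I(S;R) in bits between stimulus identity
# and a single neuron's (binned) firing rate; simulated data only.
import numpy as np

def mutual_information(stimuli, rates, n_bins=4):
    # Discretize rates into quantile bins, then compute I(S;R) from the
    # joint stimulus-response histogram.
    cuts = np.quantile(rates, np.linspace(0, 1, n_bins + 1)[1:-1])
    responses = np.digitize(rates, cuts)
    joint = np.zeros((len(set(stimuli)), n_bins))
    for s, r in zip(stimuli, responses):
        joint[s, r] += 1
    joint /= joint.sum()
    ps = joint.sum(axis=1, keepdims=True)   # P(S)
    pr = joint.sum(axis=0, keepdims=True)   # P(R)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (ps @ pr)[nz])))

rng = np.random.default_rng(1)
stimuli = rng.integers(0, 5, size=200)               # five stimuli, as in the task
rates = stimuli * 2.0 + rng.normal(0, 3, size=200)   # simulated 100-ms spike rates
print(f"I(S;R) = {mutual_information(stimuli, rates):.2f} bits")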


2020 ◽  
Author(s):  
Ke Bo ◽  
Siyang Yin ◽  
Yuelu Liu ◽  
Zhenhong Hu ◽  
Sreenivasan Meyyapan ◽  
...  

The perception of opportunities and threats in complex scenes represents one of the main functions of the human visual system. In the laboratory, its neurophysiological basis is often studied by having observers view pictures varying in affective content. This body of work has consistently shown that viewing emotionally engaging, compared to neutral, pictures (1) heightens blood flow in limbic structures and frontoparietal cortex, as well as in anterior ventral and dorsal visual cortex, and (2) prompts an increase in the late positive event-related potential (LPP), a scalp-recorded and time-sensitive index of engagement within the network of aforementioned neural structures. The role of retinotopic visual cortex in this process has, however, been contentious, with competing theoretical notions predicting the presence versus absence of emotion-specific signals in retinotopic visual areas. The present study used multimodal neuroimaging and machine learning to address this question by examining the large-scale neural representations of affective pictures. Recording EEG and fMRI simultaneously while observers viewed pleasant, unpleasant, and neutral affective pictures, and applying multivariate pattern analysis to single-trial BOLD activity in retinotopic visual cortex, we identified three robust findings: First, unpleasant-versus-neutral decoding accuracy, as well as pleasant-versus-neutral decoding accuracy, was well above chance level in all retinotopic visual areas, including primary visual cortex. Second, the decoding accuracy in ventral visual cortex, but not in early visual cortex or dorsal visual cortex, was significantly correlated with LPP amplitude. Third, effective connectivity from amygdala to ventral visual cortex predicted unpleasant-versus-neutral decoding accuracy, and effective connectivity from ventral frontal cortex to ventral visual cortex predicted pleasant-versus-neutral decoding accuracy. These results suggest that affective pictures evoked valence-specific multivoxel neural representations in retinotopic visual cortex and that these multivoxel representations were influenced by reentry signals from limbic and frontal brain regions.
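A minimal sketch of this style of multivariate pattern analysis, assuming a cross-validated linear SVM as the decoder and simulated single-trial patterns standing in for ROI-restricted BOLD data; the effect size and trial counts below are illustrative, not the study's.

# Cross-validated decoding of picture valence (unpleasant vs. neutral)
# from single-trial multivoxel patterns; simulated data only.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n_trials, n_voxels = 160, 300
labels = np.repeat([0, 1], n_trials // 2)        # 0 = neutral, 1 = unpleasant
patterns = rng.standard_normal((n_trials, n_voxels))
patterns[labels == 1, :20] += 0.3                # weak valence signal in a few voxels

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
acc = cross_val_score(clf, patterns, labels, cv=10).mean()
print(f"unpleasant-vs-neutral decoding accuracy: {acc:.2f} (chance = 0.50)")

In the study's design, the same decoder would be fit separately per retinotopic ROI, with accuracy compared against the 0.50 chance level and related to LPP amplitude across participants.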


2022 ◽  
Author(s):  
Jeongho Park ◽  
Emilie Josephs ◽  
Talia Konkle

We can easily perceive the spatial scale depicted in a picture, regardless of whether it is a small space (e.g., a close-up view of a chair) or a much larger space (e.g., an entire classroom). How does the human visual system encode this continuous dimension? Here, we investigated the underlying neural coding of depicted spatial scale by examining the voxel tuning and topographic organization of brain responses. We created naturalistic yet carefully controlled stimuli by constructing virtual indoor environments and rendered a series of snapshots to smoothly sample between a close-up view of the central object and a far-scale view of the full environment (the object-to-scene continuum). Human brain responses to each view position were measured using functional magnetic resonance imaging. We did not find evidence for a smooth topographic mapping of the object-to-scene continuum on the cortex. Instead, we observed large swaths of cortex with opposing ramp-shaped profiles, with highest responses at one end of the object-to-scene continuum or the other, and a small region showing weak tuning to intermediate-scale views. Importantly, when we considered the multi-voxel patterns of the entire ventral occipito-temporal cortex, we found a smooth and linear representation of the object-to-scene continuum. Together, our results suggest that depicted spatial scale is coded parametrically in large-scale population codes across the entire ventral occipito-temporal cortex.
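The sketch below, on simulated voxels, illustrates the two analyses the abstract describes: characterizing ramp-shaped single-voxel tuning along the object-to-scene continuum, and a linear multivoxel readout of continuum position. The ridge decoder and all data here are illustrative assumptions, not the authors' analysis code.

# (1) single-voxel tuning and (2) multivoxel readout along an
# object-to-scene continuum; simulated voxels with opposing ramps.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)
positions = np.linspace(0.0, 1.0, 30)            # object (0) -> scene (1) views
n_voxels = 200
slopes = rng.choice([-1.0, 1.0], n_voxels)       # opposing ramp-shaped profiles
voxels = positions[:, None] * slopes[None, :] + rng.normal(0, 0.5, (30, n_voxels))

# (1) voxel tuning: slope of each voxel's response vs. continuum position
voxel_slopes = np.polyfit(positions, voxels, deg=1)[0]
print("voxels ramping toward scene views:", int((voxel_slopes > 0).sum()))

# (2) multivoxel readout: linearly decode continuum position from the pattern
pred = cross_val_predict(Ridge(alpha=10.0), voxels, positions, cv=5)
print(f"decoded-vs-true position correlation: {np.corrcoef(pred, positions)[0, 1]:.2f}")

A smooth, linear multivoxel representation of the continuum, as reported above, corresponds to the decoded positions tracking the true positions even when individual voxels show only ramp-shaped (end-preferring) tuning.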


PLoS ONE ◽  
2011 ◽  
Vol 6 (4) ◽  
pp. e18913 ◽  
Author(s):  
Satoshi Eifuku ◽  
Wania C. De Souza ◽  
Ryuzaburo Nakata ◽  
Taketoshi Ono ◽  
Ryoi Tamura
