The neural architecture of language: Integrative modeling converges on predictive processing

2021 ◽  
Vol 118 (45) ◽  
pp. e2105646118
Author(s):  
Martin Schrimpf ◽  
Idan Asher Blank ◽  
Greta Tuckute ◽  
Carina Kauf ◽  
Eghbal A. Hosseini ◽  
...  

The neuroscience of perception has recently been revolutionized by an integrative modeling approach in which computation, brain function, and behavior are linked across many datasets and many computational models. By revealing trends across models, this approach yields novel insights into cognitive and neural mechanisms in the target domain. We here present a systematic study taking this approach to higher-level cognition: human language processing, our species’ signature cognitive skill. We find that the most powerful “transformer” models predict nearly 100% of explainable variance in neural responses to sentences and generalize across different datasets and imaging modalities (functional MRI and electrocorticography). Models’ neural fits (“brain score”) and fits to behavioral responses are both strongly correlated with model accuracy on the next-word prediction task (but not other language tasks). Model architecture appears to substantially contribute to neural fit. These results provide computationally explicit evidence that predictive processing fundamentally shapes the language comprehension mechanisms in the human brain.
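The "brain score" above can be illustrated with a minimal, hypothetical sketch: fit a cross-validated ridge regression from model activations to neural responses and report the median Pearson correlation between predicted and held-out responses. All data here are synthetic, and the normalization by an estimated noise ceiling ("explainable variance") used in the paper is omitted.

```python
# Hypothetical sketch of a "brain score": how well model activations
# linearly predict neural responses, via cross-validated ridge regression.
# Synthetic data only; the paper's noise-ceiling normalization is omitted.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def brain_score(activations, responses, alpha=1.0, n_splits=5):
    """Median cross-validated Pearson r between predicted and actual voxels."""
    rs = []
    for train, test in KFold(n_splits, shuffle=True, random_state=0).split(activations):
        model = Ridge(alpha=alpha).fit(activations[train], responses[train])
        pred = model.predict(activations[test])
        for v in range(responses.shape[1]):
            rs.append(np.corrcoef(pred[:, v], responses[test, v])[0, 1])
    return float(np.median(rs))

# Synthetic demo: responses are a noisy linear readout of the activations,
# so the score should be well above zero.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))                 # 200 sentences x 50 model units
W = rng.standard_normal((50, 10))
Y = X @ W + 0.5 * rng.standard_normal((200, 10))   # 10 voxels
score = brain_score(X, Y)
```

A higher score means the model's internal representations carry more of the linearly decodable structure present in the neural data.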

Author(s):  
Martin Schrimpf ◽  
Idan Blank ◽  
Greta Tuckute ◽  
Carina Kauf ◽  
Eghbal A. Hosseini ◽  
...  

Abstract The neuroscience of perception has recently been revolutionized by an integrative reverse-engineering approach in which computation, brain function, and behavior are linked across many different datasets and many computational models. We here present a first systematic study taking this approach to higher-level cognition: human language processing, our species’ signature cognitive skill. We find that the most powerful ‘transformer’ networks predict neural responses at nearly 100% of the explainable variance and generalize across different datasets and data types (fMRI, ECoG). Across models, significant correlations are observed among all three metrics of performance: neural fit, fit to behavioral responses, and accuracy on the next-word prediction task (but not other language tasks), consistent with the long-standing hypothesis that the brain’s language system is optimized for predictive processing. Model architectures with initial weights further perform surprisingly similarly to final trained models, suggesting that inherent structure – and not just experience with language – crucially contributes to a model’s match to the brain.


2020 ◽  
Author(s):  
Sreejan Kumar ◽  
Cameron T. Ellis ◽  
Thomas O’Connell ◽  
Marvin M Chun ◽  
Nicholas B. Turk-Browne

Abstract The extent to which brain functions are localized or distributed is a foundational question in neuroscience. In the human brain, common fMRI methods such as cluster correction, atlas parcellation, and anatomical searchlight are biased by design toward finding localized representations. Here we introduce the functional searchlight approach as an alternative to anatomical searchlight analysis, the most commonly used exploratory multivariate fMRI technique. Functional searchlight removes any anatomical bias by grouping voxels based only on functional similarity and ignoring anatomical proximity. We report evidence that visual and auditory features from deep neural networks and semantic features from a natural language processing model are more widely distributed across the brain than previously acknowledged. This approach provides a new way to evaluate and constrain computational models with brain activity and pushes our understanding of human brain function further along the spectrum from strict modularity toward distributed representation.
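The core move of a functional searchlight can be sketched as follows: cluster voxels purely by the similarity of their time courses, ignoring anatomical position. The clustering choice (k-means) and all names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the functional-searchlight idea: group voxels by
# time-course similarity alone, ignoring where they sit anatomically.
import numpy as np
from sklearn.cluster import KMeans

def functional_searchlights(bold, n_groups=4, seed=0):
    """bold: (n_timepoints, n_voxels). Returns a cluster label per voxel."""
    # Standardize each voxel's time course so grouping reflects response
    # shape (functional similarity), not overall signal amplitude.
    z = (bold - bold.mean(0)) / (bold.std(0) + 1e-8)
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=seed)
    return km.fit_predict(z.T)   # cluster voxels, not timepoints

# Demo: two voxel populations driven by unrelated latent signals end up in
# different functional groups regardless of anatomical arrangement.
rng = np.random.default_rng(1)
s1, s2 = rng.standard_normal((2, 300))
bold = np.column_stack([s1 + 0.3 * rng.standard_normal(300) for _ in range(5)] +
                       [s2 + 0.3 * rng.standard_normal(300) for _ in range(5)])
labels = functional_searchlights(bold, n_groups=2, seed=1)
```

Each resulting voxel group would then serve as one "searchlight" for a subsequent multivariate analysis, in place of an anatomically defined sphere.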


2021 ◽  
Author(s):  
Jason J Moore ◽  
Jesse D Cushman ◽  
Lavanya Acharya ◽  
Mayank R Mehta

Abstract The hippocampus is implicated in episodic memory and allocentric spatial navigation. However, spatial selectivity is insufficient to navigate; one needs information about the distance and direction to the reward on a specific journey. The nature of these representations, whether they are expressed in an episodic-like sequence, and their relationship with navigational performance are unknown. We recorded single units from dorsal CA1 of the hippocampus while rats navigated to an unmarked reward zone defined solely by distal visual cues, similar to the classic water maze. The allocentric spatial selectivity was substantially weaker than in typical real-world tasks, despite excellent navigational performance. Instead, the majority of cells encoded path distance from the start of trials. Cells also encoded the rat’s allocentric position and head angle. Often the same cells multiplexed and encoded path distance, head direction, and allocentric position in a sequence, thus encoding a journey-specific episode. The strength of neural activity and tuning strongly correlated with performance, with a temporal relationship indicating neural responses influencing behavior and vice versa. Consistent with computational models of associative Hebbian learning, neural responses showed increasing clustering and became better predictors of behaviorally relevant variables, with neurometric curves exceeding and converging to psychometric curves. These findings demonstrate that hippocampal neurons multiplex and exhibit highly plastic, task- and experience-dependent tuning to path-centric and allocentric variables to form an episode, which could mediate navigation.


2020 ◽  
Vol 43 ◽  
Author(s):  
Martina G. Vilas ◽  
Lucia Melloni

Abstract To become a unifying theory of brain function, predictive processing (PP) must accommodate its rich representational diversity. Gilead et al. claim such diversity requires a multi-process theory, and thus is out of reach for PP, which postulates a universal canonical computation. We challenge this argument and instead propose that PP fails to account for the experiential level of representations.


2020 ◽  
Author(s):  
Kun Sun

Expectations or predictions about upcoming content play an important role during language comprehension and processing. One important aspect of recent studies of language comprehension and processing concerns the estimation of the upcoming words in a sentence or discourse. Many studies have used eye-tracking data to explore computational and cognitive models for contextual word predictions and word processing. Eye-tracking data has previously been widely explored with a view to investigating the factors that influence word prediction. However, these studies are problematic on several levels, including the stimuli, corpora, and statistical tools they applied. Although various computational models have been proposed for simulating contextual word predictions, past studies have usually relied on a single computational model, which often cannot give an adequate account of cognitive processing in language comprehension. To avoid these problems, this study draws upon a large, natural, and coherent discourse as the stimulus in collecting reading-time data. This study trains two state-of-the-art computational models (surprisal, and semantic (dis)similarity from word vectors obtained by linear discriminative learning (LDL)), measuring knowledge of both the syntagmatic and paradigmatic structure of language. We develop a `dynamic approach' to computing semantic (dis)similarity; this is the first time that these two computational models have been merged. Models are evaluated using advanced statistical methods. Meanwhile, in order to test the efficiency of our approach, a recently developed cosine method of computing semantic (dis)similarity from word vectors is used for comparison with our `dynamic' approach. The two computational models and fixed-effect statistical models can be used to cross-verify the findings, thus ensuring that the results are reliable.
All results support that surprisal and semantic similarity make opposing contributions to predicting the reading time of words, although both are good predictors. Additionally, our `dynamic' approach performs better than the popular cosine method. The findings of this study are therefore of significance for acquiring a better understanding of how humans process words in a real-world context and how they make predictions in language cognition and processing.
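The two kinds of predictor the study relates to reading time can be illustrated with a toy sketch: surprisal (negative log probability of a word given its context) and cosine (dis)similarity between a word's vector and its context. The tiny smoothed bigram model and hand-made vectors below are stand-ins for the trained models in the paper, not its method.

```python
# Illustrative stand-ins for the paper's two predictors of reading time:
# surprisal and semantic (dis)similarity. Toy data only.
import math
import numpy as np

def surprisal(prev_word, word, bigram_counts, unigram_counts, vocab_size):
    """-log2 P(word | prev_word) with add-one smoothing."""
    p = (bigram_counts.get((prev_word, word), 0) + 1) / \
        (unigram_counts.get(prev_word, 0) + vocab_size)
    return -math.log2(p)

def cosine_dissimilarity(word_vec, context_vecs):
    """1 - cosine similarity between a word vector and the mean context vector."""
    ctx = np.mean(context_vecs, axis=0)
    cos = np.dot(word_vec, ctx) / (np.linalg.norm(word_vec) * np.linalg.norm(ctx))
    return 1.0 - cos

# Toy corpus statistics: "dog" follows "the" more often than "cat" does,
# so "dog" should carry lower surprisal in that context.
bigrams = {("the", "dog"): 3, ("the", "cat"): 1}
unigrams = {"the": 4}
s_dog = surprisal("the", "dog", bigrams, unigrams, vocab_size=10)
s_cat = surprisal("the", "cat", bigrams, unigrams, vocab_size=10)
```

In a reading-time regression, both quantities would enter as per-word predictors, which is where their opposing contributions can be observed.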


Author(s):  
Jonathan E. Peelle

Language processing in older adulthood is a model of balance between preservation and decline. Despite widespread changes to physiological mechanisms supporting perception and cognition, older adults’ language abilities are frequently well preserved. At the same time, the neural systems engaged to achieve this high level of success change, and individual differences in neural organization appear to differentiate between more and less successful performers. This chapter reviews anatomical and cognitive changes that occur in aging and popular frameworks for age-related changes in brain function, followed by an examination of how these principles play out in the context of language comprehension and production.


2021 ◽  
Vol 7 (22) ◽  
pp. eabe7547
Author(s):  
Meenakshi Khosla ◽  
Gia H. Ngo ◽  
Keith Jamison ◽  
Amy Kuceyeski ◽  
Mert R. Sabuncu

Naturalistic stimuli, such as movies, activate a substantial portion of the human brain, invoking a response shared across individuals. Encoding models that predict neural responses to arbitrary stimuli can be very useful for studying brain function. However, existing models focus on limited aspects of naturalistic stimuli, ignoring the dynamic interactions of modalities in this inherently context-rich paradigm. Using movie-watching data from the Human Connectome Project, we build group-level models of neural activity that incorporate several inductive biases about neural information processing, including hierarchical processing, temporal assimilation, and auditory-visual interactions. We demonstrate how incorporating these biases leads to remarkable prediction performance across large areas of the cortex, beyond the sensory-specific cortices into multisensory sites and frontal cortex. Furthermore, we illustrate that encoding models learn high-level concepts that generalize to task-bound paradigms. Together, our findings underscore the potential of encoding models as powerful tools for studying brain function in ecologically valid conditions.
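One of the inductive biases named above, temporal assimilation, can be sketched minimally: model the response at time t from stimulus features over a window of preceding timepoints rather than from t alone. The lagged design matrix and ridge fit below are illustrative assumptions, not the authors' architecture.

```python
# Minimal sketch of temporal assimilation in an encoding model: the response
# at time t is predicted from a window of past stimulus features. Toy data.
import numpy as np
from sklearn.linear_model import Ridge

def lagged_design(features, n_lags):
    """Stack features at t, t-1, ..., t-(n_lags-1) into one row per timepoint."""
    T, d = features.shape
    return np.asarray([features[t - n_lags + 1 : t + 1].ravel()
                       for t in range(n_lags - 1, T)])

rng = np.random.default_rng(2)
audio = rng.standard_normal((300, 3))
video = rng.standard_normal((300, 3))
stim = np.hstack([audio, video])       # auditory and visual features combined
response = np.roll(stim[:, 0], 2)      # toy voxel follows feature 0 at lag 2
X = lagged_design(stim, n_lags=4)      # window covers lags 0..3, so lag 2 is in reach
y = response[3:]
r2 = Ridge(alpha=1.0).fit(X, y).score(X, y)
```

Because the true lag falls inside the window, the lagged model can recover the relationship that a lag-free model would miss.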


2021 ◽  
Vol 1 (1) ◽  
pp. 23-41
Author(s):  
Xi Jiang ◽  
Tuo Zhang ◽  
Shu Zhang ◽  
Keith M Kendrick ◽  
Tianming Liu

Abstract Folding of the cerebral cortex is a prominent characteristic of mammalian brains. Alterations or deficits in cortical folding are strongly correlated with abnormal brain function, cognition, and behavior. Therefore, a precise mapping between the anatomy and function of the brain is critical to our understanding of the mechanisms of brain structural architecture in both health and diseases. Gyri and sulci, the standard nomenclature for cortical anatomy, serve as building blocks to make up complex folding patterns, providing a window to decipher cortical anatomy and its relation with brain functions. Huge efforts have been devoted to this research topic from a variety of disciplines including genetics, cell biology, anatomy, neuroimaging, and neurology, as well as involving computational approaches based on machine learning and artificial intelligence algorithms. However, despite increasing progress, our understanding of the functional anatomy of gyro-sulcal patterns is still in its infancy. In this review, we present the current state of this field and provide our perspectives of the methodologies and conclusions concerning functional differentiation between gyri and sulci, as well as the supporting information from genetic, cell biology, and brain structure research. In particular, we will further present a proposed framework for attempting to interpret the dynamic mechanisms of the functional interplay between gyri and sulci. Hopefully, this review will provide a comprehensive summary of anatomo-functional relationships in the cortical gyro-sulcal system together with a consideration of how these contribute to brain function, cognition, and behavior, as well as to mental disorders.


2021 ◽  
pp. 1-7
Author(s):  
Rong Chen ◽  
Chongguang Ren

Domain adaptation aims to solve the problem of missing labels in a target domain. Most existing work on domain adaptation focuses on aligning the feature distributions between the source and target domains. However, in the field of Natural Language Processing, some words convey different sentiment in different domains. Thus, not all features of the source domain should be transferred, and aligning untransferable features causes negative transfer. To address this issue, we propose a Correlation Alignment with Attention mechanism for unsupervised Domain Adaptation (CAADA) model. In the model, an attention mechanism is introduced into the transfer process for domain adaptation, which can capture the positively transferable features in the source and target domains. Moreover, the CORrelation ALignment (CORAL) loss is utilized to minimize the domain discrepancy by aligning the second-order statistics of the positively transferable features extracted by the attention mechanism. Extensive experiments on the Amazon review dataset demonstrate the effectiveness of the CAADA method.

