Simulating the emergence of the Visual Word Form Area: Recycling a convolutional neural network for reading

2021 ◽  
Vol 118 (46) ◽  
pp. e2104779118
Author(s):  
T. Hannagan ◽  
A. Agrawal ◽  
L. Cohen ◽  
S. Dehaene

The visual word form area (VWFA) is a region of human inferotemporal cortex that emerges at a fixed location in the occipitotemporal cortex during reading acquisition and systematically responds to written words in literate individuals. According to the neuronal recycling hypothesis, this region arises through the repurposing, for letter recognition, of a subpart of the ventral visual pathway initially involved in face and object recognition. Furthermore, according to the biased connectivity hypothesis, its reproducible localization is due to preexisting connections from this subregion to areas involved in spoken-language processing. Here, we evaluate those hypotheses in an explicit computational model. We trained a deep convolutional neural network of the ventral visual pathway, first to categorize pictures and then to recognize written words invariantly for case, font, and size. We show that the model can account for many properties of the VWFA, particularly when a subset of units possesses a biased connectivity to word output units. The network develops a sparse, invariant representation of written words, based on a restricted set of reading-selective units. Their activation mimics several properties of the VWFA, and their lesioning causes a reading-specific deficit. The model predicts that, in literate brains, written words are encoded by a compositional neural code with neurons tuned either to individual letters and their ordinal position relative to word start or word ending or to pairs of letters (bigrams).
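The biased-connectivity ingredient of the model can be sketched in a few lines of NumPy. Everything below (layer sizes, the fraction of privileged units, the masked-gradient rule, all variable names) is an illustrative assumption, not the authors' released code: a small, fixed subset of high-level visual units is given exclusive access to the word output units, so reading selectivity can only emerge there.

```python
import numpy as np

rng = np.random.default_rng(0)

n_visual = 512          # units in the top "inferotemporal" layer (assumed size)
n_words = 1000          # word output units (assumed size)
bias_fraction = 0.1     # fraction of visual units with privileged access

# Choose the privileged subset once, before any reading training,
# mimicking preexisting connections to spoken-language areas.
biased_units = rng.choice(n_visual, size=int(bias_fraction * n_visual),
                          replace=False)

# Connectivity mask: word outputs can only read from the biased units.
mask = np.zeros((n_visual, n_words))
mask[biased_units, :] = 1.0

# During reading acquisition, updates to the visual->word weights are
# gated by the mask, so only the biased subset can learn to drive words.
weights = rng.normal(scale=0.01, size=(n_visual, n_words))
gradient = rng.normal(size=(n_visual, n_words))   # stand-in gradient
weights -= 0.1 * (gradient * mask)                # masked update

n_reading_capable = int(mask.any(axis=1).sum())   # units able to drive words
```

Under these assumptions, only about 10% of the top-layer units can ever become reading selective, which is the mechanism the simulations probe.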


NeuroImage ◽  
2020 ◽  
Vol 215 ◽  
pp. 116838 ◽  
Author(s):  
Mingyang Li ◽  
Yangwen Xu ◽  
Xiangqi Luo ◽  
Jiahong Zeng ◽  
Zaizhu Han

2019 ◽  
Vol 31 (7) ◽  
pp. 1018-1029 ◽  
Author(s):  
Zhiheng Zhou ◽  
Tutis Vilis ◽  
Lars Strother

Reading relies on the rapid visual recognition of words viewed in a wide variety of fonts. We used fMRI to identify neural populations showing reduced fMRI responses to repeated words displayed in different fonts (“font-invariant” repetition suppression). We also identified neural populations showing greater fMRI responses to words repeated in a changing font as compared with words repeated in the same font (“font-sensitive” release from repetition suppression). We observed font-invariant repetition suppression in two anatomically distinct regions of the left occipitotemporal cortex (OT), a “visual word form area” in mid-fusiform cortex, and a more posterior region in the middle occipital gyrus. In contrast, bilateral shape-selective lateral occipital cortex and posterior fusiform showed considerable sensitivity to font changes during the viewing of repeated words. Although the visual word form area and the left middle occipital gyrus showed some evidence of font sensitivity, both regions showed a relatively greater degree of font invariance than font sensitivity. Our results show that the neural mechanisms in the left OT involved in font-invariant word recognition are anatomically distinct from those sensitive to font-related shape changes. We conclude that font-invariant representation of visual word form is instantiated at multiple levels by anatomically distinct neural mechanisms within the left OT.
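The two contrasts used above can be made concrete with a toy calculation (the response values below are invented for illustration, not data from the study): repetition suppression is the drop in response to repeated words, and "release" is the response recovered when the repeated word changes font.

```python
# Hypothetical mean percent-signal-change per condition for one ROI.
novel = 1.00             # first presentation of a word
repeat_same_font = 0.60  # word repeated in the same font
repeat_diff_font = 0.65  # word repeated in a different font

# Font-invariant repetition suppression: repeats (in either font)
# evoke less response than novel words.
suppression = novel - 0.5 * (repeat_same_font + repeat_diff_font)

# Font sensitivity: response recovered when the font changes.
release = repeat_diff_font - repeat_same_font

# A region counts as "relatively font-invariant" when suppression
# dominates the release effect, as reported for the VWFA.
font_invariant = suppression > release
```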


2019 ◽  
Vol 27 (5-8) ◽  
pp. 416-434
Author(s):  
Chen-Ping Yu ◽  
Huidong Liu ◽  
Dimitrios Samaras ◽  
Gregory J. Zelinsky

2018 ◽  
Vol 30 (3) ◽  
pp. 432-447 ◽  
Author(s):  
Lin Wang ◽  
Peter Hagoort ◽  
Ole Jensen

Readers and listeners actively predict upcoming words during language processing. These predictions might serve to support the unification of incoming words into the sentence context and thus rely on interactions between areas in the language network. In the current magnetoencephalography study, participants read sentences that varied in contextual constraint, so that the predictability of the sentence-final words was either high or low. Before the sentence-final words, we observed stronger alpha power suppression for the highly constraining than for the low-constraining sentences in the left inferior frontal cortex, left posterior temporal region, and visual word form area. Importantly, alpha power in the temporal region and visual word form area correlated negatively with left frontal gamma power for the highly constraining sentences. We suggest that this correlation between the alpha power decrease in temporal language areas and left prefrontal gamma power reflects the initiation of an anticipatory unification process in the language network.
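The core analysis, estimating band-limited power per trial and correlating it across trials, can be sketched on synthetic single-channel data. The sampling rate, band limits, and the simulated alpha-gamma coupling below are assumptions for illustration, not the study's MEG pipeline:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, n_samples, n_trials = 250, 500, 40          # 2 s trials at 250 Hz (assumed)
t = np.arange(n_samples) / fs

def band_power(signal, fs, lo, hi):
    """Mean power of the FFT bins falling inside [lo, hi] Hz."""
    freqs = np.fft.rfftfreq(signal.shape[-1], d=1 / fs)
    spectrum = np.abs(np.fft.rfft(signal, axis=-1)) ** 2
    band = (freqs >= lo) & (freqs <= hi)
    return spectrum[..., band].mean(axis=-1)

# Simulate trials in which *less* alpha goes with *more* gamma,
# the direction of the effect reported for constraining sentences.
alpha_amp = rng.uniform(0.5, 1.5, n_trials)
gamma_amp = 2.0 - alpha_amp + 0.1 * rng.normal(size=n_trials)
trials = (alpha_amp[:, None] * np.sin(2 * np.pi * 10 * t)
          + gamma_amp[:, None] * np.sin(2 * np.pi * 75 * t)
          + 0.05 * rng.normal(size=(n_trials, n_samples)))

alpha_power = band_power(trials, fs, 8, 12)     # alpha band
gamma_power = band_power(trials, fs, 60, 90)    # gamma band
r = np.corrcoef(alpha_power, gamma_power)[0, 1] # negative by construction
```

With the simulated coupling, the across-trial correlation `r` comes out clearly negative, mirroring the reported alpha-gamma relation.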


2018 ◽  
Author(s):  
Chen-Ping Yu ◽  
Huidong Liu ◽  
Dimitris Samaras ◽  
Gregory Zelinsky

Recently, we proposed that people represent object categories using category-consistent features (CCFs): those features that occur both frequently and consistently across a category's exemplars [70]. Here we designed a Convolutional Neural Network (CNN) modeled after the primate ventral stream (VsNet) and used it to extract CCFs from 68 categories of objects spanning a three-level category hierarchy. We evaluated VsNet against people searching for the same targets from the same 68 categories. Not only did VsNet replicate our previous report of stronger attention guidance to subordinate-level targets; with its more powerful CNN-CCFs it was also able to predict attention control to individual target categories: the more CNN-CCFs extracted for a category, the faster gaze was directed to the target. We also probed VsNet to determine where in its network of layers these attention control signals originate. We found that CCFs extracted from VsNet's V1 layer contributed most to guiding attention to targets cued at the subordinate (e.g., police car) and basic (e.g., car) levels, but that guidance to superordinate-cued (e.g., vehicle) targets was strongest using CCFs from the CIT+AIT layer. We also identified the image patches eliciting the strongest filter responses from areas V4 and higher and found that they depicted representative parts of an object category (e.g., advertisements appearing on top of taxi cabs). Finally, we found that VsNet better predicted attention control than comparable CNN models, despite having fewer convolutional filters. This work shows that a brain-inspired CNN can predict goal-directed attention control by extracting and using category-consistent features.
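The CCF criterion, keeping features that respond both frequently (high mean) and consistently (low spread) across a category's exemplars, can be sketched as a simple signal-to-noise ranking. The synthetic responses and the SNR-style score below are a simplification for illustration, not the paper's full method:

```python
import numpy as np

rng = np.random.default_rng(2)
n_exemplars, n_features = 100, 50

# Synthetic filter responses of one category's exemplars (assumed data).
responses = rng.gamma(shape=2.0, scale=0.5, size=(n_exemplars, n_features))

# Plant a handful of features that are reliably strong for the category.
consistent = [3, 11, 27, 40]
responses[:, consistent] += 5.0

mean = responses.mean(axis=0)        # "frequent": high average response
std = responses.std(axis=0)          # "consistent": low variability
snr = mean / (std + 1e-9)            # frequent AND consistent -> high score

n_ccfs = 4
ccfs = np.argsort(snr)[-n_ccfs:]     # indices of the selected CCFs
```

On this toy data the ranking recovers exactly the planted features; in the paper, the number of CCFs per category is what predicts how fast gaze reaches the target.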


Water ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 664
Author(s):  
Yun Xue ◽  
Lei Zhu ◽  
Bin Zou ◽  
Yi-min Wen ◽  
Yue-hong Long ◽  
...  

For Case-II water bodies with relatively complex water quality, it is challenging to establish a chlorophyll-a concentration (Chl-a concentration) inversion model with strong applicability and high accuracy. Convolutional Neural Networks (CNNs) show excellent performance in image target recognition and natural language processing; however, little research exists on using convolutional neural networks to invert Chl-a concentration in water. Taking China's Dongting Lake as an example, 90 water samples and their spectra were collected in this study. Using eight combinations as independent variables and Chl-a concentration as the dependent variable, a CNN model was constructed to invert Chl-a concentration. The results showed that: (1) the CNN model built on the original spectrum performed worse than the CNN model built on the preprocessed spectrum; preprocessing increased the determination coefficient of the predicted samples (RP2) from 0.79 to 0.88 and reduced the root mean square error of prediction (RMSEP) from 0.61 to 0.49, indicating that preprocessing can significantly improve the inversion performance of the model; (2) among the combined models, the CNN model with Baseline1_SC (strong-correlation factors of the 500–750 nm baseline) performed best, with RP2 reaching 0.90 and RMSEP only 0.45; the eight CNN models also performed well on average, with a mean RP2 of 0.86 and a mean RMSEP of only 0.52, indicating the feasibility of applying CNNs to Chl-a concentration inversion modeling; (3) the CNN model with Baseline1_SC (RP2 = 0.90, RMSEP = 0.45) far outperformed the traditional models built on the same combination, i.e., a linear regression model (RP2 = 0.61, RMSEP = 0.72) and a partial least squares regression model (RP2 = 0.58, RMSEP = 0.95), indicating the superiority of convolutional neural network modeling for inverting water-body Chl-a concentration.
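The forward pass of a 1-D CNN regressor over a reflectance spectrum can be sketched in plain NumPy. The layer sizes, filter widths, and random weights below are illustrative assumptions, not the trained model from the paper: convolution slides over wavelength, a ReLU and global average pooling summarize each filter, and a linear head outputs a concentration.

```python
import numpy as np

def conv1d(x, kernels):
    """Valid 1-D convolution: x (n_bands,), kernels (n_filters, k)."""
    k = kernels.shape[1]
    windows = np.stack([x[i:i + k] for i in range(len(x) - k + 1)])
    return windows @ kernels.T            # (n_positions, n_filters)

def predict_chla(spectrum, kernels, w_out, b_out):
    """Conv -> ReLU -> global average pool -> linear output."""
    h = np.maximum(conv1d(spectrum, kernels), 0.0)   # ReLU
    pooled = h.mean(axis=0)                          # global avg pool
    return float(pooled @ w_out + b_out)

rng = np.random.default_rng(3)
spectrum = rng.uniform(size=200)          # reflectance in 200 bands (assumed)
kernels = rng.normal(size=(8, 5))         # 8 filters of width 5 (assumed)
w_out = rng.normal(size=8)                # linear regression head
b_out = 0.0

chla = predict_chla(spectrum, kernels, w_out, b_out)
```

In the study the weights are learned from the 90 sample/spectrum pairs; this sketch only shows the shape of the computation.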


Author(s):  
Adithya Chandregowda ◽  
Joseph R. Duffy ◽  
Mary M. Machulda ◽  
Val J. Lowe ◽  
Jennifer L. Whitwell ◽  
...  

2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Siyuan Zhao ◽  
Zhiwei Xu ◽  
Limin Liu ◽  
Mengjie Guo ◽  
Jing Yun

Convolutional neural networks (CNNs) have revolutionized the field of natural language processing, as they are considerably efficient at the semantic analysis that underlies difficult natural language processing problems in a variety of domains. Deceptive opinion detection is an important application of existing CNN models. Detection mechanisms based on CNN models have better self-adaptability and can effectively identify all kinds of deceptive opinions. Online opinions are quite short and vary in their types and content. To identify deceptive opinions effectively, we need to comprehensively study the characteristics of deceptive opinions and explore novel characteristics beyond the textual semantics and emotional polarity that have been widely used in text analysis. In this paper, we optimize the convolutional neural network model by embedding word-order characteristics in its convolution layer and pooling layer, which makes the convolutional neural network more suitable for short-text classification and deceptive opinion detection. TensorFlow-based experiments demonstrate that the proposed detection mechanism achieves more accurate deceptive opinion detection results.
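One simple way to make a text CNN sensitive to word order, sketched below under our own assumptions (the abstract does not specify the exact construction), is to append each token's normalized position to its embedding and convolve bigram filters over the sequence, so local order reaches the filters before max-pooling:

```python
import numpy as np

rng = np.random.default_rng(4)

def encode(tokens, emb):
    """Stack word embeddings and append a normalized-position column."""
    n = len(tokens)
    vecs = np.stack([emb[t] for t in tokens])        # (n, dim)
    pos = (np.arange(n) / max(n - 1, 1))[:, None]    # (n, 1) position feature
    return np.hstack([vecs, pos])                    # (n, dim + 1)

def text_cnn_features(x, filters):
    """x: (n, d); filters: (n_f, width, d). Returns max-pooled features."""
    n_f, width, d = filters.shape
    n = x.shape[0]
    maps = np.empty((n - width + 1, n_f))
    for i in range(n - width + 1):                   # slide over n-grams
        maps[i] = np.tensordot(filters, x[i:i + width],
                               axes=([1, 2], [0, 1]))
    return maps.max(axis=0)                          # max pool per filter

dim = 8
vocab = {"great": rng.normal(size=dim),              # toy embeddings
         "not": rng.normal(size=dim),
         "stay": rng.normal(size=dim)}
filters = rng.normal(size=(16, 2, dim + 1))          # 16 bigram filters

f1 = text_cnn_features(encode(["not", "great", "stay"], vocab), filters)
f2 = text_cnn_features(encode(["great", "not", "stay"], vocab), filters)
order_sensitive = not np.allclose(f1, f2)            # reordering changes features
```

A bag-of-words pooling would give `f1 == f2` here; the bigram windows plus the position column are what let the network distinguish "not great" from "great not", the kind of order cue that matters for short deceptive reviews.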

