Emergence of a compositional neural code for written words: Recycling of a convolutional neural network for reading

2021 ◽  
Vol 118 (46) ◽  
pp. e2104779118
Author(s):  
T. Hannagan ◽  
A. Agrawal ◽  
L. Cohen ◽  
S. Dehaene

The visual word form area (VWFA) is a region of human inferotemporal cortex that emerges at a fixed location in the occipitotemporal cortex during reading acquisition and systematically responds to written words in literate individuals. According to the neuronal recycling hypothesis, this region arises through the repurposing, for letter recognition, of a subpart of the ventral visual pathway initially involved in face and object recognition. Furthermore, according to the biased connectivity hypothesis, its reproducible localization is due to preexisting connections from this subregion to areas involved in spoken-language processing. Here, we evaluate those hypotheses in an explicit computational model. We trained a deep convolutional neural network of the ventral visual pathway, first to categorize pictures and then to recognize written words invariantly for case, font, and size. We show that the model can account for many properties of the VWFA, particularly when a subset of units possesses a biased connectivity to word output units. The network develops a sparse, invariant representation of written words, based on a restricted set of reading-selective units. Their activation mimics several properties of the VWFA, and their lesioning causes a reading-specific deficit. The model predicts that, in literate brains, written words are encoded by a compositional neural code with neurons tuned either to individual letters and their ordinal position relative to word start or word ending or to pairs of letters (bigrams).
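The compositional code predicted in the last sentence of this abstract can be made concrete with a small sketch. It is an illustration only, not the authors' model: the unit names, the `max_pos` limit, and the restriction to adjacent letter pairs are all assumptions introduced here.

```python
def compositional_code(word, max_pos=3):
    """Return the set of hypothetical units that would be active for `word`.

    Assumed unit types (illustrative only):
      ("from_start", k, letter): letter at ordinal position k counted from word start
      ("from_end", k, letter):   letter at ordinal position k counted from word ending
      ("bigram", pair):          an adjacent pair of letters
    """
    w = word.lower()  # lower-casing stands in for case invariance
    units = set()
    for i, ch in enumerate(w):
        if i < max_pos:
            units.add(("from_start", i + 1, ch))
        k_from_end = len(w) - i
        if k_from_end <= max_pos:
            units.add(("from_end", k_from_end, ch))
    for a, b in zip(w, w[1:]):
        units.add(("bigram", a + b))
    return units


if __name__ == "__main__":
    for unit in sorted(compositional_code("Cat"), key=str):
        print(unit)
```

Under these assumptions, two words that share letters at the same position relative to word start or ending, or that share a letter pair, activate overlapping sets of units, which is what makes the code compositional.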

2021 ◽  
Author(s):  
T. Hannagan ◽  
A. Agrawal ◽  
L. Cohen ◽  
S. Dehaene

Abstract: The visual word form area (VWFA) is a region of human inferotemporal cortex that emerges at a fixed location in occipitotemporal cortex during reading acquisition, and systematically responds to written words in literate individuals. According to the neuronal recycling hypothesis, this region arises through the repurposing, for letter recognition, of a subpart of the ventral visual pathway initially involved in face and object recognition. Furthermore, according to the biased connectivity hypothesis, its universal localization is due to pre-existing connections from this subregion to areas involved in spoken language processing. Here, we evaluate those hypotheses in an explicit computational model. We trained a deep convolutional neural network of the ventral visual pathway, first to categorize pictures, and then to recognize written words invariantly for case, font and size. We show that the model can account for many properties of the VWFA, particularly when a subset of units possesses a biased connectivity to word output units. The network develops a sparse, invariant representation of written words, based on a restricted set of reading-selective units. Their activation mimics several properties of the VWFA, and their lesioning causes a reading-specific deficit. Our simulation fleshes out the neuronal recycling hypothesis and makes several testable predictions concerning the neural code for written words.


2019 ◽  
Vol 27 (5-8) ◽  
pp. 416-434
Author(s):  
Chen-Ping Yu ◽  
Huidong Liu ◽  
Dimitrios Samaras ◽  
Gregory J. Zelinsky

2018 ◽  
Author(s):  
Chen-Ping Yu ◽  
Huidong Liu ◽  
Dimitris Samaras ◽  
Gregory Zelinsky

Abstract: Recently we proposed that people represent object categories using category-consistent features (CCFs), those features that occur both frequently and consistently across a category's exemplars [70]. Here we designed a Convolutional Neural Network (CNN) after the primate ventral stream (VsNet) and used it to extract CCFs from 68 categories of objects spanning a three-level category hierarchy. We evaluated VsNet against people searching for the same targets from the same 68 categories. Not only did VsNet replicate our previous report of stronger attention guidance to subordinate-level targets, but with its more powerful CNN-CCFs it was also able to predict attention control to individual target categories: the more CNN-CCFs extracted for a category, the faster gaze was directed to the target. We also probed VsNet to determine where in its network of layers these attention control signals originate. We found that CCFs extracted from VsNet's V1 layer contributed most to guiding attention to targets cued at the subordinate (e.g., police car) and basic (e.g., car) levels, but that guidance to superordinate-cued (e.g., vehicle) targets was strongest using CCFs from the CIT+AIT layer. We also identified the image patches eliciting the strongest filter responses from areas V4 and higher and found that they depicted representative parts of an object category (e.g., advertisements appearing on top of taxi cabs). Finally, we found that VsNet better predicted attention control than comparable CNN models, despite having fewer convolutional filters. This work shows that a brain-inspired CNN can predict goal-directed attention control by extracting and using category-consistent features.
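One simple way to read "frequently and consistently" is as a high mean response combined with a low spread across a category's exemplars. The sketch below scores each feature by the ratio of the two. The scoring rule, the `threshold` value, and the function name are assumptions made for illustration; the abstract does not specify how VsNet selects CCFs.

```python
from statistics import mean, pstdev


def category_consistent_features(exemplars, threshold=2.0, eps=1e-9):
    """Return indices of features that respond strongly and consistently.

    `exemplars` is a list of equal-length feature-response vectors, one per
    exemplar image of a single category. A feature is kept when its mean
    response divided by its standard deviation across exemplars reaches
    `threshold` (an assumed, illustrative criterion).
    """
    n_features = len(exemplars[0])
    selected = []
    for k in range(n_features):
        values = [ex[k] for ex in exemplars]
        score = mean(values) / (pstdev(values) + eps)
        if score >= threshold:
            selected.append(k)
    return selected


if __name__ == "__main__":
    # Three hypothetical exemplars, three hypothetical filter responses each.
    taxi_exemplars = [
        [0.9, 0.1, 0.5],
        [1.0, 0.9, 0.0],
        [1.1, 0.0, 0.1],
    ]
    print(category_consistent_features(taxi_exemplars))
```

In this toy input only the first feature is both strong and stable across exemplars, so it is the only one retained.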


Water ◽  
2021 ◽  
Vol 13 (5) ◽  
pp. 664
Author(s):  
Yun Xue ◽  
Lei Zhu ◽  
Bin Zou ◽  
Yi-min Wen ◽  
Yue-hong Long ◽  
...  

For Case-II water bodies with relatively complex water qualities, it is challenging to establish a chlorophyll-a concentration (Chl-a concentration) inversion model with strong applicability and high accuracy. Convolutional neural networks (CNNs) show excellent performance in image target recognition and natural language processing. However, little research exists on the inversion of Chl-a concentration in water using convolutional neural networks. Taking China's Dongting Lake as an example, 90 water samples and their spectra were collected in this study. Using eight combinations as independent variables and Chl-a concentration as the dependent variable, a CNN model was constructed to invert Chl-a concentration. The results showed that: (1) the CNN model of the original spectrum has a worse inversion effect than the CNN model of the preprocessed spectrum. The determination coefficient (RP2) of the predicted sample is increased from 0.79 to 0.88, and the root mean square error (RMSEP) of the predicted sample is reduced from 0.61 to 0.49, indicating that preprocessing can significantly improve the inversion effect of the model; (2) among the combined models, the CNN model with Baseline1_SC (strong correlation factor of 500–750 nm baseline) has the best effect, with RP2 reaching 0.90 and RMSEP only 0.45. The average inversion effect of the eight CNN models is good, with an average RP2 of 0.86 and an RMSEP of only 0.52, indicating the feasibility of applying CNN to Chl-a concentration inversion modeling; (3) the performance of the CNN model (Baseline1_SC (RP2 = 0.90, RMSEP = 0.45)) was far better than that of the traditional models of the same combination, i.e., the linear regression model (RP2 = 0.61, RMSEP = 0.72) and the partial least squares regression model (Baseline1_SC (RP2 = 0.58, RMSEP = 0.95)), indicating the superiority of convolutional neural network inversion modeling of water body Chl-a concentration.
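The two figures of merit quoted throughout this abstract can be computed as in the sketch below. It assumes the common definitions (coefficient of determination as one minus the ratio of residual to total sum of squares, and root mean square error over the prediction set); the abstract does not state which variant of the determination coefficient was used, and the sample values are invented.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical measured and predicted Chl-a concentrations.
    measured = [1.0, 2.0, 3.0, 4.0]
    predicted = [2.0, 2.0, 3.0, 3.0]
    print(r_squared(measured, predicted), rmse(measured, predicted))
```

A higher determination coefficient and a lower RMSE on the prediction samples both indicate a better inversion model, which is how the comparisons in the abstract should be read.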


2018 ◽  
Vol 2018 ◽  
pp. 1-9 ◽  
Author(s):  
Siyuan Zhao ◽  
Zhiwei Xu ◽  
Limin Liu ◽  
Mengjie Guo ◽  
Jing Yun

Convolutional neural networks (CNNs) have revolutionized the field of natural language processing and are considerably efficient at the semantic analysis that underlies difficult natural language processing problems in a variety of domains. Deceptive opinion detection is an important application of existing CNN models. A detection mechanism based on CNN models has better self-adaptability and can effectively identify all kinds of deceptive opinions. Online opinions are quite short and vary in their types and content. In order to effectively identify deceptive opinions, we need to comprehensively study the characteristics of deceptive opinions and explore novel characteristics besides the textual semantics and emotional polarity that have been widely used in text analysis. In this paper, we optimize the convolutional neural network model by embedding word order characteristics in its convolution layer and pooling layer, which makes the convolutional neural network more suitable for short text classification and deceptive opinion detection. The TensorFlow-based experiments demonstrate that the proposed detection mechanism achieves more accurate deceptive opinion detection results.


2020 ◽  
Author(s):  
Haider Al-Tahan ◽  
Yalda Mohsenzadeh

Abstract: While vision evokes a dense network of feedforward and feedback neural processes in the brain, visual processes are primarily modeled with feedforward hierarchical neural networks, leaving the computational role of feedback processes poorly understood. Here, we developed a generative autoencoder neural network model and adversarially trained it on a categorically diverse data set of images. We hypothesized that the feedback processes in the ventral visual pathway can be represented by reconstruction of the visual information performed by the generative model. We compared representational similarity of the activity patterns in the proposed model with temporal (magnetoencephalography) and spatial (functional magnetic resonance imaging) visual brain responses. The proposed generative model identified two segregated neural dynamics in the visual brain: a temporal hierarchy of processes transforming low level visual information into high level semantics in the feedforward sweep, and temporally later dynamics of inverse processes reconstructing low level visual information from a high level latent representation in the feedback sweep. Our results add to previous studies on neural feedback processes by presenting a new insight into the algorithmic function and the information carried by the feedback processes in the ventral visual pathway.
Author summary: It has been shown that the ventral visual cortex consists of a dense network of regions with feedforward and feedback connections. The feedforward path processes visual inputs along a hierarchy of cortical areas that starts in early visual cortex (an area tuned to low level features, e.g. edges/corners) and ends in inferior temporal cortex (an area that responds to higher level categorical contents, e.g. faces/objects). The feedback connections, in contrast, modulate neuronal responses in this hierarchy by broadcasting information from higher to lower areas. In recent years, deep neural network models trained on object recognition tasks have achieved human-level performance and shown activation patterns similar to those of the visual brain. In this work, we developed a generative neural network model that consists of encoding and decoding sub-networks. By comparing this computational model with the human brain temporal (magnetoencephalography) and spatial (functional magnetic resonance imaging) response patterns, we found that the encoder processes resemble the brain feedforward processing dynamics and the decoder shares similarity with the brain feedback processing dynamics. These results provide an algorithmic insight into the spatiotemporal dynamics of feedforward and feedback processes in biological vision.
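The comparison of representational similarity between model and brain can be sketched as follows. This is a generic representational similarity analysis under assumed choices, not the authors' pipeline: dissimilarity is taken as one minus the Pearson correlation between condition patterns, the two dissimilarity matrices are compared by rank (Spearman) correlation, and all function names and data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper(patterns):
    """Upper triangle of the dissimilarity matrix (1 - r) between conditions."""
    n = len(patterns)
    return [1.0 - pearson(patterns[i], patterns[j])
            for i in range(n) for j in range(i + 1, n)]


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rsa_score(model_patterns, brain_patterns):
    """Spearman correlation between model and brain dissimilarity structure."""
    return pearson(ranks(rdm_upper(model_patterns)),
                   ranks(rdm_upper(brain_patterns)))


if __name__ == "__main__":
    # Hypothetical activity patterns: one row per image, one column per unit/sensor.
    model = [[1, 2, 3, 4], [1, 2, 3, 5], [4, 3, 2, 1], [2, 1, 4, 3]]
    brain = [[2 * v + 1 for v in row] for row in model]
    print(rsa_score(model, brain))
```

Applying such a score to each model layer and each brain time point or region is one way to ask where encoder-like and decoder-like representations best match the measured responses.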


2022 ◽  
pp. 155-170
Author(s):  
Lap-Kei Lee ◽  
Kwok Tai Chui ◽  
Jingjing Wang ◽  
Yin-Chun Fung ◽  
Zhanhui Tan

Our dependence on the Internet in daily life is ever-growing, which provides an opportunity to discover valuable and subjective information using advanced techniques such as natural language processing and artificial intelligence. In this chapter, the research focus is a convolutional neural network for three-class (positive, neutral, and negative) cross-domain sentiment analysis. The model is enhanced in two ways. First, a similarity label method facilitates the management between the source and target domains to generate more labelled data. Second, term frequency-inverse document frequency (TF-IDF) and latent semantic indexing (LSI) are employed to compute the similarity between source and target domains. Performance evaluation is conducted using three datasets: beauty reviews, toys reviews, and phone reviews. The proposed method enhances the accuracy by 4.3-7.6% and reduces the training time by 50%. The limitations of the research work are discussed and serve as the rationale for future research directions.
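The TF-IDF part of the similarity computation can be sketched as below. It is a minimal illustration under assumed choices (whitespace tokenization, a smoothed inverse document frequency, cosine similarity); the LSI step is omitted, and the example reviews are invented rather than taken from the chapter's datasets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict from term to weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) keeps every weight positive.
    idf = {term: math.log((1 + n_docs) / (1 + df)) + 1.0
           for term, df in doc_freq.items()}
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({term: (count / len(tokens)) * idf[term]
                        for term, count in counts.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    reviews = [
        "great phone battery",       # hypothetical source-domain review
        "great toy for kids",        # hypothetical target-domain review
        "phone battery lasts long",  # hypothetical target-domain review
    ]
    vecs = tfidf_vectors(reviews)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Target-domain texts that score as most similar to labelled source-domain texts are natural candidates for inheriting a label, which is one plausible reading of how a similarity measure could support generating more labelled data.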


2020 ◽  
pp. 1058-1071
Author(s):  
D. T. Mane ◽  
U. V. Kulkarni

With advances in the computer science field, various new data science techniques have emerged. The Convolutional Neural Network (CNN) is one of the Deep Learning techniques that has captured a lot of attention as far as real-world applications are concerned. It is a multilayer architecture whose hidden layers detect features on their own, so it does not require any handcrafted features. The remarkable increase in the computational power of Convolutional Neural Networks is due to the use of graphics processing units, parallel computing, and the availability of large amounts of data in a variety of forms. This paper gives a broad view of various supervised Convolutional Neural Network applications and their salient features, mainly in the fields of Computer Vision for Pattern and Object Detection, Natural Language Processing, Speech Recognition, and Medical Image Analysis.

