Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

Author(s):  
Fréderic Godin ◽  
Kris Demuynck ◽  
Joni Dambre ◽  
Wesley De Neve ◽  
Thomas Demeester


Author(s):  
Rui Xia ◽  
Mengran Zhang ◽  
Zixiang Ding

The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most of the previous work considered ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on Transformer to learn the correlation between multiple clauses in a document. We further propose ways to encode the relative position and global prediction information into the Transformer, which can capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the F1 score of the state-of-the-art from 72.69% to 76.77%.
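A minimal sketch of the hierarchical encoding idea described above, written in PyTorch-style Python. All class names, hyperparameters and the position-embedding choice are illustrative assumptions, not the authors' released code; the global prediction feedback between Transformer layers is omitted for brevity.

```python
# Sketch of a lower word-level RNN encoder plus an upper clause-level Transformer
# encoder, as described in the RTHN abstract. Names and sizes are illustrative only.
import torch
import torch.nn as nn

class HierarchicalClauseEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, hidden=100, max_clauses=75):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Lower, word-level encoder: BiLSTM over the words of each clause.
        self.word_rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # Relative-position embedding added to each clause representation.
        self.pos_embed = nn.Embedding(max_clauses, 2 * hidden)
        # Upper, clause-level encoder: Transformer layers over the clause sequence.
        layer = nn.TransformerEncoderLayer(d_model=2 * hidden, nhead=4, batch_first=True)
        self.clause_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(2 * hidden, 2)  # cause / non-cause per clause

    def forward(self, docs):
        # docs: (batch, n_clauses, n_words) tensor of word ids
        b, c, w = docs.shape
        words = self.embed(docs.view(b * c, w))           # (b*c, w, emb_dim)
        _, (h, _) = self.word_rnn(words)                  # h: (2, b*c, hidden)
        clause_repr = torch.cat([h[0], h[1]], dim=-1)     # (b*c, 2*hidden)
        clause_repr = clause_repr.view(b, c, -1)
        pos = torch.arange(c, device=docs.device).unsqueeze(0).expand(b, c)
        clause_repr = clause_repr + self.pos_embed(pos)   # inject relative position
        encoded = self.clause_encoder(clause_repr)        # clause-to-clause attention
        return self.classifier(encoded)                   # (b, n_clauses, 2) logits
```

The point of the sketch is the two-level structure: clauses are first compressed by the word-level RNN, and only then does the Transformer attend across clauses, so causal relations between clauses can be modelled jointly rather than classified independently.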


2020 ◽  
pp. 1-22
Author(s):  
Noe Casas ◽  
Marta R. Costa-jussà ◽  
José A. R. Fonollosa ◽  
Juan A. Alonso ◽  
Ramón Fanlo

Abstract: Neural networks applied to machine translation need a finite vocabulary to express textual information as a sequence of discrete tokens. The currently dominant subword vocabularies exploit statistically discovered common parts of words to achieve the flexibility of character-based vocabularies without delegating the whole learning of word formation to the neural network. However, they trade this for the inability to apply word-level token associations, which limits their use in semantically rich areas, prevents some transfer learning approaches such as cross-lingual pretrained embeddings, and reduces their interpretability. In this work, we propose new hybrid linguistically grounded vocabulary definition strategies that keep both the advantages of subword vocabularies and the word-level associations, enabling neural networks to profit from the derived benefits. We test the proposed approaches in both morphologically rich and poor languages, showing that, for the former, the quality in the translation of out-of-domain texts is improved with respect to a strong subword baseline.
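A toy illustration of the hybrid idea, assuming one plausible realisation: words that the vocabulary keeps whole are emitted as single word-level tokens, and everything else falls back to subword pieces. The word list, the fallback encoder and the `@@` marker convention are assumptions for the example, not the authors' exact strategy.

```python
# Hybrid tokenization sketch: word-level tokens where possible, subwords otherwise.
def hybrid_tokenize(sentence, word_vocab, subword_encode):
    """word_vocab: set of full words kept as single tokens.
    subword_encode: callable mapping an out-of-vocabulary word to subword pieces."""
    tokens = []
    for word in sentence.split():
        if word in word_vocab:
            tokens.append(word)                  # word-level token: keeps word associations
        else:
            tokens.extend(subword_encode(word))  # subword fallback: open vocabulary
    return tokens

# Toy usage with a trivial fallback that splits a word into '@@'-marked characters:
demo_vocab = {"the", "translation", "is"}
char_fallback = lambda w: [c + "@@" for c in w[:-1]] + [w[-1]]
print(hybrid_tokenize("the translation is unidiomatic", demo_vocab, char_fallback))
# ['the', 'translation', 'is', 'u@@', 'n@@', ..., 'c']
```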


2019 ◽  
Vol 7 ◽  
pp. 467-484 ◽  
Author(s):  
Michael Hahn ◽  
Marco Baroni

Recurrent neural networks (RNNs) have reached striking performance in many natural language processing tasks. This has renewed interest in whether these generic sequence processing devices are inducing genuine linguistic knowledge. Nearly all current analytical studies, however, initialize the RNNs with a vocabulary of known words, and feed them tokenized input during training. We present a multi-lingual study of the linguistic knowledge encoded in RNNs trained as character-level language models, on input data with word boundaries removed. These networks face a tougher and more cognitively realistic task, having to discover any useful linguistic unit from scratch based on input statistics. The results show that our “near tabula rasa” RNNs are mostly able to solve morphological, syntactic and semantic tasks that intuitively presuppose word-level knowledge, and indeed they learned, to some extent, to track word boundaries. Our study opens the door to speculations about the necessity of an explicit, rigid word lexicon in language learning and usage.
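To make the "near tabula rasa" setup concrete, here is a minimal, self-contained sketch of a character-level RNN language model trained on text whose word boundaries (whitespace) have been stripped. The toy corpus, model sizes and training loop are illustrative assumptions and not the authors' multilingual experimental setup.

```python
# Character-level language model on text with word boundaries removed (toy sketch).
import torch
import torch.nn as nn

corpus = "the cat sat on the mat the dog sat on the cat".replace(" ", "")
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in corpus])

class CharLM(nn.Module):
    def __init__(self, n_chars, emb=16, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_chars)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)                 # next-character logits at every position

model = CharLM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = data[:-1].unsqueeze(0), data[1:].unsqueeze(0)   # predict the next character
for step in range(200):
    loss = nn.functional.cross_entropy(model(x).squeeze(0), y.squeeze(0))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The probing question in the paper is then whether hidden states of such a model, trained only to predict the next character in an unsegmented stream, nonetheless encode word boundaries and word-level morphological, syntactic and semantic information.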


2020 ◽  
Vol 1 (1) ◽  
Author(s):  
Leon O H Kroczek ◽  
Thomas C Gunter

Abstract: Effective natural communication requires listeners to incorporate not only very general linguistic principles acquired over a lifetime but also other information, such as the specific individual language use of a particular interlocutor. Traditionally, research has focused on the general linguistic rules, and brain science has shown a left-hemispheric fronto-temporal brain network related to this processing. The present fMRI research explores speaker-specific individual language use, because it is unknown whether this processing is supported by similar or distinct neural structures. Twenty-eight participants listened to sentences from speakers who used either easier or more difficult language; this was done by manipulating the proportion of easy SOV vs. complex OSV sentences for each speaker. Furthermore, ambiguous probe sentences were included to test top-down influences of speaker information in the absence of syntactic structure information. We observed distinct neural processing for syntactic complexity and speaker-specific language use. Syntactic complexity correlated with left frontal and posterior temporal regions. Speaker-specific processing correlated with bilateral (right-dominant) fronto-parietal brain regions. Finally, the top-down influence of speaker information was found in frontal and striatal brain regions, suggesting a mechanism for controlled syntactic processing. These findings show distinct neural networks related to general language principles as well as speaker-specific individual language use.

