Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

Author(s):  
Fréderic Godin ◽  
Kris Demuynck ◽  
Joni Dambre ◽  
Wesley De Neve ◽  
Thomas Demeester


Author(s):  
Rui Xia ◽  
Mengran Zhang ◽  
Zixiang Ding

The emotion cause extraction (ECE) task aims at discovering the potential causes behind a certain emotion expression in a document. Techniques including rule-based methods, traditional machine learning methods and deep neural networks have been proposed to solve this task. However, most of the previous work considered ECE as a set of independent clause classification problems and ignored the relations between multiple clauses in a document. In this work, we propose a joint emotion cause extraction framework, named RNN-Transformer Hierarchical Network (RTHN), to encode and classify multiple clauses synchronously. RTHN is composed of a lower word-level encoder based on RNNs to encode multiple words in each clause, and an upper clause-level encoder based on Transformer to learn the correlation between multiple clauses in a document. We further propose ways to encode the relative position and global prediction information into the Transformer, which can capture the causality between clauses and make RTHN more efficient. We finally achieve the best performance among 12 compared systems and improve the F1 score of the state-of-the-art from 72.69% to 76.77%.
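A minimal sketch of the hierarchical encoding idea described above, written in PyTorch-style Python. All class names, hyperparameters and the position-embedding choice are illustrative assumptions, not the authors' released code; the global prediction feedback between Transformer layers is omitted for brevity.

```python
# Sketch of a lower word-level RNN encoder plus an upper clause-level Transformer
# encoder, as described in the RTHN abstract. Names and sizes are illustrative only.
import torch
import torch.nn as nn

class HierarchicalClauseEncoder(nn.Module):
    def __init__(self, vocab_size, emb_dim=200, hidden=100, max_clauses=75):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Lower, word-level encoder: BiLSTM over the words of each clause.
        self.word_rnn = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        # Relative-position embedding added to each clause representation.
        self.pos_embed = nn.Embedding(max_clauses, 2 * hidden)
        # Upper, clause-level encoder: Transformer layers over the clause sequence.
        layer = nn.TransformerEncoderLayer(d_model=2 * hidden, nhead=4, batch_first=True)
        self.clause_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(2 * hidden, 2)  # cause / non-cause per clause

    def forward(self, docs):
        # docs: (batch, n_clauses, n_words) tensor of word ids
        b, c, w = docs.shape
        words = self.embed(docs.view(b * c, w))           # (b*c, w, emb_dim)
        _, (h, _) = self.word_rnn(words)                  # h: (2, b*c, hidden)
        clause_repr = torch.cat([h[0], h[1]], dim=-1)     # (b*c, 2*hidden)
        clause_repr = clause_repr.view(b, c, -1)
        pos = torch.arange(c, device=docs.device).unsqueeze(0).expand(b, c)
        clause_repr = clause_repr + self.pos_embed(pos)   # inject relative position
        encoded = self.clause_encoder(clause_repr)        # clause-to-clause attention
        return self.classifier(encoded)                   # (b, n_clauses, 2) logits
```

The point of the sketch is the two-level structure: clauses are first compressed by the word-level RNN, and only then does the Transformer attend across clauses, so causal relations between clauses can be modelled jointly rather than classified independently.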


2020 ◽  
pp. 1-22
Author(s):  
Noe Casas ◽  
Marta R. Costa-jussà ◽  
José A. R. Fonollosa ◽  
Juan A. Alonso ◽  
Ramón Fanlo

Abstract: Neural networks applied to machine translation need a finite vocabulary to express textual information as a sequence of discrete tokens. The currently dominant subword vocabularies exploit statistically discovered common parts of words to achieve the flexibility of character-based vocabularies without delegating the whole learning of word formation to the neural network. However, they trade this for the inability to apply word-level token associations, which limits their use in semantically rich areas, prevents some transfer learning approaches such as cross-lingual pretrained embeddings, and reduces their interpretability. In this work, we propose new hybrid linguistically grounded vocabulary definition strategies that keep both the advantages of subword vocabularies and the word-level associations, enabling neural networks to profit from the derived benefits. We test the proposed approaches in both morphologically rich and poor languages, showing that, for the former, the quality in the translation of out-of-domain texts is improved with respect to a strong subword baseline.
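A toy illustration of the hybrid idea, assuming one plausible realisation: words that the vocabulary keeps whole are emitted as single word-level tokens, and everything else falls back to subword pieces. The word list, the fallback encoder and the `@@` marker convention are assumptions for the example, not the authors' exact strategy.

```python
# Hybrid tokenization sketch: word-level tokens where possible, subwords otherwise.
def hybrid_tokenize(sentence, word_vocab, subword_encode):
    """word_vocab: set of full words kept as single tokens.
    subword_encode: callable mapping an out-of-vocabulary word to subword pieces."""
    tokens = []
    for word in sentence.split():
        if word in word_vocab:
            tokens.append(word)                  # word-level token: keeps word associations
        else:
            tokens.extend(subword_encode(word))  # subword fallback: open vocabulary
    return tokens

# Toy usage with a trivial fallback that splits a word into '@@'-marked characters:
demo_vocab = {"the", "translation", "is"}
char_fallback = lambda w: [c + "@@" for c in w[:-1]] + [w[-1]]
print(hybrid_tokenize("the translation is unidiomatic", demo_vocab, char_fallback))
# ['the', 'translation', 'is', 'u@@', 'n@@', ..., 'c']
```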


2019 ◽  
Vol 7 ◽  
pp. 467-484 ◽  
Author(s):  
Michael Hahn ◽  
Marco Baroni

Recurrent neural networks (RNNs) have reached striking performance in many natural language processing tasks. This has renewed interest in whether these generic sequence processing devices are inducing genuine linguistic knowledge. Nearly all current analytical studies, however, initialize the RNNs with a vocabulary of known words, and feed them tokenized input during training. We present a multi-lingual study of the linguistic knowledge encoded in RNNs trained as character-level language models, on input data with word boundaries removed. These networks face a tougher and more cognitively realistic task, having to discover any useful linguistic unit from scratch based on input statistics. The results show that our “near tabula rasa” RNNs are mostly able to solve morphological, syntactic and semantic tasks that intuitively presuppose word-level knowledge, and indeed they learned, to some extent, to track word boundaries. Our study opens the door to speculations about the necessity of an explicit, rigid word lexicon in language learning and usage.
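To make the "near tabula rasa" setup concrete, here is a minimal, self-contained sketch of a character-level RNN language model trained on text whose word boundaries (whitespace) have been stripped. The toy corpus, model sizes and training loop are illustrative assumptions and not the authors' multilingual experimental setup.

```python
# Character-level language model on text with word boundaries removed (toy sketch).
import torch
import torch.nn as nn

corpus = "the cat sat on the mat the dog sat on the cat".replace(" ", "")
chars = sorted(set(corpus))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in corpus])

class CharLM(nn.Module):
    def __init__(self, n_chars, emb=16, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(n_chars, emb)
        self.rnn = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_chars)

    def forward(self, x):
        h, _ = self.rnn(self.embed(x))
        return self.out(h)                 # next-character logits at every position

model = CharLM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
x, y = data[:-1].unsqueeze(0), data[1:].unsqueeze(0)   # predict the next character
for step in range(200):
    loss = nn.functional.cross_entropy(model(x).squeeze(0), y.squeeze(0))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The probing question in the paper is then whether hidden states of such a model, trained only to predict the next character in an unsegmented stream, nonetheless encode word boundaries and word-level morphological, syntactic and semantic information.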


2020 ◽  
Vol 1 (1) ◽  
Author(s):  
Leon O H Kroczek ◽  
Thomas C Gunter

Abstract: Effective natural communication requires listeners to incorporate not only very general linguistic principles acquired over a lifetime but also other information, such as the specific individual language use of a particular interlocutor. Traditionally, research has focused on the general linguistic rules, and brain science has shown a left-hemispheric fronto-temporal brain network related to this processing. The present fMRI research explores speaker-specific individual language use, because it is unknown whether this processing is supported by similar or distinct neural structures. Twenty-eight participants listened to sentences from speakers who used either easier or more difficult language; this was done by manipulating the proportion of easy SOV vs. complex OSV sentences for each speaker. Furthermore, ambiguous probe sentences were included to test top-down influences of speaker information in the absence of syntactic structure information. We observed distinct neural processing for syntactic complexity and speaker-specific language use. Syntactic complexity correlated with left frontal and posterior temporal regions. Speaker-specific processing correlated with bilateral (right-dominant) fronto-parietal brain regions. Finally, the top-down influence of speaker information was found in frontal and striatal brain regions, suggesting a mechanism for controlled syntactic processing. These findings show distinct neural networks related to general language principles as well as speaker-specific individual language use.

