On Losses for Modern Language Models

Author(s):  
Stéphane Aroca-Ouellette ◽  
Frank Rudzicz
2021 ◽  
Vol 72 ◽  
pp. 1343-1384
Author(s):  
Vassilina Nikoulina ◽  
Maxat Tezekbayev ◽  
Nuradil Kozhakhmet ◽  
Madina Babazhanova ◽  
Matthias Gallé ◽  
...  

There is an ongoing debate in the NLP community over whether modern language models contain linguistic knowledge, recovered through so-called probes. In this paper, we study whether linguistic knowledge is a necessary condition for the good performance of modern language models, which we call the rediscovery hypothesis. First, we show that language models that are significantly compressed but perform well on their pretraining objectives retain good scores when probed for linguistic structures. This result supports the rediscovery hypothesis and leads to the second contribution of our paper: an information-theoretic framework that relates language modeling objectives to linguistic information. This framework also provides a metric to measure the impact of linguistic information on the word prediction task. We reinforce our analytical results with various experiments, both on synthetic and on real NLP tasks in English.


2020 ◽  
Author(s):  
Charlotte Caucheteux ◽  
Jean-Rémi King

Deep learning has recently allowed substantial progress in language tasks such as translation and completion. Do such models process language similarly to humans, and is this similarity driven by systematic structural, functional and learning principles? To address these issues, we tested whether the activations of 7,400 artificial neural networks trained on image, word and sentence processing linearly map onto the hierarchy of human brain responses elicited during a reading task, using source-localized magneto-encephalography (MEG) recordings of one hundred and four subjects. Our results confirm that visual, word and language models sequentially correlate with distinct areas of the left-lateralized cortical hierarchy of reading. However, only specific subsets of these models converge towards brain-like representations during their training. Specifically, when the algorithms are trained on language modeling, their middle layers become increasingly similar to the late responses of the language network in the brain. By contrast, input and output word embedding layers often diverge from brain activity during training. These differences are primarily rooted in the sustained and bilateral responses of the temporal and frontal cortices. Together, these results suggest that the compositional - but not the lexical - representations of modern language models converge to a brain-like solution.


Author(s):  
Alexander Meduna ◽  
Ondřej Soukup

PMLA ◽  
1935 ◽  
Vol 50 (4) ◽  
pp. 1343-1343

The fifty-second meeting of the Modern Language Association of America was held, on the invitation of the University of Cincinnati, at Cincinnati, Ohio, Monday, Tuesday, and Wednesday, December 30 and 31, 1935, and January 1, 1936. The Association headquarters were in the Netherland Plaza Hotel, where all meetings were held except those of Tuesday morning and afternoon. These took place at the University of Cincinnati. Registration cards at headquarters were signed by about 900, though a considerably larger number of members were in attendance. The Local Committee estimated the attendance at not less than 1400. This Committee consisted of Professor Frank W. Chandler, Chairman; Professor Edwin H. Zeydel; Professor Phillip Ogden; Mr. John J. Rowe (for the Directors); and Mr. Joseph S. Graydon (for the Alumni).


2020 ◽  
Vol 13 (2) ◽  
pp. 189-210
Author(s):  
Artemis Alexiadou

This paper discusses the formation of synthetic compounds with proper names. While these are possible in English, Greek disallows such formations. However, earlier stages of the language allowed such compounds, and in the modern language, formations of this type are possible as long as they contain heads that are either bound roots or root-derived nominals of Classical Greek origin. The paper builds on the following ingredients: a) proper names are phrases; b) synthetic compounding in Modern Greek involves incorporation, and thus proper names cannot incorporate; c) by contrast, English synthetic compounds involve phrasal movement, and thus proper names can appear within compounds in this language. It is shown that in earlier Greek, proper names had the same status as their English counterparts, hence the possibility of synthetic compounds with proper names. It is further argued that the formations that involve bound/archaic roots are actually cases of either root compounding or root affixation and not synthetic compounds.

