Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?

2021 ◽  
Vol 9 ◽  
pp. 1047-1060
Author(s):  
William Merrill ◽  
Yoav Goldberg ◽  
Roy Schwartz ◽  
Noah A. Smith

Abstract: Language models trained on billions of tokens have recently led to unprecedented results on many NLP tasks. This success raises the question of whether, in principle, a system can ever “understand” raw text without access to some form of grounding. We formally investigate the abilities of ungrounded systems to acquire meaning. Our analysis focuses on the role of “assertions”: textual contexts that provide indirect clues about the underlying semantics. We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence. We find that assertions enable semantic emulation of languages that satisfy a strong notion of semantic transparency. However, for classes of languages where the same expression can take different values in different contexts, we show that emulation can become uncomputable. Finally, we discuss differences between our formal model and natural language, exploring how our results generalize to a modal setting and other semantic relations. Together, our results suggest that assertions in code or language do not provide sufficient signal to fully emulate semantic representations. We formalize ways in which ungrounded language models appear to be fundamentally limited in their ability to “understand”.

Psihologija ◽  
2013 ◽  
Vol 46 (4) ◽  
pp. 455-478 ◽  
Author(s):  
Hien Pham ◽  
Harald Baayen

According to the CARIN theory of Gagné and Shoben (1997), conceptual relations play an important role in compound interpretation. This study develops three measures gauging the role of conceptual relations, and pits these measures against measures based on latent semantic analysis (Landauer & Dumais, 1997). The CARIN measures successfully predict response latencies in a familiarity categorization task, in a semantic transparency task, and in visual lexical decision. Of the measures based on latent semantic analysis, only a measure orthogonal to the conceptual relations, which instead gauges the extent to which the concepts for the compound’s head and the compound itself are discriminated, also reached significance. Results further indicate that in tasks requiring careful assessment of the meaning of the compound, general knowledge of conceptual relations plays a central role, whereas in the lexical decision task, attention shifts to co-activated meanings and the specifics of the conceptual relations realized in the compound’s modifier family.


Author(s):  
Rinalds Vīksna ◽  
Inguna Skadiņa

Transformer-based language models pre-trained on large corpora have demonstrated good results on multiple natural language processing tasks, including named entity recognition (NER), for widely used languages. In this paper, we investigate the role of BERT models in the NER task for Latvian. We introduce a BERT model pre-trained on Latvian language data. We demonstrate that the Latvian BERT model, pre-trained on large Latvian corpora, achieves better results than multilingual BERT (81.91 F1-measure on average vs 78.37 for M-BERT on a dataset with nine named entity types, and 79.72 vs 78.83 on another dataset with seven types) and outperforms previously developed Latvian NER systems.


2019 ◽  
Vol 23 (4) ◽  
pp. 797-826 ◽  
Author(s):  
TINE BREBAN ◽  
JULIA KOLKMANN ◽  
JOHN PAYNE

In this article we investigate the role of semantic relations in grammatical alternations. The specific alternation we look at is that between the proper name modifier construction, e.g. the Obama government, and the determiner genitive, e.g. Obama's government. Through the use of an experimental study in which participants were asked to rate the naturalness of the two constructions in 20 attested natural language contexts and provide paraphrases of the semantic relations in question, we tested when the two constructions alternate and whether either construction expresses semantic relations that block alternation. Our initial finding is that none of the relations we studied is categorically associated with only one of the constructions, but that certain relations – notably possession and name – are far more preferentially associated with determiner genitives and proper name modifiers respectively. Despite these ‘default’ associations, participants nevertheless identified a range of possible interpretations for many of the examples, meaning that our study simultaneously supports the opposing theoretical views of default relations and semantic underspecification. Further, our study validates the inclusion of semantic relations in genitive alternation studies as a major factor despite the notorious difficulties in their operationalisation. Animacy distinctions, although more straightforward to codify, appear to be of lesser importance. Methodologically, our study shows the value of an experimental approach as a corrective to researcher intuitions about the identification of semantic relations in context.


2020 ◽  
Author(s):  
Vadim V. Korolev ◽  
Artem Mitrofanov ◽  
Kirill Karpov ◽  
Valery Tkachenko

The main advantage of modern natural language processing methods is the ability to turn an amorphous human-readable task into a strict mathematical form. This makes it possible to extract chemical data and insights from articles and to find new semantic relations. We propose a universal engine for processing chemical and biological texts. We successfully tested it on various use cases and applied it to the search for a therapeutic agent for COVID-19 by analyzing the PubMed archive.


Author(s):  
Santiago Zanella-Béguelin ◽  
Lukas Wutschitz ◽  
Shruti Tople ◽  
Victor Rühle ◽  
Andrew Paverd ◽  
...  

Science ◽  
2021 ◽  
Vol 371 (6526) ◽  
pp. 284-288 ◽  
Author(s):  
Brian Hie ◽  
Ellen D. Zhong ◽  
Bonnie Berger ◽  
Bryan Bryson

The ability for viruses to mutate and evade the human immune system and cause infection, called viral escape, remains an obstacle to antiviral and vaccine development. Understanding the complex rules that govern escape could inform therapeutic design. We modeled viral escape with machine learning algorithms originally developed for human natural language. We identified escape mutations as those that preserve viral infectivity but cause a virus to look different to the immune system, akin to word changes that preserve a sentence’s grammaticality but change its meaning. With this approach, language models of influenza hemagglutinin, HIV-1 envelope glycoprotein (HIV Env), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) Spike viral proteins can accurately predict structural escape patterns using sequence data alone. Our study represents a promising conceptual bridge between natural language and viral evolution.


2010 ◽  
Vol 22 (9) ◽  
pp. 1955-1969 ◽  
Author(s):  
Atira S. Bick ◽  
Ram Frost ◽  
Gadi Goelman

Is morphology a discrete and independent element of lexical structure or does it simply reflect a fine-tuning of the system to the statistical correlation that exists among orthographic and semantic properties of words? Hebrew provides a unique opportunity to examine morphological processing in the brain because of its rich morphological system. In an fMRI masked priming experiment, we investigated the neural networks involved in implicit morphological processing in Hebrew. In the lMFG and lIFG, activation was found to be significantly reduced when the primes were morphologically related to the targets. This effect was not influenced by the semantic transparency of the morphological prime, and was not found in the semantic or orthographic condition. Additional morphologically related decrease in activation was found in the lIPL, where activation was significantly modulated by semantic transparency. Our findings regarding implicit morphological processing suggest that morphology is an automatic and distinct aspect of visually processing words. These results also coincide with the behavioral data previously obtained demonstrating the central role of morphological processing in reading Hebrew.


Author(s):  
Ekaterina Savitskaya ◽  

In the field of cognitive linguistics it is accepted that, before developing its capacity for abstract and theoretical thought, the human mind went through the stage of reflecting reality through concrete images and thus has inherited old cognitive patterns. Even abstract notions of the modern civilization are based on traditional concrete images, and it is all fixed in natural language units. By way of illustration, the author analyzes the cognitive pattern “cleanness / dirtiness” as a constituent part of the English linguoculture, looking at the whole range of its verbal realization and demonstrating its influence on language-based thinking and modeling of reality. Comparing meanings of language units with their inner forms enabled the author to establish the connection between abstract notions and concrete images within cognitive patterns. Using the method of internal comparison and applying the results of etymological reconstruction of language units’ inner form made it possible to see how the world is viewed by representatives of the English linguoculture. Apparently, in the English linguoculture images of cleanness / dirtiness symbolize mainly two thematic areas: that of morality and that of renewal. Since every ethnic group has its own axiological dominants (key values) that determine the expressiveness of verbal invectives, one can draw the conclusion that people perceive and comprehend world fragments through the prism of mental stereotypes fixed in the inner form of language units. Sometimes, in relation to specific language units, a conflict arises between the inner form which retains traditional thinking and a meaning that reflects modern reality. Still, linguoculture is a constantly evolving entity, and its development entails breaking established stereotypes and creating new ones. Linguistically, the victory of the new over the old is manifested in the “dying out” of the verbal support for previous cognitive patterns, which leads to “reprogramming” (“recoding”) of linguoculture representatives’ mentality.

