The Singleton Fallacy: Why Current Critiques of Language Models Miss the Point

2021 ◽  
Vol 4 ◽  
Author(s):  
Magnus Sahlgren ◽  
Fredrik Carlsson

This paper discusses the current critique of neural network-based Natural Language Understanding solutions known as language models. We argue that much of the current debate revolves around an argumentation error that we refer to as the singleton fallacy: the assumption that a concept (in this case, language, meaning, or understanding) refers to a single and uniform phenomenon, which in the current debate is assumed to be unobtainable by (current) language models. By contrast, we argue that positing some form of (mental) “unobtanium” as definiens for understanding inevitably leads to a dualistic position, and that such a position is precisely the original motivation for developing distributional methods in computational linguistics. As such, we argue that language models present a theoretically (and practically) sound approach that is our current best bet for computers to achieve language understanding. Such understanding must, however, be understood as a computational means to an end.

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1230
Author(s):  
Anda Stoica ◽  
Tibor Kadar ◽  
Camelia Lemnaru ◽  
Rodica Potolea ◽  
Mihaela Dînşoreanu

As virtual home assistants become more popular, there is an emerging need to support languages other than English. While widespread languages such as Spanish, French, or Hindi are already integrated into existing home assistants like Google Home or Alexa, integration of lesser-known languages such as Romanian is still missing. This paper explores the problem of Natural Language Understanding (NLU) applied to a Romanian home assistant. We propose a customized capsule neural network architecture that performs intent detection and slot filling jointly, and we evaluate how well it handles utterances of varying complexity. The capsule network model shows a significant improvement in intent detection compared to models built with the well-known Rasa NLU tool. Through error analysis, we observe clear error patterns that occur systematically. Variability in how a single intent can be phrased proves to be the biggest challenge for the model.
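For readers unfamiliar with the joint formulation, the sketch below shows a generic joint intent-detection and slot-filling model in PyTorch: a shared encoder feeds both an utterance-level intent head and a token-level slot head, and the two losses are summed so that both tasks train the shared parameters. This is not the paper's capsule architecture; the BiLSTM encoder, dimensions, and label counts are illustrative assumptions.

import torch
import torch.nn as nn

class JointIntentSlotModel(nn.Module):
    def __init__(self, vocab_size, n_intents, n_slots, emb_dim=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.intent_head = nn.Linear(2 * hidden, n_intents)   # one label per utterance
        self.slot_head = nn.Linear(2 * hidden, n_slots)       # one label per token

    def forward(self, token_ids):
        states, _ = self.encoder(self.embed(token_ids))        # (batch, seq, 2*hidden)
        intent_logits = self.intent_head(states.mean(dim=1))   # pooled utterance vector
        slot_logits = self.slot_head(states)                   # per-token slot scores
        return intent_logits, slot_logits

# Joint training: both losses back-propagate into the shared encoder.
model = JointIntentSlotModel(vocab_size=5000, n_intents=7, n_slots=20)
tokens = torch.randint(1, 5000, (2, 12))          # toy batch: 2 utterances, 12 tokens
intent_gold = torch.tensor([3, 5])
slot_gold = torch.randint(0, 20, (2, 12))
intent_logits, slot_logits = model(tokens)
loss = (nn.functional.cross_entropy(intent_logits, intent_gold)
        + nn.functional.cross_entropy(slot_logits.reshape(-1, 20), slot_gold.reshape(-1)))
loss.backward()

Coupling the two tasks in one model is the usual motivation for joint training: errors in either intent detection or slot filling update the shared encoder.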


2021 ◽  
Vol 11 (5) ◽  
pp. 7598-7604
Author(s):  
H. V. T. Chi ◽  
D. L. Anh ◽  
N. L. Thanh ◽  
D. Dinh

Paraphrase identification is a crucial task in natural language understanding, especially in cross-language information retrieval. Nowadays, the Multi-Task Deep Neural Network (MT-DNN) has become a state-of-the-art method that delivers outstanding results in paraphrase identification [1]. In this paper, we propose a method based on MT-DNN [2] to detect similarities between English and Vietnamese sentences. We replaced the shared layers of the original MT-DNN, swapping the original BERT [3] for pre-trained multilingual models such as M-BERT [3] or XLM-R [4], so that our model can handle cross-language (in our case, English and Vietnamese) information retrieval. We also added further tasks to gain better results. As a result, we obtained increases of 2.3% and 2.5% in evaluated accuracy and F1. The proposed method was also applied to other language pairs, yielding a 1.0%/0.7% improvement for English–German and a 0.7%/0.5% increase for English–French.
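As a minimal illustration of the encoder swap (not the authors' full multi-task MT-DNN setup), the sketch below loads a multilingual pre-trained model via the Hugging Face transformers library and scores an English-Vietnamese sentence pair with a paraphrase classification head. The checkpoint name, label count, and example sentences are assumptions, and the head would need fine-tuning before its scores mean anything.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

encoder_name = "xlm-roberta-base"   # assumed checkpoint; "bert-base-multilingual-cased" is the M-BERT analogue
tokenizer = AutoTokenizer.from_pretrained(encoder_name)
model = AutoModelForSequenceClassification.from_pretrained(encoder_name, num_labels=2)

english = "The weather is nice today."
vietnamese = "Hôm nay thời tiết đẹp."              # invented example pair
inputs = tokenizer(english, vietnamese, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits                 # shape (1, 2)
probs = torch.softmax(logits, dim=-1)               # [P(not paraphrase), P(paraphrase)]
# The classification head is randomly initialised here; in an MT-DNN-style setup it
# would be one of several task-specific heads fine-tuned jointly on the shared encoder.
print(probs)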


1998 ◽  
Vol 37 (04/05) ◽  
pp. 327-333 ◽  
Author(s):  
F. Buekens ◽  
G. De Moor ◽  
A. Waagmeester ◽  
W. Ceusters

Abstract: Natural language understanding systems have to exploit various kinds of knowledge in order to represent the meaning behind texts. Getting this knowledge in place is often such a huge enterprise that it is tempting to look for systems that can discover such knowledge automatically. We describe how the distinction between conceptual and linguistic semantics may assist in reaching this objective, provided that distinguishing between them is not done too rigorously. We present several examples to support this view and argue that in a multilingual environment, linguistic ontologies should be designed as interfaces between domain conceptualizations and linguistic knowledge bases.
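As a purely illustrative sketch of this design principle (all names and categories below are invented, not drawn from the paper), the Python snippet models lexical entries in several languages that point to a node in a linguistic ontology, which in turn interfaces with a language-independent domain concept.

from dataclasses import dataclass

@dataclass(frozen=True)
class DomainConcept:            # domain conceptualization: language-independent
    concept_id: str

@dataclass(frozen=True)
class LinguisticCategory:       # linguistic ontology: the interface layer
    label: str
    denotes: DomainConcept

@dataclass(frozen=True)
class LexicalEntry:             # linguistic knowledge base: per-language lexemes
    lemma: str
    language: str
    category: LinguisticCategory

fracture = DomainConcept("DISORDER_fracture")
fracture_noun = LinguisticCategory("noun_denoting_disorder", denotes=fracture)
lexicon = [
    LexicalEntry("fracture", "en", fracture_noun),
    LexicalEntry("fractuur", "nl", fracture_noun),
    LexicalEntry("fracture", "fr", fracture_noun),
]
# Lookup goes lemma -> linguistic category -> domain concept, so the domain
# conceptualization stays independent of any particular language's lexicon.
english_to_concept = {e.lemma: e.category.denotes for e in lexicon if e.language == "en"}
print(english_to_concept["fracture"])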


1995 ◽  
Vol 34 (04) ◽  
pp. 345-351 ◽  
Author(s):  
A. Burgun ◽  
L. P. Seka ◽  
D. Delamarre ◽  
P. Le Beux

Abstract: In medicine, as in other domains, indexing and classification are natural human tasks used for information retrieval and representation. In the medical field, encoding of patient discharge summaries is still a manual, time-consuming task. This paper describes an automated system for coding patient discharge summaries from the field of coronary diseases into the ICD-9-CM classification. The system is developed in the context of the European AIM MENELAS project, a natural-language understanding system which uses the conceptual-graph formalism. Indexing is performed using a two-step processing scheme: a first recognition stage implemented by a matching procedure, followed by a selection stage carried out according to the coding priorities. We show the general features of the necessary translation of the classification terms into the conceptual-graph model, and of compliance with the coding rules. An advantage of the system is that it provides an objective evaluation and assessment procedure for natural-language understanding.
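A schematic sketch of the two-step scheme follows, written in plain Python rather than the conceptual-graph formalism actually used in MENELAS: the first (recognition) stage keeps every ICD-9-CM candidate whose pattern matches the extracted findings, and the second (selection) stage orders the surviving candidates by coding priority. The codes, findings, and priority values are invented for illustration.

# Each candidate code is described by the findings it requires and a priority
# rank (lower = coded first); both are toy values, not real coding rules.
CODE_PATTERNS = {
    "410.x": {"requires": {"myocardial_infarction", "acute"}, "priority": 1},
    "414.x": {"requires": {"coronary_atherosclerosis"}, "priority": 2},
    "413.x": {"requires": {"angina"}, "priority": 3},
}

def recognize(findings):
    """Step 1 (matching): keep every code whose required findings are all present."""
    return [code for code, spec in CODE_PATTERNS.items() if spec["requires"] <= findings]

def select(candidates):
    """Step 2 (selection): order the surviving candidates by coding priority."""
    return sorted(candidates, key=lambda code: CODE_PATTERNS[code]["priority"])

# Findings as they might come out of the language-understanding stage:
discharge_findings = {"acute", "myocardial_infarction", "angina"}
print(select(recognize(discharge_findings)))   # ['410.x', '413.x']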

