Universal Dependencies

2021 ◽  
pp. 1-52
Author(s):  
Marie-Catherine de Marneffe ◽  
Christopher D. Manning ◽  
Joakim Nivre ◽  
Daniel Zeman

Abstract Universal dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are centrally used to explain how predicate–argument structures are encoded morphosyntactically in different languages while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.
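
As a rough illustration of the three ingredients the abstract names (part-of-speech classes, morphological features, and grammatical relations between words), the sketch below encodes one short English sentence. The `Token` class and the `arguments_of` helper are hypothetical names introduced here and are not part of any UD tool; the tags and relation labels (`DET`, `NOUN`, `VERB`, `det`, `nsubj`, `root`) follow the UD inventories.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """Hypothetical container for one annotated word (not an official UD API)."""
    idx: int      # 1-based position in the sentence
    form: str     # the word as written
    upos: str     # universal part-of-speech class
    feats: dict   # morphological features of the word
    head: int     # position of the governing word, 0 for the root
    deprel: str   # grammatical relation to the head


# "The dog barks" annotated by hand for illustration.
sentence = [
    Token(1, "The", "DET", {"Definite": "Def", "PronType": "Art"}, 2, "det"),
    Token(2, "dog", "NOUN", {"Number": "Sing"}, 3, "nsubj"),
    Token(3, "barks", "VERB",
          {"Number": "Sing", "Person": "3", "Tense": "Pres"}, 0, "root"),
]


def arguments_of(tokens, predicate_idx):
    """Return (relation, word) pairs for the direct dependents of a word."""
    return [(t.deprel, t.form) for t in tokens if t.head == predicate_idx]


root = next(t for t in sentence if t.head == 0)
print(root.form, arguments_of(sentence, root.idx))
```

Reading the dependents of the root in this way is one simple route from the annotation to a predicate-argument structure: here the predicate "barks" has a single `nsubj` argument, "dog".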

Triangle ◽  
2018 ◽  
pp. 65
Author(s):  
Veronica Dahl

Natural Language Processing aims to give computers the power to automatically process human language sentences, mostly in written text form but also spoken, for various purposes. This sub-discipline of AI (Artificial Intelligence) is also known as Natural Language Understanding.


Author(s):  
Vasile Rus ◽  
Philip M. McCarthy ◽  
Danielle S. McNamara ◽  
Arthur C. Graesser

Natural language understanding and assessment is a subset of natural language processing (NLP). The primary purpose of natural language understanding algorithms is to convert written or spoken human language into representations that can be manipulated by computer programs. Complex learning environments such as intelligent tutoring systems (ITSs) often depend on natural language understanding for fast and accurate interpretation of human language so that the system can respond intelligently in natural language. These ITSs function by interpreting the meaning of student input, assessing the extent to which it manifests learning, and generating suitable feedback to the learner. To operate effectively, systems need to be fast enough to operate in the real-time environments of ITSs. Delays in feedback caused by computational processing run the risk of frustrating the user and leading to lower engagement with the system. At the same time, the accuracy of assessing student input is critical because inaccurate feedback can potentially compromise learning and lower the student’s motivation and metacognitive awareness of the learning goals of the system (Millis et al., 2007). As such, student input in ITSs requires an assessment approach that is fast enough to operate in real time but accurate enough to provide appropriate evaluation. One of the ways in which ITSs with natural language understanding verify student input is through matching. In some cases, the match is between the user input and a pre-selected stored answer to a question, solution to a problem, misconception, or other form of benchmark response. In other cases, the system evaluates the degree to which the student input varies from a complex representation or a dynamically computed structure. The computation of matches and similarity metrics is limited by the fidelity and flexibility of the computational linguistics modules.
The major challenge with assessing natural language input is that it is relatively unconstrained and rarely follows brittle rules in its computation of spelling, syntax, and semantics (McCarthy et al., 2007). Researchers who have developed tutorial dialogue systems in natural language have explored the accuracy of matching students’ written input to targeted knowledge. Examples of these systems are AutoTutor and Why-Atlas, which tutor students on Newtonian physics (Graesser, Olney, Haynes, & Chipman, 2005; VanLehn, Graesser, et al., 2007), and the iSTART system, which helps students read text at deeper levels (McNamara, Levinstein, & Boonthum, 2004). Systems such as these have typically relied on statistical representations, such as latent semantic analysis (LSA; Landauer, McNamara, Dennis, & Kintsch, 2007) and content word overlap metrics (McNamara, Boonthum, et al., 2007). Indeed, such statistical and word overlap algorithms can boast much success. However, over short dialogue exchanges (such as those in ITSs), the accuracy of interpretation can be seriously compromised without a deeper level of lexico-syntactic textual assessment (McCarthy et al., 2007). Such a lexico-syntactic approach, entailment evaluation, is presented in this chapter. The approach incorporates deeper natural language processing solutions for ITSs with natural language exchanges while remaining sufficiently fast to provide real time assessment of user input.
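
To make the idea of a content word overlap metric concrete, here is a minimal sketch under stated assumptions. It is a toy, not the metric used by AutoTutor, Why-Atlas, or iSTART: the stopword list, the tokenizer, and the function names `content_words` and `overlap_score` are all hypothetical choices made for illustration.

```python
import re

# Assumed, deliberately tiny stopword list; real systems use larger ones.
STOPWORDS = {"a", "an", "the", "of", "to", "is", "are",
             "in", "on", "and", "it", "that"}


def content_words(text):
    """Lowercase the text and keep the set of non-stopword tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def overlap_score(student, expected):
    """Fraction of the expected answer's content words found in the input."""
    expected_words = content_words(expected)
    if not expected_words:
        return 0.0
    return len(content_words(student) & expected_words) / len(expected_words)


expected = "The net force on the object is zero"
print(overlap_score("The force is zero", expected))
print(overlap_score("The net force on the object is not zero", expected))
```

The first input covers two of the four expected content words and scores 0.5. The second input contradicts the expected answer yet scores 1.0, because a bag-of-words overlap ignores the negation. This is the kind of short-exchange failure that motivates a deeper lexico-syntactic assessment.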


1998 ◽  
Vol 37 (04/05) ◽  
pp. 327-333 ◽  
Author(s):  
F. Buekens ◽  
G. De Moor ◽  
A. Waagmeester ◽  
W. Ceusters

Abstract Natural language understanding systems have to exploit various kinds of knowledge in order to represent the meaning behind texts. Getting this knowledge in place is often such a huge enterprise that it is tempting to look for systems that can discover such knowledge automatically. We describe how the distinction between conceptual and linguistic semantics may assist in reaching this objective, provided that distinguishing between them is not done too rigorously. We present several examples to support this view and argue that in a multilingual environment, linguistic ontologies should be designed as interfaces between domain conceptualizations and linguistic knowledge bases.


1995 ◽  
Vol 34 (04) ◽  
pp. 345-351 ◽  
Author(s):  
A. Burgun ◽  
L. P. Seka ◽  
D. Delamarre ◽  
P. Le Beux

Abstract: In medicine, as in other domains, indexing and classification are natural human tasks used for information retrieval and representation. In the medical field, encoding of patient discharge summaries is still a manual, time-consuming task. This paper describes an automated system for coding patient discharge summaries from the field of coronary diseases into the ICD-9-CM classification. The system is developed in the context of the European AIM MENELAS project, a natural-language understanding system which uses the conceptual-graph formalism. Indexing is performed using a two-step processing scheme: a first recognition stage is implemented by a matching procedure, and a second selection stage is made according to the coding priorities. We show the general features of the translation needed to express the classification terms in the conceptual-graph model and to comply with the coding rules. An advantage of the system is that it provides an objective evaluation and assessment procedure for natural-language understanding.
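
The two-step scheme described in this abstract (recognition by matching, then selection by coding priority) can be sketched roughly as follows. This is only an illustration: plain substring matching stands in for the conceptual-graph matching of the actual system, and the term lists, the `PRIORITY` table, and the function names `recognize` and `select` are hypothetical. The two codes shown are ICD-9-CM categories for acute myocardial infarction (410) and angina pectoris (413).

```python
# Assumed term lists per code; the real system matches conceptual graphs.
CODE_TERMS = {
    "410": ["acute myocardial infarction", "heart attack"],
    "413": ["angina pectoris", "angina"],
}

# Hypothetical coding priorities: a lower number is selected first.
PRIORITY = {"410": 1, "413": 2}


def recognize(summary):
    """Step 1: collect every code with at least one term in the text."""
    text = summary.lower()
    return {code for code, terms in CODE_TERMS.items()
            if any(term in text for term in terms)}


def select(candidates, max_codes=1):
    """Step 2: keep the highest-priority codes among the candidates."""
    return sorted(candidates, key=lambda code: PRIORITY[code])[:max_codes]


summary = ("Patient admitted with angina, later diagnosed "
           "with acute myocardial infarction.")
candidates = recognize(summary)
print(sorted(candidates), select(candidates))
```

Separating the stages keeps the matching step free of policy: the recognition stage may over-generate candidates, and the selection stage alone decides which of them is reported.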

