Automatic construction of Korean verbal type hierarchy using Treebank

Author(s):  
Sanghoun Song ◽  
Jae-Woong Choe

The lexical information of verbal lexemes, such as verbs and adjectives, plays an important role in syntactic parsing, because the structure of a sentence hinges mainly on the type of its verbal lexemes. The question we address in this research is how to acquire the argument structure (henceforth ARG-ST) of verbal lexemes in Korean. It is well known that building a type hierarchy manually usually costs too much time and too many resources, so an alternative method, namely the automatic collection of the relevant information, is strongly preferred. This paper proposes a procedure to automatically collect the ARG-ST of Korean verbal lexemes from a Korean Treebank. Specifically, the system we develop first extracts lexical information about the ARG-ST of verbal lexemes from a Korean Treebank of 0.8 million graphic words in an unsupervised way, then checks the hierarchical relationships among them, and finally builds up the type hierarchy automatically. The result is written in an HPSG-style annotation, which makes it possible to readily implement it in an HPSG-based parser for Korean. Finally, the result is evaluated with reference to two Korean dictionaries and also against a manually constructed type hierarchy.
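The three steps named in the abstract (extract ARG-ST frames, check hierarchical relationships, build the hierarchy) can be illustrated with a toy sketch. It is not the authors' system: it assumes that a frame is simply a set of case labels and that one frame is a supertype of another when its argument set is a proper subset of the other's. The lemma names, the labels `NOM`, `ACC`, `DAT`, and the functions `collect_frames` and `build_hierarchy` are all hypothetical.

```python
from collections import defaultdict


def collect_frames(observations):
    """Group observed argument frames by verb lemma.

    `observations` is an iterable of (lemma, argument_labels) pairs, standing
    in for what would be read off treebank trees (an assumption).
    """
    frames = defaultdict(set)
    for lemma, args in observations:
        frames[lemma].add(frozenset(args))
    return frames


def build_hierarchy(frame_types):
    """Map each frame to its immediate supertypes.

    Assumption: frame A is a supertype of frame B when A is a proper subset
    of B, i.e. B adds arguments to A.
    """
    types = set(frame_types)
    parents = {}
    for t in types:
        supers = [s for s in types if s < t]
        # Keep only immediate supertypes: drop s if a larger supertype covers it.
        immediate = [s for s in supers if not any(s < u for u in supers)]
        parents[t] = sorted(immediate, key=lambda s: sorted(s))
    return parents


# Hypothetical toy data: three lemmas with intransitive, transitive and
# ditransitive frames.
observations = [
    ("verb_a", ["NOM"]),
    ("verb_b", ["NOM", "ACC"]),
    ("verb_c", ["NOM", "DAT", "ACC"]),
]
frames = collect_frames(observations)
all_types = set().union(*frames.values())
hierarchy = build_hierarchy(all_types)
```

In this toy run the ditransitive frame has the transitive frame as its only immediate supertype, and the transitive frame in turn sits under the intransitive one. A real system would need a richer notion of subsumption than set inclusion.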

English Today ◽  
2015 ◽  
Vol 31 (3) ◽  
pp. 46-58 ◽  
Author(s):  
Sergio Torres-Martínez

The central issue of the present article is the analysis of phrasal verbs (hereafter termed multiword verbs [MWVs]) from the perspective of construction grammars (Goldberg, 1995; Suttle and Goldberg, 2011). As is well known, English MWVs present special challenges to L2 learners due, among other things, to the shapelessness of their conceptual components and the ensuing impossibility of arriving at equivalent word-meaning correspondences (mappings) in the learners’ mother tongue (see Gillette et al., 1999). This brings us to the first theoretical claim of this paper – namely, that MWVs (also termed phrasal verbs, verb-particle collocations, verb-particle combinations etc.) are lexical chunks that can be retrieved by speakers either as wholes, without special recourse to syntactic parsing, or as verb-particle semantic associations (Cappelle et al., 2010). This idea is combined with the notion that MWVs inherit their syntax-semantics from prototypical Argument Structure Constructions (Goldberg, 2013a) within Verb Argument Construction (VAC) frames. VACs are thus associated with prototype verbs like ‘go’, ‘come’, ‘get’, ‘put’, etc., which project their meaning upon less-frequent verbs occupying a V-slot frame (a verbal position). It follows that MWVs function as hyponyms that express specific semantic nuances not available in prototype verbs. For example, in the sentence ‘Arya scooped up a rock and hurled it at Joffrey's head’ (George R. R. Martin, A Game of Thrones [1996]), the verb scoop up suggests a caused motion usually conveyed by the verb LIFT, i.e. the prototype of the simple transitive Verb Argument Construction. From this vantage, it is suggested that a way to activate the weak verb-object interface is to assign it to specific prototypes that bootstrap (provide an initial basis for) both the conceptualisation of the MWVs and their potential mapping to specific words (which I term inherited surface forms).


2001 ◽  
Vol 23 (2) ◽  
pp. 145-151 ◽  
Author(s):  
Silvina Montrul

Due to the recognition of the centrality of the lexicon for SLA theory (see the 1987 thematic issue of SSLA, edited by Susan Gass), the last few years have witnessed an increased interest in understanding lexical knowledge. As Gass (1999) reminded us, learning vocabulary in a second language is a complex task that involves much more than learning sound-meaning pairings; it also involves learning how lexical information is morphologically expressed and syntactically constrained. The present issue provides a natural sequel to the 1999 SSLA thematic issue, “Incidental L2 Vocabulary Acquisition,” by addressing some of the questions raised in that volume, in particular the questions related to the intimate relationship between syntax and semantics at the lexical interface. This issue is devoted to the L2 acquisition of verb meaning and argument structure crosslinguistically, and it explores in detail the nature of linguistic systems that L2 learners acquire in this particular domain. The six central articles offer a coherent approach to the topic, using linguistic theory to help us understand the characteristics of learner grammars. Until recently, linguistic approaches to SLA have placed a strong emphasis on understanding the acquisition of functional categories, for example, and the acquisition of the lexicon has received less attention. Understanding how the lexico-syntactic interface is mentally represented, and how it evolves during the second language acquisition process, is crucial for developing an adequate theory of L2 knowledge in general, as well as for informing theories of the lexicon.


2008 ◽  
Vol 34 (3) ◽  
pp. 357-389 ◽  
Author(s):  
Gülşen Eryiğit ◽  
Joakim Nivre ◽  
Kemal Oflazer

The suitability of different parsing methods for different languages is an important topic in syntactic parsing. Especially lesser-studied languages, typologically different from the languages for which methods have originally been developed, pose interesting challenges in this respect. This article presents an investigation of data-driven dependency parsing of Turkish, an agglutinative, free constituent order language that can be seen as the representative of a wider class of languages of similar type. Our investigations show that morphological structure plays an essential role in finding syntactic relations in such a language. In particular, we show that employing sublexical units called inflectional groups, rather than word forms, as the basic parsing units improves parsing accuracy. We test our claim on two different parsing methods, one based on a probabilistic model with beam search and the other based on discriminative classifiers and a deterministic parsing strategy, and show that the usefulness of sublexical units holds regardless of the parsing method. We examine the impact of morphological and lexical information in detail and show that, properly used, this kind of information can improve parsing accuracy substantially. Applying the techniques presented in this article, we achieve the highest reported accuracy for parsing the Turkish Treebank.


2021 ◽  
Vol 9 (14) ◽  
pp. 1-32
Author(s):  
Im Hong-Pin

This paper aims to make it clear that syntactic analysis should be based on the lexical information given in the lexicon. For this purpose, the lexical information about syntactic arguments is assumed to take a form like [VP NKP, _, DKP, AKP] for the English ditransitive verb give. The argument structure projects to syntactic structure. The NKP in this structure becomes the VP-subject, but there is another subject, called the S-subject (Sentence-Subject), below the S node. This amounts to a Two-Subject Hypothesis for English. Between these two subjects intervene Conjugation-Like Elements, which are enriched by close examination of English verbal conjugation. The Two-Subject Hypothesis perfectly accounts for the peculiarities of the Expletive There (ET) construction. Restructuring can also explain the so-called Long Distance Wh-interrogative without introducing Wh-movement, and it can explain why imperative verbs take the base form. It can also explain the characteristics of adjective imperatives by the same principles that apply to verbal imperatives. We also try to deal with other subtle problems and obtain fruitful results. The restructuring approach, we think, provides more convincing explanations than the movement approach.


Author(s):  
Antonio Machicao y Priemer ◽  
Paola Fritz-Huechante

In this paper, we argue that by making a more detailed distinction of theta-roles, while at the same time investigating the correlation of case marking, theta-role assignment, and eventuality types, we can describe different psych-verb subclasses and explain their alignment patterns in Spanish and Korean. We propose a neo-Davidsonian treatment of psych-verbs in HPSG that allows us to account for the underspecification of theta-roles, which are modeled in an inheritance hierarchy for semantic relations. By assuming linking properties modeled lexically, we can constrain the properties for psych-verbs which show the mapping of semantic arguments (i.e. experiencer, stimulus-causer, subject matter and target) to the elements in the argument structure. The type hierarchy and lexical rules proposed here capture the alternation in case marking not only of the experiencer (as traditionally assumed in the literature), but also of the stimulus. This analysis leads us to a new fourfold classification of psych-verbs for both languages.


Interpreting ◽  
1997 ◽  
Vol 2 (1-2) ◽  
pp. 119-140 ◽  
Author(s):  
Morton Ann Gernsbacher ◽  
Miriam Shlesinger

In this paper we suggest that the cognitive mechanism of suppression attenuates interference in many language comprehension phenomena, and is particularly crucial when comprehension must share processing capacity with other cognitive tasks, as is manifestly the case in simultaneous interpreting. During lexical access, the mechanism of suppression attenuates the interference caused by the activation of other lexical information, such as the inappropriate meanings of homonyms. During anaphoric reference, the mechanism of suppression attenuates the interference caused by the activation of other potential referents. In this way, the referent to which the anaphor does refer becomes the most activated concept. During syntactic parsing, the mechanism of suppression attenuates the interference caused by a previous syntactic form. During metaphor comprehension, the mechanism of suppression attenuates the interference caused by a literal interpretation. During inferencing, the mechanism of suppression attenuates the interference caused by an initial but inappropriate inference. We propose therefore that suppression — a general, cognitive mechanism that attenuates interference — plays a crucial role in language comprehension and simultaneous interpretation.


2019 ◽  
Vol 25 (4) ◽  
pp. 503-517 ◽  
Author(s):  
Jussi Karlgren ◽  
Pentti Kanerva

High-dimensional distributed semantic spaces have proven useful and effective for aggregating and processing visual, auditory and lexical information for many tasks related to human-generated data. Human language makes use of a large and varying number of features, lexical and constructional items as well as contextual and discourse-specific data of various types, which all interact to represent various aspects of communicative information. Some of these features are mostly local and useful for the organisation of, for example, argument structure of a predication; others are persistent over the course of a discourse and necessary for achieving a reasonable level of understanding of the content. This paper describes a model for high-dimensional representation for utterance and text-level data including features such as constructions or contextual data, based on a mathematically principled and behaviourally plausible approach to representing linguistic information. The implementation of the representation is a straightforward extension of Random Indexing models previously used for lexical linguistic items. The paper shows how the implemented model is able to represent a broad range of linguistic features in a common integral framework of fixed dimensionality, which is computationally habitable, and which is suitable as a bridge between symbolic representations such as dependency analysis and continuous representations used, for example, in classifiers or further machine-learning approaches. This is achieved with operations on vectors that constitute a powerful computational algebra, accompanied by an associative memory for the vectors. The paper provides a technical overview of the framework and a worked-through implemented example of how it can be applied to various types of linguistic features.
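As a rough illustration of the kind of vector algebra the abstract alludes to, the sketch below builds sparse ternary index vectors, marks a feature type by rotating (permuting) a vector, and bundles features into one fixed-width utterance vector by addition. The abstract does not spell out the paper's actual operations, so the choice of permutation and addition, the dimensionality, the sparsity, and every name here (`index_vector`, `permute`, `bundle`, `cosine`, the feature labels) are assumptions made for the example.

```python
import random

DIM = 2000     # assumed dimensionality of the space
NONZERO = 10   # assumed number of non-zero (+1/-1) entries per index vector


def index_vector(label, dim=DIM, nonzero=NONZERO):
    """Return a sparse ternary random vector that is deterministic per label."""
    rng = random.Random(label)  # string seeds give reproducible streams
    vec = [0] * dim
    for i, pos in enumerate(rng.sample(range(dim), nonzero)):
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def permute(vec, shift=1):
    """Rotate a vector by `shift` positions; used here to mark a feature type."""
    shift %= len(vec)
    return vec[-shift:] + vec[:-shift]


def bundle(*vecs):
    """Superpose several vectors by element-wise addition."""
    return [sum(xs) for xs in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity, with 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# One utterance vector holding a lexical item plus a (hypothetical)
# constructional feature, the latter rotated to keep feature types apart.
utterance = bundle(
    index_vector("dog"),
    permute(index_vector("construction:passive")),
)
similarity_present = cosine(utterance, index_vector("dog"))
similarity_absent = cosine(utterance, index_vector("cat"))
```

Because the index vectors are sparse and nearly orthogonal, a feature that was bundled into the utterance vector should score a much higher cosine against it than one that was not, while the utterance vector keeps the same fixed dimensionality however many features are added.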


Author(s):  
Ernesto William De Luca

In this chapter, the author presents his approach to aggregating and maintaining Multilingual Linked Data. He describes Lexical Resources and Lexical Linked Data, presenting a hybridization that ports the largest lexical resource EuroWordNet to the Linked Open Data cloud, interlinking it with other lexical resources. Furthermore, he shows the LexiRes RDF/OWL tool that gives the possibility to navigate this lexical information, helping authors of already available lexical resources in deleting or restructuring concepts using automatic merging methods. The chapter is concluded by a discussion on personalizing information according to user preferences, filtering relevant information while taking into account the multilingual background of the user.


2019 ◽  
Vol 42 ◽  
Author(s):  
Charlie Kurth

Recent work by emotion researchers indicates that emotions have a multilevel structure. Sophisticated sentimentalists should take note of this work – for it better enables them to defend a substantive role for emotion in moral cognition. Contra May's rationalist criticisms, emotions are not only able to carry morally relevant information, but can also substantially influence moral judgment and reasoning.

