The Proposition Bank: An Annotated Corpus of Semantic Roles

2005 ◽  
Vol 31 (1) ◽  
pp. 71-106 ◽  
Author(s):  
Martha Palmer ◽  
Daniel Gildea ◽  
Paul Kingsbury

The Proposition Bank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank. The resulting resource can be thought of as shallow, in that it does not represent coreference, quantification, and many other higher-order phenomena, but also broad, in that it covers every instance of every verb in the corpus and allows representative statistics to be calculated. We discuss the criteria used to define the sets of semantic roles used in the annotation process and to analyze the frequency of syntactic/semantic alternations in the corpus. We describe an automatic system for semantic role tagging trained on the corpus and discuss the effect on its performance of various types of information, including a comparison of full syntactic parsing with a flat representation and the contribution of the empty “trace” categories of the treebank.
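As a rough illustration of what a predicate-argument layer looks like, the sketch below stores one annotated verb instance as a set of labeled token spans. The class name `Proposition`, the `filler` helper, and both example sentences are assumptions made for illustration; only the idea of numbered role labels such as Arg0 and Arg1 attached to a verb comes from the annotation scheme itself. The two sentences show a syntactic/semantic alternation: the same role (Arg1, the thing broken) appears as object in one and as subject in the other.

```python
from dataclasses import dataclass


@dataclass
class Proposition:
    """One annotated verb instance: the predicate plus its labeled arguments."""
    tokens: tuple  # the tokenized sentence
    rel: tuple     # (start, end) token span of the verb, end exclusive
    args: dict     # role label -> (start, end) token span, end exclusive

    def filler(self, role):
        """Return the words filling a role, or None if the role is absent."""
        span = self.args.get(role)
        if span is None:
            return None
        start, end = span
        return " ".join(self.tokens[start:end])


# Illustrative sentences, not taken from the corpus.
transitive = Proposition(
    tokens=("John", "broke", "the", "window"),
    rel=(1, 2),
    args={"Arg0": (0, 1), "Arg1": (2, 4)},
)
inchoative = Proposition(
    tokens=("The", "window", "broke"),
    rel=(2, 3),
    args={"Arg1": (0, 2)},
)

# Arg1 is the object in the first sentence and the subject in the second.
print(transitive.filler("Arg1"))
print(inchoative.filler("Arg1"))
```

Counting how often a given role surfaces in each syntactic position, over all instances of a verb, is the kind of alternation statistic a resource with this coverage makes possible.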

2021 ◽  
Vol 44 (1) ◽  
pp. 101-136
Author(s):  
Lidija Iordanskaja ◽  
Igor Mel’čuk

A formal linguistic model is presented, which produces, for a given conceptual representation of an extralinguistic situation, a corresponding semantic representation [SemR] that, in its turn, underlies the deep-syntactic representations of four near-synonymous Russian sentences expressing the starting information. Two full-fledged lexical entries are given for the lexemes besporjadki ‘disturbance’ and stolknovenie ‘clash(N)’, appearing in these sentences. Some principles of lexicalization – that is, matching the formal lexicographic definitions to the starting semantic representation in order to produce the deep-syntactic structures of the corresponding sentences – are formulated and illustrated; the problem of approximate matching is dealt with in sufficient detail.


2019 ◽  
Vol 5 (2) ◽  
pp. 122-129
Author(s):  
A.A.A Ngr. Adriyanti Weda Ningrat ◽  
I Nyoman Kardana ◽  
Mirsa Umiyati

This study examines the semantic field of the verb "to see" in Javanese. The aim of the research is to describe the form, function, meaning, and semantic roles of each variant of the verb "to see". To this end, a qualitative research design was applied, and the data analysis was guided by the semantic role theory of Foley and Van Valin (1984). The verbs belonging to the semantic field of "to see" with intentional entities number 33. Among them are ndêlok, ndêlêng, ningali, mirsani, ndeleng sacleraman, ndêlêng tênanên, ningali saestu, mirsani saestu, ningali sekedhap, mirsani sekedhap, mlengos, ngwasi, ngêmatake, ngematakên, ndhangak, dingkluk, nginceng, ngêlirêk, mêntêlêngi, ndelok mburi, and maca. The semantic roles of the arguments of each verb consist of agents and themes. This study sheds light only on the semantic field of the verb "to see", its variants, and, briefly, their general semantic roles. A more detailed study of the specific roles of each variant of the verb is therefore a topic for future research.


Author(s):  
Qingrong Xia ◽  
Zhenghua Li ◽  
Min Zhang ◽  
Meishan Zhang ◽  
Guohong Fu ◽  
...  

Semantic role labeling (SRL), also known as shallow semantic parsing, is an important yet challenging task in NLP. Motivated by the close correlation between syntactic and semantic structures, traditional discrete-feature-based SRL approaches make heavy use of syntactic features. In contrast, deep-neural-network-based approaches usually encode the input sentence as a word sequence without considering the syntactic structures. In this work, we investigate several previous approaches for encoding syntactic trees, and conduct a thorough study of whether extra syntax-aware representations are beneficial for neural SRL models. Experiments on the benchmark CoNLL-2005 dataset show that syntax-aware SRL approaches can effectively improve performance over a strong baseline with external word representations from ELMo. With the extra syntax-aware representations, our approaches achieve new state-of-the-art 85.6 F1 (single model) and 86.6 F1 (ensemble) on the test data, outperforming the corresponding strong baselines with ELMo by 0.8 and 1.0, respectively. Detailed error analyses are conducted to gain more insight into the investigated approaches.
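For readers unfamiliar with how F1 figures like these are obtained, the following is a minimal sketch of span-level precision, recall, and F1 for labeled arguments. It is a simplification and not the official CoNLL-2005 scoring script: the function name `span_f1`, the `(start, end, label)` tuple encoding, and the toy spans are all assumptions made for illustration. An argument counts as correct only if both its boundaries and its label match the gold annotation exactly.

```python
def span_f1(gold, predicted):
    """Return (precision, recall, F1) over exactly matching labeled spans."""
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy spans for one predicate, encoded as (start, end, label); invented data.
gold = {(0, 1, "A0"), (2, 4, "A1"), (4, 6, "AM-TMP")}
pred = {(0, 1, "A0"), (2, 4, "A1"), (5, 6, "AM-TMP")}  # wrong boundary on the last span

p, r, f = span_f1(gold, pred)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Because a span with a wrong boundary counts as both a false positive and a false negative, small boundary errors lower precision and recall at the same time.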


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Leilei Kong ◽  
Zhongyuan Han ◽  
Yong Han ◽  
Haoliang Qi

Paraphrase identification is central to many natural language applications. Based on the insight that a successful paraphrase identification model needs to adequately capture the semantics of the language objects as well as their interactions, we present a deep paraphrase identification model interacting semantics with syntax (DPIM-ISS). DPIM-ISS introduces the linguistic features manifested in syntactic features to produce more explicit structures and encodes the semantic representation of a sentence on different syntactic structures by means of interacting semantics with syntax. DPIM-ISS then learns the paraphrase pattern from this representation by exploiting a convolutional neural network with a convolution-pooling structure. Experiments are conducted on the Microsoft Research Paraphrase (MSRP) corpus, and on the PAN 2010 and PAN 2012 corpora for paraphrase plagiarism detection. The experimental results demonstrate that DPIM-ISS outperforms the classical word-matching approaches, the syntax-similarity approaches, the convolutional neural network-based models, and some deep paraphrase identification models.


Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 446
Author(s):  
Yair Lakretz ◽  
Stanislas Dehaene ◽  
Jean-Rémi King

Sentence comprehension requires inferring, from a sequence of words, the structure of syntactic relationships that bind these words into a semantic representation. Our limited ability to build some specific syntactic structures, such as nested center-embedded clauses (e.g., “The dog that the cat that the mouse bit chased ran away”), suggests a striking capacity limitation of sentence processing, and thus offers a window to understand how the human brain processes sentences. Here, we review the main hypotheses proposed in psycholinguistics to explain such capacity limitation. We then introduce an alternative approach, derived from our recent work on artificial neural networks optimized for language modeling, and predict that capacity limitation derives from the emergence of sparse and feature-specific syntactic units. Unlike psycholinguistic theories, our neural network-based framework provides precise capacity-limit predictions without making any a priori assumptions about the form of the grammar or parser. Finally, we discuss how our framework may clarify the mechanistic underpinning of language processing and its limitations in the human brain.
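The structure of the nested example can be made concrete with a short sketch that builds center-embedded sentences of arbitrary depth. The function name `center_embed` and its list-based interface are assumptions made for illustration and are not part of the authors' framework. Each noun is the subject of the verb at the same index, so the subjects are emitted outermost first and the verbs innermost first, which is what forces a reader to hold several open dependencies at once.

```python
def center_embed(nouns, verbs):
    """Build a center-embedded sentence; nouns[i] is the subject of verbs[i]."""
    if not nouns or len(nouns) != len(verbs):
        raise ValueError("need the same non-zero number of nouns and verbs")
    # Subjects open from the outermost clause inward ...
    subject_chain = "The " + " that the ".join(nouns)
    # ... and verbs close the clauses from the innermost outward.
    verb_chain = " ".join(reversed(verbs))
    return f"{subject_chain} {verb_chain}"


nouns = ["dog", "cat", "mouse"]
verbs = ["ran away", "chased", "bit"]

# Print the sentence at embedding depths 0, 1, and 2.
for depth in range(len(nouns)):
    print(depth, center_embed(nouns[: depth + 1], verbs[: depth + 1]))
```

At depth 2 this reproduces the example sentence quoted in the abstract; deeper nesting only requires longer lists.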


2014 ◽  
Vol 38 (3) ◽  
pp. 463-484 ◽  
Author(s):  
Iren Hartmann ◽  
Martin Haspelmath ◽  
Michael Cysouw

In this paper, we illustrate a method for identifying clusters of semantic roles by cross-linguistic comparison. On the basis of data from 25 languages drawn from the ValPaL (Valency Patterns Leipzig) database, we show how one can visualize coexpression tendencies using quantitative methods (in particular, multidimensional scaling). Traditionally, the coexpression of semantic microroles (such as the breaker and the broken thing of the ‘break’ verb, the helper and the helpee of the ‘help’ verb, etc.) has been studied for particular languages, with generalized macroroles such as “agent”, “actor”, and “undergoer” being compared across languages in a next step. We set up a conceptual space of 87 microroles based on their coexpression tendencies, i.e. the extent to which they are expressed identically (via flagging and indexing) across our languages. The individual coding means (cases, adpositions, index-sets) can then be mapped onto this conceptual space, revealing broader alignment patterns.
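The notion of a coexpression tendency can be sketched as a simple proportion: across the languages that express both microroles, how often do the two receive identical coding? The code below is a minimal illustration under that assumption. The language labels, the coding values, and the function names `coexpression` and `dissimilarity_matrix` are invented for the example and are not ValPaL data or the authors' implementation. The resulting pairwise dissimilarities are the kind of input that a multidimensional scaling routine would take.

```python
from itertools import combinations

# Invented toy data: coding[language][microrole] = coding means used for that role.
coding = {
    "lang_A": {"breaker": "NOM", "broken thing": "ACC", "helper": "NOM", "helpee": "ACC"},
    "lang_B": {"breaker": "ERG", "broken thing": "ABS", "helper": "ERG", "helpee": "DAT"},
    "lang_C": {"breaker": "NOM", "broken thing": "ACC", "helper": "NOM", "helpee": "DAT"},
}
roles = ["breaker", "broken thing", "helper", "helpee"]


def coexpression(role_a, role_b, coding):
    """Share of languages (among those expressing both roles) that code them identically."""
    shared = [lang for lang in coding.values() if role_a in lang and role_b in lang]
    if not shared:
        return 0.0
    same = sum(1 for lang in shared if lang[role_a] == lang[role_b])
    return same / len(shared)


def dissimilarity_matrix(roles, coding):
    """Pairwise dissimilarity (1 - coexpression) for every unordered pair of roles."""
    return {(a, b): 1.0 - coexpression(a, b, coding) for a, b in combinations(roles, 2)}


for pair, distance in dissimilarity_matrix(roles, coding).items():
    print(pair, round(distance, 3))
```

In this toy data the breaker and the helper are always coded alike, so they would sit close together in the resulting space, while the breaker and the broken thing never are.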


The research presents original algorithms for integrating a linguistic processor into the solving of planimetric problems. The linguistic processor translates the natural-language description of a problem into a semantic representation based on an ontology that supports the axiomatics of geometry. It also synthesizes natural-language comments on the solution and the drawing objects. A method for interactive visualization of the linguistic processor's operation is proposed. The method provides step-by-step dialog control of the construction of the syntactic structure and its mapping into the semantic representation. During the experiments, several dozen standard syntactic structures were obtained that map correctly onto the semantic structures of the subject area. Directions for further research developing the proposed approach are outlined.

