A NEW SHALLOW SEMANTIC PARSER FOR DESCRIBING THE CONCEPT STRUCTURE OF TEXT

2009 ◽  
Vol 03 (01) ◽  
pp. 131-149
Author(s):  
YULAN YAN ◽  
YUTAKA MATSUO ◽  
MITSURU ISHIZUKA

Recently, Semantic Role Labeling (SRL) systems have been used to examine the semantic predicate-argument structure of naturally occurring texts. To address the challenge of extracting a universal set of semantic or thematic relations covering the various types of semantic relationships between entities, we develop a shallow semantic parser based on the Concept Description Language for Natural Language (CDL.nl), which defines a set of semantic relations for describing the concept structure of text; the parser adds a new layer of semantic annotation to natural language sentences as an extension of SRL. The parsing task is a relation extraction process with two steps: relation detection and relation classification. First, based on dependency analysis, a rule-based algorithm detects all entity pairs between which a relationship exists; second, a kernel-based method assigns CDL.nl relations to the detected entity pairs by leveraging diverse features. A preliminary evaluation on a manually annotated dataset shows that CDL.nl relations can be extracted with good performance.
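
For concreteness, here is a minimal sketch of such a two-step pipeline. It assumes spaCy for the dependency analysis and a linear SVM in place of the paper's kernel-based method; the detection rule and the path feature are simplified illustrations, not the actual CDL.nl rules.

```python
# Step 1: rule-based pair detection on a dependency parse.
# Step 2: classify each detected pair into a relation label.
import spacy
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

nlp = spacy.load("en_core_web_sm")

def detect_pairs(sentence):
    """Step 1: rule-based relation detection on the dependency parse."""
    doc = nlp(sentence)
    nouns = [t for t in doc if t.pos_ in ("NOUN", "PROPN")]
    # Keep noun pairs that share a head or stand in a direct head-dependent relation.
    return [(a, b) for i, a in enumerate(nouns) for b in nouns[i + 1:]
            if a.head is b or b.head is a or a.head is b.head]

def pair_feature(a, b):
    """A flat dependency-path feature for the relation classifier."""
    return f"{a.dep_} {a.head.lemma_} {b.dep_}"

def train_classifier(pairs, relations):
    """Step 2: assign relation labels to detected pairs (linear kernel stand-in)."""
    vec = CountVectorizer()
    X = vec.fit_transform(pair_feature(a, b) for a, b in pairs)
    clf = SVC(kernel="linear").fit(X, relations)
    return vec, clf
```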

2021 ◽  
pp. 1-48
Author(s):  
Zuchao Li ◽  
Hai Zhao ◽  
Shexia He ◽  
Jiaxun Cai

Semantic role labeling (SRL) is dedicated to recognizing the semantic predicate-argument structure of a sentence. Studies of traditional models have shown that syntactic information can make remarkable contributions to SRL performance; however, its necessity has been challenged by recent neural SRL studies that achieve impressive performance without syntactic backbones, suggesting that syntax becomes much less important for neural semantic role labeling, especially when paired with deep neural networks and large-scale pre-trained language models. Despite this notion, the neural SRL field still lacks a systematic and complete investigation of the relevance of syntactic information, for both dependency- and span-based SRL and for both monolingual and multilingual settings. This paper intends to quantify the importance of syntactic information for neural SRL in the deep learning framework. We introduce three typical SRL frameworks (baselines), sequence-based, tree-based, and graph-based, each of which can be combined with two categories of methods for exploiting syntactic information: syntax pruning-based and syntax feature-based. Experiments are conducted on the CoNLL-2005, 2009, and 2012 benchmarks for all available languages, and the results show that neural SRL models can still benefit from syntactic information under certain conditions. Furthermore, we quantify the significance of syntax to neural SRL models and provide a thorough empirical survey using existing models.
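
As an illustration of the syntax pruning-based category, the sketch below restricts a predicate's argument candidates to the k-th order descendants of nodes on its path to the root, a common pruning rule in dependency SRL. The head-array encoding and the parameter k are assumptions for illustration, not the paper's exact formulation.

```python
def prune_candidates(heads, predicate, k=1):
    """Syntax pruning: keep only tokens that are at most k levels below some node
    on the path from the predicate to the root of the dependency tree.
    heads[i] is the parent index of token i, with -1 marking the root."""
    children = {i: [] for i in range(len(heads))}
    for tok, head in enumerate(heads):
        if head >= 0:
            children[head].append(tok)

    def descendants(node, depth):
        if depth == 0:
            return []
        out = []
        for child in children[node]:
            out.append(child)
            out.extend(descendants(child, depth - 1))
        return out

    candidates, node = set(), predicate
    while node >= 0:                       # walk from the predicate up to the root
        candidates.update(descendants(node, k))
        node = heads[node]
    candidates.discard(predicate)
    return sorted(candidates)
```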


2008 ◽  
Vol 34 (2) ◽  
pp. 225-255 ◽  
Author(s):  
Nianwen Xue

In this article we report work on Chinese semantic role labeling, taking advantage of two recently completed corpora, the Chinese PropBank, a semantically annotated corpus of Chinese verbs, and the Chinese Nombank, a companion corpus that annotates the predicate-argument structure of nominalized predicates. Because the semantic role labels are assigned to the constituents in a parse tree, we first report experiments in which semantic role labels are automatically assigned to hand-crafted parses in the Chinese Treebank. This gives us a measure of the extent to which semantic role labels can be bootstrapped from the syntactic annotation provided in the treebank. We then report experiments using automatic parses with decreasing levels of human annotation in the input to the syntactic parser: parses that use gold-standard segmentation and POS-tagging, parses that use only gold-standard segmentation, and fully automatic parses. These experiments gauge how successful semantic role labeling for Chinese can be in more realistic situations. Our results show that when hand-crafted parses are used, semantic role labeling accuracy for Chinese is comparable to what has been reported for the state-of-the-art English semantic role labeling systems trained and tested on the English PropBank, even though the Chinese PropBank is significantly smaller in size. When an automatic parser is used, however, the accuracy of our system is significantly lower than the English state of the art. This indicates that an improvement in Chinese parsing is critical to high-performance semantic role labeling for Chinese.
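
Since role labels are assigned to parse-tree constituents, the basic operation is enumerating candidate constituents from a bracketed parse, whether gold-standard or automatic. The sketch below uses NLTK's Tree class as an illustrative stand-in; the bracketed example and the absence of any pruning are simplifications, not the system described above.

```python
from nltk.tree import Tree

def candidate_constituents(bracketed_parse):
    """Return (phrase_label, leaf_words) pairs for every non-terminal in the parse;
    each such constituent is a candidate for semantic role classification."""
    tree = Tree.fromstring(bracketed_parse)
    candidates = []
    for pos in tree.treepositions():
        node = tree[pos]
        if isinstance(node, Tree) and pos != ():   # skip bare leaves and the root
            candidates.append((node.label(), tuple(node.leaves())))
    return candidates

# Example: candidate_constituents("(IP (NP (NN 警方)) (VP (VV 调查) (NP (NN 事故))))")
```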


2021 ◽  
Vol 9 ◽  
pp. 226-242
Author(s):  
Zhaofeng Wu ◽  
Hao Peng ◽  
Noah A. Smith

For natural language processing systems, two kinds of evidence support the use of text representations from neural language models “pretrained” on large unannotated corpora: performance on application-inspired benchmarks (Peters et al., 2018, inter alia), and the emergence of syntactic abstractions in those representations (Tenney et al., 2019, inter alia). On the other hand, the lack of grounded supervision calls into question how well these representations can ever capture meaning (Bender and Koller, 2020). We apply novel probes to recent language models, focusing specifically on predicate-argument structure as operationalized by semantic dependencies (Ivanova et al., 2012), and find that, unlike syntax, semantics is not brought to the surface by today’s pretrained models. We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning, yielding benefits to natural language understanding (NLU) tasks in the GLUE benchmark. This approach demonstrates the potential for general-purpose (rather than task-specific) linguistic supervision, above and beyond conventional pretraining and finetuning. Several diagnostics help to localize the benefits of our approach.
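
As a rough sketch of how a semantic parse can be injected during finetuning, the layer below mixes pretrained token representations along the edges of a semantic dependency graph. The degree normalization, residual connection, and single-layer design are assumptions for illustration, not the paper's exact encoder; in practice the output of a pretrained encoder would pass through one or more such layers before the task classifier.

```python
import torch
import torch.nn as nn

class SemanticGCNLayer(nn.Module):
    """One graph convolution over a sentence's semantic dependency graph."""
    def __init__(self, hidden_size):
        super().__init__()
        self.linear = nn.Linear(hidden_size, hidden_size)

    def forward(self, token_states, adjacency):
        """token_states: (batch, seq, hidden) pretrained representations;
        adjacency: (batch, seq, seq) 0/1 matrix of semantic dependency edges."""
        adj = adjacency + torch.eye(adjacency.size(-1), device=adjacency.device)  # add self-loops
        degree = adj.sum(dim=-1, keepdim=True).clamp(min=1.0)
        mixed = torch.bmm(adj / degree, token_states)          # average over graph neighbors
        return torch.relu(self.linear(mixed)) + token_states   # residual connection
```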


Author(s):  
Kashif Munir ◽  
Hai Zhao ◽  
Zuchao Li

The task of semantic role labeling (SRL) is dedicated to finding the predicate-argument structure of a sentence. Previous works on SRL are mostly supervised and do not consider the difficulty of labeling each example, which can be very expensive and time-consuming. In this article, we present the first neural unsupervised model for SRL. We decompose the task into two argument-related subtasks, identification and clustering, and propose a pipeline consisting of two corresponding neural modules. First, we train a neural model using two syntax-aware, statistically derived rules. The model computes a relevance signal for each token in a sentence, feeds it into a BiLSTM, and then into an adversarial layer that adds noise and classifies simultaneously, enabling the model to learn the semantic structure of a sentence. Second, we propose another neural model for argument role clustering, which clusters the learned argument embeddings biased toward their dependency relations. Experiments on the CoNLL-2009 English dataset demonstrate that our model outperforms the previous state-of-the-art non-neural baseline for argument identification and classification.
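
A minimal sketch of the clustering stage is shown below: learned argument embeddings are concatenated with a scaled one-hot encoding of their dependency relations before clustering, so the grouping is biased toward those relations. The use of KMeans, the bias weight, and the number of roles are placeholder assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_arguments(arg_embeddings, dep_relations, n_roles=21, bias=2.0):
    """arg_embeddings: (n_args, dim) array of learned argument embeddings;
    dep_relations: one dependency relation label per argument."""
    labels = sorted(set(dep_relations))
    rel_onehot = np.array([[float(r == l) for l in labels] for r in dep_relations])
    # Concatenate a scaled one-hot relation vector so clusters respect dependency relations.
    features = np.hstack([np.asarray(arg_embeddings), bias * rel_onehot])
    return KMeans(n_clusters=n_roles, n_init=10).fit_predict(features)
```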


2001 ◽  
Vol 27 (3) ◽  
pp. 373-408 ◽  
Author(s):  
Paola Merlo ◽  
Suzanne Stevenson

Automatic acquisition of lexical knowledge is critical to a wide range of natural language processing tasks. Especially important is knowledge about verbs, which are the primary source of relational information in a sentence: the predicate-argument structure that relates an action or state to its participants (i.e., who did what to whom). In this work, we report on supervised learning experiments to automatically classify three major types of English verbs, based on their argument structure, specifically the thematic roles they assign to participants. We use linguistically motivated statistical indicators extracted from large annotated corpora to train the classifier, achieving 69.8% accuracy for a task whose baseline is 34%, and whose expert-based upper bound we calculate at 86.5%. A detailed analysis of the performance of the algorithm and of its errors confirms that the proposed features capture properties related to the argument structure of the verbs. Our results validate our hypotheses that knowledge about thematic relations is crucial for verb classification, and that it can be gleaned from a corpus by automatic means. We thus demonstrate an effective combination of deeper linguistic knowledge with the robustness and scalability of statistical techniques.
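
The approach lends itself to a compact sketch: per-verb statistical indicators estimated from a corpus are fed to a supervised classifier that predicts the verb's argument-structure class. The three ratios and the decision-tree learner below are illustrative stand-ins for the paper's linguistically motivated features and classifier.

```python
from sklearn.tree import DecisionTreeClassifier

def verb_features(counts):
    """counts: per-verb corpus counts, e.g. {"total": ..., "transitive": ...,
    "passive": ..., "animate_subj": ...} (illustrative indicator names)."""
    total = max(counts["total"], 1)
    return [counts["transitive"] / total,    # how often the verb takes a direct object
            counts["passive"] / total,       # how often it appears in passive voice
            counts["animate_subj"] / total]  # how often its subject is animate

def train(verb_counts, classes):
    """Fit a classifier mapping indicator vectors to verb classes."""
    X = [verb_features(c) for c in verb_counts]
    return DecisionTreeClassifier().fit(X, classes)
```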


2020 ◽  
Vol 7 (5) ◽  
pp. 563-576
Author(s):  
Jaeyeol Song ◽  
Jin-Kook Lee ◽  
Jungsik Choi ◽  
Inhan Kim

This paper describes an approach to extracting a predicate-argument structure (PAS) from building design rule sentences using natural language processing (NLP) and deep learning models. For a computer to reason about the compliance of a building design, design rules expressed in natural language must be converted into a computer-readable format. The rule interpretation and translation processes are challenging because of the vagueness and ambiguity of natural language. Many studies have proposed approaches to address this problem, but most depend on manual work, which is the bottleneck to expanding the scope of design rule checking to design requirements drawn from various documents. In this paper, we apply deep learning-based NLP techniques to translate design rule sentences into a computer-readable data structure. To apply these techniques to the rule interpretation process, we identified the semantic role elements of building design requirements and defined a PAS for design rule checking. Using a bidirectional long short-term memory model with a conditional random field layer, the computer can intelligently analyze the constituents of building design rule sentences and automatically extract their logical elements. The proposed approach contributes to broadening the scope of building information modeling-enabled rule checking to any natural language-based design requirements.
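
A minimal sketch of such a BiLSTM-CRF tagger is given below, assuming PyTorch with the pytorch-crf package for the CRF layer; the vocabulary, tag set, and layer sizes are placeholders rather than the paper's configuration.

```python
import torch
import torch.nn as nn
from torchcrf import CRF

class RuleSentenceTagger(nn.Module):
    """Tags each token of a design rule sentence with a PAS role label (e.g. BIO scheme)."""
    def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.bilstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.to_tags = nn.Linear(2 * hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def loss(self, tokens, tags, mask):
        emissions = self.to_tags(self.bilstm(self.embed(tokens))[0])
        return -self.crf(emissions, tags, mask=mask)      # negative log-likelihood

    def predict(self, tokens, mask):
        emissions = self.to_tags(self.bilstm(self.embed(tokens))[0])
        return self.crf.decode(emissions, mask=mask)      # best tag sequence per sentence
```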


Author(s):  
Diane Massam

This book presents a detailed descriptive and theoretical examination of predicate-argument structure in Niuean, a Polynesian language within the Oceanic branch of the Austronesian family, spoken mainly on the Pacific island of Niue and in New Zealand. Niuean has VSO word order and an ergative case-marking system, both of which raise questions for a subject-predicate view of sentence structure. Working within a broadly Minimalist framework, this volume develops an analysis in which syntactic arguments are not merged locally to their thematic sources, but instead are merged high, above an inverted extended predicate which serves syntactically as the Niuean verb, later undergoing movement into the left periphery of the clause. The thematically lowest argument merges as an absolutive inner subject, with higher arguments merging as applicatives. The proposal relates Niuean word order and ergativity to its isolating morphology, by equating the absence of inflection with the absence of IP in Niuean, which impacts many aspects of its grammar. As well as developing a novel analysis of clause and argument structure, word order, ergative case, and theta role assignment, the volume argues for an expanded understanding of subjecthood. Throughout the volume, many other topics are also treated, such as noun incorporation, word formation, the parallel internal structure of predicates and arguments, null arguments, displacement typology, the role of determiners, and the structure of the left periphery.


Author(s):  
Wing-Kwong Wong ◽  
Sheng-Kai Yin ◽  
Chang-Zhe Yang

This paper presents a tool for drawing dynamic geometric figures by understanding the text of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First, we build the knowledge base for understanding geometry problems. With the help of the knowledge base engine InfoMap, geometric concepts are extracted from an input text. The concepts are then used to output a multistep JavaSketchpad script, which constructs the dynamic geometric figure on a web page. Finally, the system outputs the script as an HTML document that can be visualized and read with an internet browser. A preliminary evaluation of the tool showed that it produced correct dynamic geometric figures for over 90% of problems taken from textbooks. With such high accuracy, the system can support distance learning for geometry students as well as instructors producing geometry content for distance learning.
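
To make the pipeline's shape concrete, here is a rough sketch: concepts are pulled from the problem text and turned into a drawing script embedded in an HTML page. The regular expressions stand in for the InfoMap knowledge base, and build_script emits placeholder steps only, since the actual JavaSketchpad command syntax is not reproduced here.

```python
import re

def extract_concepts(problem_text):
    """Pull out simple geometric concepts (triangles, circles) by pattern matching."""
    concepts = [("triangle", m) for m in re.findall(r"\btriangle\s+([A-Z]{3})\b", problem_text)]
    concepts += [("circle", m) for m in re.findall(r"\bcircle\s+([A-Z])\b", problem_text)]
    return concepts

def build_script(concepts):
    """Placeholder for the multistep drawing script (real command syntax omitted)."""
    return "\n".join(f"step: draw {kind} {label}" for kind, label in concepts)

def to_html(problem_text):
    """Wrap the generated script in a page that a browser could render."""
    return f"<html><body><pre>{build_script(extract_concepts(problem_text))}</pre></body></html>"
```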

