Labeling Chinese Predicates with Semantic Roles

2008 ◽  
Vol 34 (2) ◽  
pp. 225-255 ◽  
Author(s):  
Nianwen Xue

In this article we report work on Chinese semantic role labeling, taking advantage of two recently completed corpora: the Chinese PropBank, a semantically annotated corpus of Chinese verbs, and the Chinese NomBank, a companion corpus that annotates the predicate-argument structure of nominalized predicates. Because the semantic role labels are assigned to the constituents in a parse tree, we first report experiments in which semantic role labels are automatically assigned to hand-crafted parses in the Chinese Treebank. This gives us a measure of the extent to which semantic role labels can be bootstrapped from the syntactic annotation provided in the treebank. We then report experiments using automatic parses with decreasing levels of human annotation in the input to the syntactic parser: parses that use gold-standard segmentation and POS tagging, parses that use only gold-standard segmentation, and fully automatic parses. These experiments gauge how successful semantic role labeling for Chinese can be in more realistic situations. Our results show that when hand-crafted parses are used, semantic role labeling accuracy for Chinese is comparable to what has been reported for state-of-the-art English semantic role labeling systems trained and tested on the English PropBank, even though the Chinese PropBank is significantly smaller. When an automatic parser is used, however, the accuracy of our system is significantly lower than the English state of the art. This indicates that improved Chinese parsing is critical to high-performance semantic role labeling for Chinese.
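
Because the labeling in this setup operates over parse-tree constituents, a compact sketch may help fix ideas. The following is not the author's system; the tiny tree, feature set, and names are all illustrative. It shows the standard recipe in which each constituent becomes a candidate and features such as phrase type and the tree path to the predicate feed a role classifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass(eq=False)
class Node:
    label: str                       # phrase type (e.g. "NP") or POS tag on leaves
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None       # surface word, set only on leaves
    parent: Optional["Node"] = None

def link_parents(root: Node) -> None:
    for child in root.children:
        child.parent = root
        link_parents(child)

def ancestors(node: Node) -> List[Node]:
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain

def tree_path(constituent: Node, predicate_leaf: Node) -> str:
    """Classic SRL 'path' feature: labels going up from the candidate
    constituent to the lowest common ancestor, then down to the predicate."""
    up, down = ancestors(constituent), ancestors(predicate_leaf)
    lca = next(n for n in up if n in down)
    up_labels = [n.label for n in up[: up.index(lca) + 1]]
    down_labels = [n.label for n in reversed(down[: down.index(lca)])]
    return "^".join(up_labels) + "!" + "!".join(down_labels)

def srl_features(constituent: Node, predicate_leaf: Node) -> Dict[str, str]:
    # These features would normally feed a classifier that outputs a role
    # label such as ARG0/ARG1 or NONE for the candidate constituent.
    return {
        "phrase_type": constituent.label,
        "predicate": predicate_leaf.word or "",
        "path": tree_path(constituent, predicate_leaf),
    }

# Tiny example: (IP (NP (NN 警察)) (VP (VV 逮捕) (NP (NN 小偷))))
subj = Node("NP", [Node("NN", word="警察")])
pred = Node("VV", word="逮捕")
obj = Node("NP", [Node("NN", word="小偷")])
root = Node("IP", [subj, Node("VP", [pred, obj])])
link_parents(root)
print(srl_features(subj, pred))   # path feature: NP^IP!VP!VV
```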

Author(s):  
Kashif Munir ◽  
Hai Zhao ◽  
Zuchao Li

The task of semantic role labeling (SRL) is dedicated to finding the predicate-argument structure of a sentence. Previous work on SRL is mostly supervised and does not account for the cost of labeling each example, which can be very expensive and time-consuming. In this article, we present the first neural unsupervised model for SRL. We decompose the task into two argument-related subtasks, identification and clustering, and propose a pipeline that consists of two corresponding neural modules. First, we train a neural model on two syntax-aware, statistically developed rules. The model derives a relevance signal for each token in a sentence, feeds it into a BiLSTM, and then applies an adversarial layer that adds noise and classifies simultaneously, enabling the model to learn the semantic structure of a sentence. We then propose a second neural model for argument role clustering, which clusters the learned argument embeddings biased toward their dependency relations. Experiments on the CoNLL-2009 English dataset demonstrate that our model outperforms the previous state of the art among non-neural models for argument identification and classification.
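
A rough sketch of the second stage only (argument role clustering), under simplifying assumptions of my own rather than the authors' architecture: the bias toward dependency relations is modelled naively here by appending a scaled one-hot relation vector to each argument embedding before running k-means.

```python
from typing import List
import numpy as np

def biased_embeddings(arg_vecs: np.ndarray, dep_rels: List[str],
                      alpha: float = 0.5) -> np.ndarray:
    # Append a scaled one-hot dependency-relation vector to each embedding.
    rels = sorted(set(dep_rels))
    one_hot = np.zeros((len(dep_rels), len(rels)))
    for i, r in enumerate(dep_rels):
        one_hot[i, rels.index(r)] = alpha
    return np.concatenate([arg_vecs, one_hot], axis=1)

def kmeans(X: np.ndarray, k: int, iters: int = 50, seed: int = 0) -> np.ndarray:
    # Plain k-means; each cluster plays the part of one argument role.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        assign = ((X[:, None, :] - centers[None]) ** 2).sum(-1).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = X[assign == j].mean(axis=0)
    return assign

# Toy example: four argument embeddings, two dependency relations, two clusters.
vecs = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
rels = ["SBJ", "SBJ", "OBJ", "OBJ"]
print(kmeans(biased_embeddings(vecs, rels), k=2))
```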


2014 ◽  
Vol 51 ◽  
pp. 133-164 ◽  
Author(s):  
Kristian Woodsend ◽  
Mirella Lapata

Large-scale annotated corpora are a prerequisite to developing high-performance NLP systems. Such corpora are expensive to produce, limited in size, and often demand linguistic expertise. In this paper we use text rewriting as a means of increasing the amount of labeled data available for model training. Our method uses rewrite rules automatically extracted from comparable corpora and bitexts to generate multiple versions of sentences annotated with gold-standard labels. We apply this idea to semantic role labeling and show that a model trained on rewritten data outperforms the state of the art on the CoNLL-2009 benchmark dataset.
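
A toy sketch of the underlying idea (illustrative only; the rule below is invented, not one extracted from comparable corpora or bitexts): apply a rewrite rule to a gold-annotated sentence and project the role spans onto the rewritten version, so one annotated sentence yields several training instances.

```python
from typing import List, Tuple

Span = Tuple[int, int, str]   # (start_token, end_token_exclusive, role_label)

def apply_rule(tokens: List[str], spans: List[Span],
               old: List[str], new: List[str]) -> Tuple[List[str], List[Span]]:
    """Replace the first occurrence of `old` with `new`; shift labeled spans
    lying entirely before/after the rewrite, drop spans that overlap it.
    (A full system would also re-anchor the predicate and relabel the
    rewritten span itself.)"""
    for i in range(len(tokens) - len(old) + 1):
        if tokens[i:i + len(old)] == old:
            delta = len(new) - len(old)
            out_tokens = tokens[:i] + new + tokens[i + len(old):]
            out_spans = []
            for s, e, lab in spans:
                if e <= i:                       # entirely before the rewrite
                    out_spans.append((s, e, lab))
                elif s >= i + len(old):          # entirely after: shift offsets
                    out_spans.append((s + delta, e + delta, lab))
            return out_tokens, out_spans
    return tokens, spans                          # rule did not fire

tokens = "the committee approved the new proposal yesterday".split()
spans = [(0, 2, "A0"), (3, 6, "A1"), (6, 7, "AM-TMP")]
print(apply_rule(tokens, spans, old=["approved"], new=["gave", "approval", "to"]))
```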


2021 ◽  
pp. 1-48
Author(s):  
Zuchao Li ◽  
Hai Zhao ◽  
Shexia He ◽  
Jiaxun Cai

Abstract Semantic role labeling (SRL) is dedicated to recognizing the semantic predicate-argument structure of a sentence. Previous studies based on traditional models have shown that syntactic information can make remarkable contributions to SRL performance; however, the necessity of syntactic information was challenged by a few recent neural SRL studies that demonstrate impressive performance without syntactic backbones, suggesting that syntax becomes much less important for neural semantic role labeling, especially when paired with deep neural networks and large-scale pre-trained language models. Despite this notion, the neural SRL field still lacks a systematic and full investigation of the relevance of syntactic information to SRL, covering both span-based and dependency-based formalisms and both monolingual and multilingual settings. This paper intends to quantify the importance of syntactic information for neural SRL in the deep learning framework. We introduce three typical SRL frameworks (baselines), sequence-based, tree-based, and graph-based, together with two categories of methods for exploiting syntactic information: syntax pruning-based and syntax feature-based. Experiments are conducted on the CoNLL-2005, 2009, and 2012 benchmarks for all languages available, and the results show that neural SRL models can still benefit from syntactic information under certain conditions. Furthermore, we show the quantitative significance of syntax to neural SRL models together with a thorough empirical survey using existing models.
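
To make "syntax pruning-based" concrete, here is a minimal first-order version of the classic dependency-tree candidate pruning that such frameworks build on (a sketch of the general technique, not this paper's exact algorithm): argument candidates are the dependents of the predicate and of each of its ancestors on the path to the root.

```python
from typing import List

def prune_candidates(heads: List[int], predicate: int) -> List[int]:
    """heads[i] is the 0-based index of token i's head, or -1 for the root.
    Returns the pruned set of argument-candidate token indices."""
    children = [[] for _ in heads]
    for child, head in enumerate(heads):
        if head >= 0:
            children[head].append(child)

    candidates, node = [], predicate
    while node != -1:                       # walk from the predicate up to the root
        candidates.extend(c for c in children[node] if c != predicate)
        node = heads[node]
    return sorted(set(candidates))

# "The cat chased the mouse quickly": heads (0-based), root = "chased" (index 2)
heads = [1, 2, -1, 4, 2, 2]
print(prune_candidates(heads, predicate=2))   # dependents of the predicate: [1, 4, 5]
```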


2015 ◽  
Vol 3 ◽  
pp. 449-460 ◽  
Author(s):  
Michael Roth ◽  
Mirella Lapata

Frame-semantic representations have been useful in several applications, ranging from text-to-scene generation to question answering and social network analysis. Predicting such representations from raw text is, however, a challenging task, and corresponding models are typically trained on only a small set of sentence-level annotations. In this paper, we present a semantic role labeling system that takes into account sentence and discourse context. We introduce several new features, which we motivate based on linguistic insights, and experimentally demonstrate that they lead to significant improvements over the current state of the art in FrameNet-based semantic role labeling.
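
The abstract does not spell out the features themselves, so the snippet below is only a generic illustration of what a discourse-context feature for frame SRL can look like: whether the head word of a candidate argument was already mentioned in the preceding sentences of the document.

```python
# Purely illustrative; this feature is NOT taken from the paper. It only shows
# how sentence-external (discourse) context can be turned into a feature.

from typing import List

def prev_mention_feature(doc_sentences: List[List[str]],
                         sent_idx: int, head_word: str) -> bool:
    # True if the candidate's head word occurs anywhere earlier in the document.
    context = {w.lower() for s in doc_sentences[:sent_idx] for w in s}
    return head_word.lower() in context

doc = [["Mary", "bought", "a", "violin", "."],
       ["She", "plays", "the", "violin", "every", "day", "."]]
print(prev_mention_feature(doc, 1, "violin"))   # True: mentioned in sentence 0
```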


2008 ◽  
Vol 34 (2) ◽  
pp. 161-191 ◽  
Author(s):  
Kristina Toutanova ◽  
Aria Haghighi ◽  
Christopher D. Manning

We present a model for semantic role labeling that effectively captures the linguistic intuition that a semantic argument frame is a joint structure, with strong dependencies among the arguments. We show how to incorporate these strong dependencies in a statistical joint model with a rich set of features over multiple argument phrases. The proposed model substantially outperforms a similar state-of-the-art local model that does not include dependencies among different arguments. We evaluate the gains from incorporating this joint information on the PropBank corpus, both when using correct syntactic parse trees as input and when using automatically derived parse trees. The gains amount to 24.1% error reduction on all arguments and 36.8% on core arguments for gold-standard parse trees on PropBank. For automatic parse trees, the error reductions are 8.3% and 10.3% on all and core arguments, respectively. We also present results on the CoNLL 2005 shared task data set. Additionally, we explore the use of multiple syntactic analyses to cope with parser noise and uncertainty.
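
The reported gains are relative error reductions. As a sanity check on the arithmetic, the snippet below shows how such a figure is computed from two accuracy (or F1) scores; the example numbers are made up for illustration and are not the paper's results.

```python
def error_reduction(baseline_score: float, new_score: float) -> float:
    """Relative error reduction: the fraction of the baseline's error
    (1 - score) that the new model removes."""
    baseline_error = 1.0 - baseline_score
    new_error = 1.0 - new_score
    return (baseline_error - new_error) / baseline_error

# Hypothetical numbers: a local model at 90.0 F1 and a joint model at 92.4 F1
# correspond to a 24% relative error reduction.
print(round(error_reduction(0.900, 0.924), 3))   # 0.24
```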


2009 ◽  
Vol 03 (01) ◽  
pp. 131-149
Author(s):  
Yulan Yan ◽  
Yutaka Matsuo ◽  
Mitsuru Ishizuka

Recently, Semantic Role Labeling (SRL) systems have been used to recover the semantic predicate-argument structure of naturally occurring texts. To address the challenge of extracting a universal set of semantic or thematic relations covering various types of semantic relationships between entities, we develop a shallow semantic parser based on the Concept Description Language for Natural Language (CDL.nl), which defines a set of semantic relations for describing the concept structure of text; the parser adds a new layer of semantic annotation to natural language sentences as an extension of SRL. The parsing task is a relation extraction process with two steps: relation detection and relation classification. First, based on dependency analysis, a rule-based algorithm detects all entity pairs for which a relationship exists; second, we use a kernel-based method to assign CDL.nl relations to the detected entity pairs by leveraging diverse features. A preliminary evaluation on a manually built dataset shows that CDL.nl relations can be extracted with good performance.
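
A very rough sketch of the first step only (relation detection over a dependency analysis), under assumptions of my own rather than CDL.nl's actual rules: propose an entity pair as related when the dependency path between the two entity heads is short enough.

```python
# Illustrative only: a naive detection rule over a dependency tree. The real
# CDL.nl rule set is richer; this just shows the shape of the detection step.

from itertools import combinations
from typing import List, Tuple

def dep_path_length(heads: List[int], a: int, b: int) -> int:
    def to_root(n: int) -> List[int]:
        chain = []
        while n != -1:
            chain.append(n)
            n = heads[n]
        return chain
    up_a, up_b = to_root(a), to_root(b)
    common = next(n for n in up_a if n in up_b)   # lowest common ancestor
    return up_a.index(common) + up_b.index(common)

def detect_pairs(heads: List[int], entity_heads: List[int],
                 max_len: int = 3) -> List[Tuple[int, int]]:
    return [(a, b) for a, b in combinations(entity_heads, 2)
            if dep_path_length(heads, a, b) <= max_len]

# "Einstein(0) worked(1) at(2) Princeton(3)": "worked" is the root.
heads = [1, -1, 1, 2]
print(detect_pairs(heads, entity_heads=[0, 3]))   # [(0, 3)]
```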


2009 ◽  
Vol 15 (1) ◽  
pp. 143-172 ◽  
Author(s):  
Nianwen Xue ◽  
Martha Palmer

Abstract We report work on adding semantic role labels to the Chinese Treebank, a corpus already annotated with phrase structures. The work involves locating all verbs and their nominalizations in the corpus, and semi-automatically adding semantic role labels to their arguments, which are constituents in a parse tree. Although the same procedure is followed, different issues arise in the annotation of verbs and nominalized predicates. For verbs, identifying their arguments is generally straightforward given their syntactic structure in the Chinese Treebank as they tend to occupy well-defined syntactic positions. Our discussion focuses on the syntactic variations in the realization of the arguments as well as our approach to annotating dislocated and discontinuous arguments. In comparison, identifying the arguments for nominalized predicates is more challenging and we discuss criteria and procedures for distinguishing arguments from non-arguments. In particular we focus on the role of support verbs as well as the relevance of event/result distinctions in the annotation of the predicate-argument structure of nominalized predicates. We also present our approach to taking advantage of the syntactic structure in the Chinese Treebank to bootstrap the predicate-argument structure annotation of verbs. Finally, we discuss the creation of a lexical database of frame files and its role in guiding predicate-argument annotation. Procedures for ensuring annotation consistency and inter-annotator agreement evaluation results are also presented.
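
The bootstrapping idea can be illustrated with a toy heuristic (a caricature of the approach, not the annotation procedure itself): because verb arguments tend to occupy well-defined syntactic positions in the Chinese Treebank, an initial pass can propose role labels from those positions, which annotators then correct.

```python
# A caricature of position-based bootstrapping (not the actual procedure):
# pre-label the subject NP of a verb as ARG0 and the object NP as ARG1,
# leaving everything else for the human annotator to fill in or correct.

from typing import Dict, List, Tuple

def bootstrap_labels(constituents: List[Tuple[str, str]]) -> Dict[str, str]:
    """constituents: (grammatical_position, text) pairs for one verb instance."""
    position_to_role = {"subject": "ARG0", "object": "ARG1"}
    return {text: position_to_role.get(pos, "UNLABELED")
            for pos, text in constituents}

print(bootstrap_labels([("subject", "警察"), ("object", "小偷"), ("adjunct", "昨天")]))
# {'警察': 'ARG0', '小偷': 'ARG1', '昨天': 'UNLABELED'}
```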


2019 ◽  
Vol 55 (2) ◽  
pp. 305-337 ◽  
Author(s):  
Alina Wróblewska ◽  
Piotr Rybak

Abstract The predicate-argument structure transparently encoded in dependency-based syntactic representations supports machine translation, question answering, information extraction, etc. The quality of dependency parsing is therefore a crucial issue in natural language processing. In the current paper we discuss the fundamental ideas of the dependency theory and provide an overview of selected dependency-based resources for Polish. Furthermore, we present some state-of-the-art dependency parsing systems whose models can be estimated on correctly annotated data. In the experimental part, we provide an in-depth evaluation of these systems on Polish data. Our results show that graph-based parsers, even those without any neural component, are better suited for Polish than transition-based parsing systems.
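
The standard metrics behind such a parser evaluation are unlabeled and labeled attachment scores (UAS/LAS). A minimal sketch of how they are computed from gold and predicted trees (toy data, not the paper's results):

```python
from typing import List, Tuple

Arc = Tuple[int, str]   # (head index, dependency label) for one token

def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """UAS counts tokens whose head is correct; LAS additionally requires
    the dependency label to match."""
    assert len(gold) == len(pred)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred)) / len(gold)
    las = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return uas, las

gold = [(2, "subj"), (0, "root"), (2, "obj")]
pred = [(2, "subj"), (0, "root"), (2, "nmod")]
print(attachment_scores(gold, pred))   # (1.0, 0.666...)
```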


2013 ◽  
Vol 39 (3) ◽  
pp. 631-663 ◽  
Author(s):  
Beñat Zapirain ◽  
Eneko Agirre ◽  
Lluís Màrquez ◽  
Mihai Surdeanu

This paper focuses on a well-known open issue in Semantic Role Classification (SRC) research: the limited influence and sparseness of lexical features. We mitigate this problem using models that integrate automatically learned selectional preferences (SP). We explore a range of models based on WordNet and distributional-similarity SPs. Furthermore, we demonstrate that the SRC task is better modeled by SP models centered on both verbs and prepositions, rather than verbs alone. Our experiments with SP-based models in isolation indicate that they outperform a lexical baseline by 20 F1 points in domain and almost 40 F1 points out of domain. Furthermore, we show that a state-of-the-art SRC system extended with features based on selectional preferences performs significantly better, both in domain (17% error reduction) and out of domain (13% error reduction). Finally, we show that in an end-to-end semantic role labeling system we obtain small but statistically significant improvements, even though our modified SRC model affects only approximately 4% of the argument candidates. Our post hoc error analysis indicates that the SP-based features help mostly in situations where syntactic information is either incorrect or insufficient to disambiguate the correct role.
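
A distributional selectional-preference feature of the kind described can be sketched as follows (a minimal formulation of my own, not the paper's models): score a candidate argument head by its cosine similarity to the centroid of heads observed for the same predicate (or preposition) and role in training.

```python
import numpy as np
from typing import Dict, List

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def sp_score(candidate: str, slot_heads: List[str],
             vectors: Dict[str, np.ndarray]) -> float:
    # Similarity of the candidate head to the centroid of heads seen in the slot.
    centroid = np.mean([vectors[w] for w in slot_heads], axis=0)
    return cosine(vectors[candidate], centroid)

# Toy vectors; "pizza" should fit the (eat, ARG1) slot better than "theorem".
vecs = {"pasta": np.array([1.0, 0.1]), "bread": np.array([0.9, 0.2]),
        "pizza": np.array([1.0, 0.15]), "theorem": np.array([0.05, 1.0])}
slot = ["pasta", "bread"]                      # ARG1 heads seen with "eat"
print(sp_score("pizza", slot, vecs), sp_score("theorem", slot, vecs))
```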

