Context-aware Frame-Semantic Role Labeling

2015 ◽  
Vol 3 ◽  
pp. 449-460 ◽  
Author(s):  
Michael Roth ◽  
Mirella Lapata

Frame semantic representations have been useful in several applications ranging from text-to-scene generation, to question answering and social network analysis. Predicting such representations from raw text is, however, a challenging task and corresponding models are typically only trained on a small set of sentence-level annotations. In this paper, we present a semantic role labeling system that takes into account sentence and discourse context. We introduce several new features which we motivate based on linguistic insights and experimentally demonstrate that they lead to significant improvements over the current state-of-the-art in FrameNet-based semantic role labeling.

2013 ◽  
Vol 39 (3) ◽  
pp. 631-663 ◽  
Author(s):  
Beñat Zapirain ◽  
Eneko Agirre ◽  
Lluís Màrquez ◽  
Mihai Surdeanu

This paper focuses on a well-known open issue in Semantic Role Classification (SRC) research: the limited influence and sparseness of lexical features. We mitigate this problem using models that integrate automatically learned selectional preferences (SP). We explore a range of models based on WordNet and distributional-similarity SPs. Furthermore, we demonstrate that the SRC task is better modeled by SP models centered on both verbs and prepositions, rather than verbs alone. Our experiments with SP-based models in isolation indicate that they outperform a lexical baseline by 20 F1 points in domain and almost 40 F1 points out of domain. Furthermore, we show that a state-of-the-art SRC system extended with features based on selectional preferences performs significantly better, both in domain (17% error reduction) and out of domain (13% error reduction). Finally, we show that in an end-to-end semantic role labeling system we obtain small but statistically significant improvements, even though our modified SRC model affects only approximately 4% of the argument candidates. Our post hoc error analysis indicates that the SP-based features help mostly in situations where syntactic information is either incorrect or insufficient to disambiguate the correct role.
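Error-reduction figures such as those quoted in this abstract are conventionally read as relative reductions in error rate rather than absolute score differences. The following is a minimal sketch of that arithmetic, assuming scores are percentages; the function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_score: float, new_score: float) -> float:
    """Return the fraction of the baseline's error removed by the new system.

    Both scores are percentages in [0, 100]; error is taken as 100 - score.
    """
    baseline_error = 100.0 - baseline_score
    if baseline_error <= 0:
        raise ValueError("baseline has no error left to reduce")
    new_error = 100.0 - new_score
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores (not from the paper): moving from 80.0 to 83.4
# removes roughly 17% of the baseline's error.
reduction = relative_error_reduction(80.0, 83.4)
```

Under this reading, the same relative reduction corresponds to a smaller absolute gain when the baseline is already strong.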


2020 ◽  
Vol 34 (05) ◽  
pp. 8082-8090
Author(s):  
Tushar Khot ◽  
Peter Clark ◽  
Michal Guerquin ◽  
Peter Jansen ◽  
Ashish Sabharwal

Composing knowledge from multiple pieces of text is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice question. QASC is the first dataset to offer two desirable properties: (a) the facts to be composed are annotated in a large corpus, and (b) the decomposition into these facts is not evident from the question itself. The latter makes retrieval challenging as the system must introduce new concepts or relations in order to discover potential decompositions. Further, the reasoning model must then learn to identify valid compositions of these retrieved facts using common-sense reasoning. To help address these challenges, we provide annotation for supporting facts as well as their composition. Guided by these annotations, we present a two-step approach to mitigate the retrieval challenges. We use other multiple-choice datasets as additional training data to strengthen the reasoning model. Our proposed approach improves over current state-of-the-art language models by 11% (absolute). The reasoning and retrieval problems, however, remain unsolved as this model still lags by 20% behind human performance.


2008 ◽  
Vol 34 (2) ◽  
pp. 161-191 ◽  
Author(s):  
Kristina Toutanova ◽  
Aria Haghighi ◽  
Christopher D. Manning

We present a model for semantic role labeling that effectively captures the linguistic intuition that a semantic argument frame is a joint structure, with strong dependencies among the arguments. We show how to incorporate these strong dependencies in a statistical joint model with a rich set of features over multiple argument phrases. The proposed model substantially outperforms a similar state-of-the-art local model that does not include dependencies among different arguments. We evaluate the gains from incorporating this joint information on the Propbank corpus, when using correct syntactic parse trees as input, and when using automatically derived parse trees. The gains amount to 24.1% error reduction on all arguments and 36.8% on core arguments for gold-standard parse trees on Propbank. For automatic parse trees, the error reductions are 8.3% and 10.3% on all and core arguments, respectively. We also present results on the CoNLL 2005 shared task data set. Additionally, we explore considering multiple syntactic analyses to cope with parser noise and uncertainty.


Author(s):  
Moemmur Shahzad ◽  
Ayesha Amin ◽  
Diego Esteves ◽  
Axel-Cyrille Ngonga Ngomo

We investigate the problem of named entity recognition in user-generated text such as social media posts. This task is rendered particularly difficult by the restricted length and limited grammatical coherence of this data type. Current state-of-the-art approaches rely on external sources such as gazetteers to alleviate some of these restrictions. We present a neural model able to outperform the state of the art on this task without resorting to gazetteers or similar external sources of information. Our approach relies on word-, character-, and sentence-level information for NER in short text. Social media posts like tweets often have associated images that may provide auxiliary context relevant to understanding these texts. Hence, we also incorporate visual information and introduce an attention component which computes attention weight probabilities over textual and text-relevant visual contexts separately. Our model outperforms the current state of the art on various NER datasets. On WNUT 2016 and 2017, our model achieved F1 scores of 53.48% and 50.52%, respectively. With the multimodal model, our system also outperforms the current state of the art with an F1 score of 74% on the multimodal dataset. Our evaluation further suggests that our model also goes beyond the current state of the art on newswire data, hence corroborating its suitability for various NER tasks.


Author(s):  
Yu Wang ◽  
Hongxia Jin

In this paper, we present a multi-step coarse-to-fine question answering (MSCQA) system which can efficiently process documents of different lengths by choosing appropriate actions. The system is designed using an actor-critic based deep reinforcement learning model to achieve multi-step question answering. Compared to previous QA models targeting datasets that mainly contain either short or long documents, our multi-step coarse-to-fine model combines the merits of multiple system modules and can handle both short and long documents. The system hence obtains much better accuracy and faster training speed than the current state-of-the-art models. We test our model on four QA datasets, WIKIREADING, WIKIREADING LONG, CNN and SQuAD, and demonstrate 1.3%-1.7% accuracy improvements with 1.5x-3.4x training speed-ups in comparison to the baselines using state-of-the-art models.


2008 ◽  
Vol 34 (2) ◽  
pp. 225-255 ◽  
Author(s):  
Nianwen Xue

In this article we report work on Chinese semantic role labeling, taking advantage of two recently completed corpora, the Chinese PropBank, a semantically annotated corpus of Chinese verbs, and the Chinese Nombank, a companion corpus that annotates the predicate-argument structure of nominalized predicates. Because the semantic role labels are assigned to the constituents in a parse tree, we first report experiments in which semantic role labels are automatically assigned to hand-crafted parses in the Chinese Treebank. This gives us a measure of the extent to which semantic role labels can be bootstrapped from the syntactic annotation provided in the treebank. We then report experiments using automatic parses with decreasing levels of human annotation in the input to the syntactic parser: parses that use gold-standard segmentation and POS-tagging, parses that use only gold-standard segmentation, and fully automatic parses. These experiments gauge how successful semantic role labeling for Chinese can be in more realistic situations. Our results show that when hand-crafted parses are used, semantic role labeling accuracy for Chinese is comparable to what has been reported for the state-of-the-art English semantic role labeling systems trained and tested on the English PropBank, even though the Chinese PropBank is significantly smaller in size. When an automatic parser is used, however, the accuracy of our system is significantly lower than the English state of the art. This indicates that an improvement in Chinese parsing is critical to high-performance semantic role labeling for Chinese.


Author(s):  
Mingwen Bi ◽  
Qingchuan Zhang ◽  
Min Zuo ◽  
Zelong Xu ◽  
Qingyu Jin

The intelligent question answering system aims to provide quick and concise feedback on users' questions. Although the performance of phrase-level models and numerous attention models has improved, sentence components and position information are not emphasized enough. This article combines Ci-Lin and word2vec to divide all of the words in the question-answer pairs into groups according to their semantics and selects one kernel word in each group. The remaining words are treated as common words, and a semantic mapping mechanism is established between kernel words and common words. With this Chinese semantic mapping mechanism, the common words in all questions and answers are replaced by the semantic kernel words to normalize the semantic representation. Meanwhile, based on the bi-directional LSTM model, this article introduces a method that combines semantic role labeling with positional context, dividing the sentence into multiple semantic segments according to semantic logic. Weight is given to the neighboring words in the same semantic segment, and the article proposes semantic role labeling position attention based on the bi-directional LSTM model (BLSTM-SRLP). The good performance of the BLSTM-SRLP model has been demonstrated in comparative experiments on the food safety field dataset (FS-QA).
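The kernel-word normalization described in this abstract can be illustrated with a small sketch. It assumes the semantic groups are already available; in the article they are derived from Ci-Lin and word2vec, whereas the groups below are hand-written, hypothetical, and in English purely for readability. The function names are likewise illustrative and not the authors' implementation.

```python
from typing import Dict, List


def build_kernel_map(groups: Dict[str, List[str]]) -> Dict[str, str]:
    """Map every word in a semantic group to that group's kernel word."""
    mapping: Dict[str, str] = {}
    for kernel, common_words in groups.items():
        mapping[kernel] = kernel  # a kernel word maps to itself
        for word in common_words:
            mapping[word] = kernel
    return mapping


def normalize(tokens: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each common word by its kernel word; leave unknown words as-is."""
    return [mapping.get(token, token) for token in tokens]


# Hypothetical groups: kernel word -> common words in the same group.
groups = {"safe": ["harmless", "secure"], "food": ["meal", "dish"]}
mapping = build_kernel_map(groups)
normalized = normalize(["is", "this", "dish", "harmless"], mapping)
```

After normalization, questions and answers that use different words from the same group share the same surface tokens, which is the effect the abstract attributes to the mapping mechanism.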


2021 ◽  
Author(s):  
Gabriele Scalia ◽  
Chiara Francalanci ◽  
Barbara Pernici

Information extracted from social media has proven to be very useful in the domain of emergency management. An important task in emergency management is rapid crisis mapping, which aims to produce timely and reliable maps of affected areas. During an emergency, the volume of emergency-related posts is typically large, but only a small fraction is relevant and helps rapid mapping effectively. Furthermore, posts are not useful for mapping purposes unless they are correctly geolocated and, on average, less than 2% of posts are natively georeferenced. This paper presents an algorithm, called CIME, that aims to identify and geolocate emergency-related posts that are relevant for mapping purposes. While native geocoordinates are most often missing, many posts contain geographical references in their metadata, such as texts or links that can be used by CIME to filter and geolocate information. In addition, social media creates a social network and each post can be enhanced with indirect information from the post’s network of relationships with other posts (for example, a retweet can be associated with other geographical references which are useful to geolocate the original tweet). To exploit all this information, CIME uses the concept of context, defined as the information characterizing a post both directly (the post’s metadata) and indirectly (the post’s network of relationships). The algorithm was evaluated on a recent major emergency event, demonstrating better performance with respect to the state of the art in terms of total number of geolocated posts, geolocation accuracy and relevance for rapid mapping.
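The idea of combining direct and indirect context can be sketched as follows. This is not the CIME algorithm, whose details the abstract does not give; it is a simplified illustration that assumes each post has at most one place name extracted from its own metadata and a list of related posts. The majority-vote fallback, the function name, and the example data are all assumptions made for the sketch.

```python
from collections import Counter
from typing import Dict, List, Optional


def geolocate(
    post_id: str,
    direct: Dict[str, Optional[str]],
    links: Dict[str, List[str]],
) -> Optional[str]:
    """Return a place for a post from its direct or indirect context.

    direct: place found in each post's own metadata (None if absent).
    links: related posts for each post (for example, retweets).
    """
    own_place = direct.get(post_id)
    if own_place:
        return own_place  # direct context wins when available
    # Fall back to the places attached to related posts.
    related_places = [direct[n] for n in links.get(post_id, []) if direct.get(n)]
    if not related_places:
        return None
    # Assumption for this sketch: take the most frequent related place.
    return Counter(related_places).most_common(1)[0][0]


# Hypothetical data: post "a" has no place of its own but three related posts.
direct = {"a": None, "b": "Genoa", "c": "Genoa", "d": "Milan"}
links = {"a": ["b", "c", "d"]}
place_for_a = geolocate("a", direct, links)
```

A post with neither a direct reference nor located neighbours stays unlocated, which mirrors the abstract's point that posts are only useful for mapping once they can be placed.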

