A Prior Information Enhanced Extraction Framework for Document-level Financial Event Extraction

2021 ◽  
pp. 1-12
Author(s):  
Haitao Wang ◽  
Tong Zhu ◽  
Mingtao Wang ◽  
Guoliang Zhang ◽  
Wenliang Chen

Abstract Document-level financial event extraction (DFEE) is the task of detecting events and extracting the corresponding event arguments in financial documents, and it plays an important role in information extraction in the financial domain. This task is challenging because financial documents are generally long and the arguments of one event may be scattered across different sentences. To address this issue, we propose a novel Prior Information Enhanced Extraction framework (PIEE) for DFEE, leveraging prior information from both event types and pre-trained language models. Specifically, PIEE consists of three components: event detection, event argument extraction, and event table filling. In event detection, we identify the event type. The event type is then used explicitly for event argument extraction. Meanwhile, the implicit information within language models also provides considerable cues for locating event arguments. Finally, all the event arguments are filled into an event table by a set of predefined heuristic rules. To demonstrate the effectiveness of our proposed framework, we participated in the shared task of CCKS2020 Task5-2: Document-level Event Arguments Extraction. On both Leaderboard A and Leaderboard B, PIEE takes first place and significantly outperforms the other systems.
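The three-stage flow described in this abstract (event detection, type-conditioned argument extraction, rule-based table filling) can be illustrated with a toy sketch. This is not the PIEE implementation: the abstract does not give its models or rules, so keyword counting and regular expressions stand in for the pre-trained language models, and the schema, role names, example sentences, and the "first candidate wins" rule are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical schema: trigger keywords and one regex per argument role.
SCHEMA = {
    "EquityPledge": {
        "keywords": ["pledge"],
        "roles": {
            "Pledger": re.compile(r"([A-Z]\w+(?: [A-Z]\w+)*) announced"),
            "PledgedShares": re.compile(r"([\d,]+) shares"),
            "Pledgee": re.compile(r"\bto ([A-Z]\w+(?: [A-Z]\w+)*)"),
            "PledgeDate": re.compile(r"(\d{4}-\d{2}-\d{2})"),
        },
    },
    "EquityFreeze": {
        "keywords": ["freeze", "frozen"],
        "roles": {"FrozenShares": re.compile(r"([\d,]+) shares")},
    },
}


def detect_event(sentences: List[str]) -> Optional[str]:
    """Stage 1: pick the event type with the most keyword hits (toy detector)."""
    text = " ".join(sentences).lower()
    scores = {
        etype: sum(text.count(k) for k in spec["keywords"])
        for etype, spec in SCHEMA.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def extract_arguments(sentences: List[str], event_type: str) -> Dict[str, List[str]]:
    """Stage 2: the detected type selects which roles to search for, in every sentence."""
    roles = SCHEMA[event_type]["roles"]
    return {
        role: [m for s in sentences for m in pattern.findall(s)]
        for role, pattern in roles.items()
    }


def fill_table(event_type: str, candidates: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Stage 3: hypothetical heuristic rule, keep the first candidate per role."""
    table = {}
    for role in SCHEMA[event_type]["roles"]:
        values = candidates.get(role, [])
        table[role] = values[0] if values else None
    return table


def run_pipeline(sentences: List[str]) -> Tuple[Optional[str], Dict[str, Optional[str]]]:
    event_type = detect_event(sentences)
    if event_type is None:
        return None, {}
    return event_type, fill_table(event_type, extract_arguments(sentences, event_type))


doc = [
    "Acme Holdings announced a share pledge on 2020-05-01.",
    "The company pledged 3,000,000 shares to Beta Bank.",
]
event_type, table = run_pipeline(doc)
print(event_type, table)
```

Note that the arguments in the example come from two different sentences, which is the document-level difficulty the abstract highlights; the table-filling step is where scattered candidates are merged into one record.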

Electronics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 824
Author(s):  
Peng Wang ◽  
Zhenkai Deng ◽  
Ruilong Cui

Extracting financial events from numerous financial announcements is very important for helping investors make the right decisions. However, this remains challenging because event arguments are scattered across multiple sentences in a financial announcement, while most existing event extraction models only work in sentence-level scenarios. To address this problem, this paper proposes a relation-aware Transformer-based Document-level Joint Event Extraction model (TDJEE), which encodes relations between words into the context and leverages a modified Transformer to capture document-level information to fill event arguments. Meanwhile, the absence of labeled data in the financial domain can make models' extraction results unstable, which is known as the cold start problem. Furthermore, a Fonduer-based knowledge base combined with the distant supervision method is proposed to simplify event labeling and provide a high-quality labeled corpus for model training and evaluation. Experimental results on real-world Chinese financial announcements show that, compared with other models, TDJEE achieves competitive results and can effectively extract event arguments across multiple sentences.


2021 ◽  
pp. 1-13
Author(s):  
Xia Li ◽  
Qinghua Wen ◽  
Zengtao Jiao ◽  
Jiangtao Zhang

Abstract The China Conference on Knowledge Graph and Semantic Computing (CCKS) 2020 Evaluation Task 3 presented clinical named entity recognition and event extraction for Chinese electronic medical records. Two annotated data sets and some additional resources for these two subtasks were provided to participants. This evaluation competition attracted 354 teams, 46 of which successfully submitted valid results. Pre-trained language models were widely applied in this evaluation task. Data augmentation and external resources were also helpful.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Shi Wang ◽  
Zhujun Wang ◽  
Yi Jiang ◽  
Huayu Wang

In the event extraction task, a corpus may contain multiple scenarios, and an argument may play different roles under different triggers. The traditional tagging scheme can tag each word only once, so it cannot solve the problem of argument overlap. A hierarchical tagging pipeline model for Chinese corpora based on the pretrained model BERT is proposed, which obtains the relevant arguments of each event in a hierarchical way. The model uses a pipeline structure, and the event extraction task is divided into event trigger classification and argument recognition. First, the pretrained model BERT is used to generate feature vectors, which are passed to a bidirectional gated recurrent unit + conditional random field (BiGRU+CRF) model for trigger classification; then, the marked event type features are spliced into the corpus as known features and passed into BiGRU+CRF for argument recognition. We evaluated our method on DUEE, combined with data enhancement and mask operation. Experimental results show that our method improves on other baselines, which demonstrates the effectiveness of the model on Chinese corpora.
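The argument-overlap problem in this abstract can be made concrete with a small sketch. It only illustrates the tagging scheme, not the BERT and BiGRU+CRF models: the span format, the event types, the role names, and the example sentence are hypothetical. A single flat BIO sequence has one slot per token, so a token that plays two roles causes a conflict, whereas one tag sequence per event type does not.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, role)


def bio_tags(n_tokens: int, spans: List[Span]) -> List[str]:
    """Encode spans as one BIO sequence; fail if a token would need two tags."""
    tags = ["O"] * n_tokens
    for start, end, role in spans:
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"token range {start}:{end} is already tagged")
        tags[start] = f"B-{role}"
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"
    return tags


def hierarchical_tags(n_tokens: int, events: Dict[str, List[Span]]) -> Dict[str, List[str]]:
    """Encode one BIO sequence per event type, so roles never compete for a slot."""
    return {etype: bio_tags(n_tokens, spans) for etype, spans in events.items()}


tokens = ["Acme", "bought", "Beta", "and", "sued", "Gamma"]
events = {
    # "Acme" is an argument of both events, with a different role in each.
    "Acquisition": [(0, 1, "Buyer"), (2, 3, "Target")],
    "Lawsuit": [(0, 1, "Plaintiff"), (5, 6, "Defendant")],
}

# Flat tagging: all spans in one sequence, so "Acme" collides with itself.
try:
    bio_tags(len(tokens), [s for spans in events.values() for s in spans])
    flat_ok = True
except ValueError:
    flat_ok = False

per_event = hierarchical_tags(len(tokens), events)
print(flat_ok)
for etype, tags in per_event.items():
    print(etype, tags)
```

In the pipeline the abstract describes, the event type found in the first stage would decide which of these per-event sequences the argument tagger is asked to produce.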


Serial Verbs ◽  
2018 ◽  
pp. 20-54
Author(s):  
Alexandra Y. Aikhenvald

A serial verb construction is a sequence of verbs which act together as a single predicate. Serial verbs are always monoclausal and are pronounced as a single verb would be. The components of a serial verb construction share tense, aspect, modality, reality status, evidentiality, mood, and also polarity values. A serial verb construction typically refers to what can be conceptualized as one event, and one recognizable event type, in terms of cultural stereotypes available to the speakers. Serial verbs tend to share at least one argument. An overwhelming majority of serial verbs have a single overall argument structure, with the subjects, objects and obliques belonging to the whole construction. In switch-function serial verb constructions, the O (or the recipient) of the first component is the same as the S (rarely, the A) of the second one. Event-argument and resultative serial verb constructions share no arguments.


2020 ◽  
Author(s):  
Zining Yang ◽  
Siyu Zhan ◽  
Mengshu Hou ◽  
Xiaoyang Zeng ◽  
Hao Zhu

Recent pre-trained language models have achieved great success in many NLP tasks. In this paper, we propose an event extraction system based on the novel pre-trained language model BERT to extract both event triggers and arguments. As with any deep-learning-based method, the size of the training dataset has a crucial impact on performance. To address the lack of training data for event extraction, we further train the pre-trained language model on a carefully constructed in-domain corpus to inject event knowledge into our event extraction system with minimal effort. Empirical evaluation on the ACE2005 dataset shows that injecting event knowledge can significantly improve the performance of event extraction.


2021 ◽  
Author(s):  
Amir Pouran Ben Veyseh ◽  
Minh Van Nguyen ◽  
Nghia Ngo Trung ◽  
Bonan Min ◽  
Thien Huu Nguyen

Author(s):  
Amir Pouran Ben Veyseh ◽  
Franck Dernoncourt ◽  
Quan Tran ◽  
Varun Manjunatha ◽  
Lidan Wang ◽  
...  

2020 ◽  
Vol 9 (12) ◽  
pp. 712
Author(s):  
Agung Dewandaru ◽  
Dwi Hendratmo Widyantoro ◽  
Saiful Akbar

A geoparser is a fundamental component of Geographic Information Retrieval (GIR); it performs toponym recognition, disambiguation, and geographic coordinate resolution on unstructured text. However, geoparsing of news articles that report several events across many place mentions in one document is not yet adequately handled by regular geoparsers, where the scope of resolution is either toponym-level or document-level. The capacity to detect multiple events and geolocate their true coordinates along with their numerical arguments is still missing from modern geoparsers, much less in the domain of Indonesian news corpora. We propose an event geoparser model with three stages of processing, which tightly integrates an event extraction model into geoparsing and provides a precise event-level resolution scope. The model casts geotagging and event extraction as sequence labeling and uses an LSTM-CRF inferencer equipped with features derived using an Aggregated Topic Model from a large corpus to increase generalizability. Through the proposed workflow and features, the geoparser is able to significantly improve the identification of pseudo-location entities, resulting in a 23.43% increase in weighted F1 score compared to baseline gazetteer and POS tag features. As a side effect of event extraction, various numerical arguments are also extracted, and the output is easily projected to a rich choropleth map from a single news document.
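The weighted F1 score reported in this abstract can be illustrated with a short sketch. The abstract does not spell out its evaluation protocol, so this assumes the common support-weighted definition: compute F1 per label, then average with each label weighted by how often it occurs in the gold labels. The label names and toy data are hypothetical.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Support-weighted average of per-label F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    support = Counter(y_true)  # gold count per label
    total = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0  # F1 = 2TP / (2TP + FP + FN)
        total += n * f1
    return total / len(y_true)


gold = ["LOC", "LOC", "O", "O", "O", "EVT"]
pred = ["LOC", "O", "O", "O", "EVT", "EVT"]
score = weighted_f1(gold, pred)
print(round(score, 4))
```

Because the weights follow label frequency, frequent labels dominate the score; a macro average would instead weight every label equally.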

