Document-level Event Extraction with Efficient End-to-end Learning of Cross-event Dependencies

Abstract Document-level financial event extraction (DFEE) is the task of detecting events and extracting the corresponding event arguments in financial documents, and it plays an important role in information extraction in the financial domain. The task is challenging because financial documents are generally long and the arguments of one event may be scattered across different sentences. To address this issue, we propose a novel Prior Information Enhanced Extraction framework (PIEE) for DFEE, leveraging prior information from both event types and pre-trained language models. Specifically, PIEE consists of three components: event detection, event argument extraction, and event table filling. In event detection, we identify the event type. The event type is then used explicitly for event argument extraction. Meanwhile, the implicit information within language models also provides considerable cues for localizing event arguments. Finally, all the event arguments are filled into an event table by a set of predefined heuristic rules. To demonstrate the effectiveness of our proposed framework, we participated in the shared task CCKS2020 Task5-2: Document-level Event Arguments Extraction. On both Leaderboard A and Leaderboard B, PIEE takes first place and significantly outperforms the other systems.

Variational Deep Logic Network for Joint Inference of Entities and Relations

Computational Linguistics ◽

10.1162/coli_a_00415 ◽

2021 ◽

pp. 1-38

Author(s):

Wenya Wang ◽

Sinno Jialin Pan

Keyword(s):

Deep Learning ◽

Representation Learning ◽

Event Extraction ◽

Relational Reasoning ◽

Entity Extraction ◽

Learning Models ◽

Joint Inference ◽

Logic Network ◽

End To End ◽

Relation Prediction

Abstract Deep learning models have been widely adopted and have achieved promising results in various application domains. Despite their intriguing performance, most deep learning models function as black boxes, lacking the explicit reasoning capabilities and explanations that are usually essential for complex problems. Take joint inference in information extraction as an example. This task requires identifying multiple inter-correlated pieces of structured knowledge from text, including entities, events, and the relationships between them. Various deep neural networks have been proposed to jointly perform entity extraction and relation prediction, but they only propagate information implicitly via representation learning and fail to encode the intensive correlations between entity types and relations to enforce their co-existence. On the other hand, some approaches adopt rules to explicitly constrain certain relational facts. However, separating the rules from representation learning usually leaves these approaches prone to error propagation. Moreover, the pre-defined rules are inflexible and might have negative effects when the data is noisy. To address these limitations, we propose a variational deep logic network that incorporates both representation learning and relational reasoning via the variational EM algorithm. The model consists of a deep neural network that learns high-level features with implicit interactions via the self-attention mechanism and a relational logic network that explicitly exploits target interactions. These two components are trained interactively to bring the best of both worlds. We conduct extensive experiments spanning fine-grained sentiment term extraction, end-to-end relation prediction, and end-to-end event extraction to demonstrate the effectiveness of our proposed method.

Event Geoparser with Pseudo-Location Entity Identification and Numerical Argument Extraction Implementation and Evaluation in Indonesian News Domain

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi9120712 ◽

2020 ◽

Vol 9 (12) ◽

pp. 712

Author(s):

Agung Dewandaru ◽

Dwi Hendratmo Widyantoro ◽

Saiful Akbar

Keyword(s):

Topic Model ◽

Event Extraction ◽

Geographic Information Retrieval ◽

Unstructured Text ◽

Three Stages ◽

Entity Identification ◽

Choropleth Map ◽

Extraction Model ◽

Document Level ◽

Large Corpus

A geoparser is a fundamental component of a Geographic Information Retrieval (GIR) system, which performs toponym recognition, disambiguation, and geographic coordinate resolution from unstructured text. However, geoparsing of news articles that report several events across many place mentions in the document is not yet adequately handled by regular geoparsers, where the scope of resolution is either toponym-level or document-level. The capacity to detect multiple events and geolocate their true coordinates along with their numerical arguments is still missing from modern geoparsers, much less in the Indonesian news domain. We propose an event geoparser model with three stages of processing, which tightly integrates an event extraction model into geoparsing and provides a precise event-level resolution scope. The model casts geotagging and event extraction as sequence labeling and uses an LSTM-CRF inferencer equipped with features derived using an Aggregated Topic Model from a large corpus to increase generalizability. Through the proposed workflow and features, the geoparser is able to significantly improve the identification of pseudo-location entities, resulting in a 23.43% increase in weighted F1 score compared to baseline gazetteer and POS tag features. As a side effect of event extraction, various numerical arguments are also extracted, and the output is easily projected to a rich choropleth map from a single news document.
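
To make the sequence-labeling formulation in this abstract concrete, the following is a minimal sketch of how per-token BIO tags, such as those a tagger like an LSTM-CRF might emit, can be turned back into labeled spans. The tag names (`EVENT`, `LOC`, `NUM`), the example sentence, and the function `decode_bio` are all hypothetical illustrations and are not taken from the paper.

```python
from typing import List, Tuple


def decode_bio(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (label, text) spans from a BIO-tagged token sequence.

    Assumed tag scheme: "B-X" opens a span of type X, "I-X" continues
    an open span of the same type, and anything else closes the span.
    """
    spans: List[Tuple[str, str]] = []
    label = None          # type of the span currently being built
    words: List[str] = []  # tokens of the span currently being built

    def flush() -> None:
        # Emit the open span, if any.
        if label is not None:
            spans.append((label, " ".join(words)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            words.append(token)
        else:
            # "O" or an inconsistent "I-" tag: close the open span.
            flush()
            label, words = None, []
    flush()
    return spans


# Hypothetical example sentence and tags.
tokens = ["Flood", "hits", "East", "Jakarta", ",", "12", "injured"]
tags = ["B-EVENT", "O", "B-LOC", "I-LOC", "O", "B-NUM", "O"]
print(decode_bio(tokens, tags))
```

In this sketch an `I-` tag that does not continue an open span of the same type is treated like `O`; other conventions (for example, promoting it to a `B-` tag) are equally plausible.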

TDJEE: A Document-Level Joint Model for Financial Event Extraction

Electronics ◽

10.3390/electronics10070824 ◽

2021 ◽

Vol 10 (7) ◽

pp. 824

Author(s):

Peng Wang ◽

Zhenkai Deng ◽

Ruilong Cui

Keyword(s):

Event Extraction ◽

Distant Supervision ◽

Sentence Level ◽

Financial Domain ◽

Level Information ◽

Financial Events ◽

Model Training ◽

Joint Event ◽

Extraction Model ◽

Document Level

Extracting financial events from numerous financial announcements is very important for investors to make the right decisions. However, this remains challenging because event arguments are typically scattered across multiple sentences in a financial announcement, while most existing event extraction models only work in sentence-level scenarios. To address this problem, this paper proposes a relation-aware Transformer-based Document-level Joint Event Extraction model (TDJEE), which encodes relations between words into the context and leverages a modified Transformer to capture document-level information for filling event arguments. Meanwhile, the absence of labeled data in the financial domain can make models' extraction results unstable, which is known as the cold start problem. Furthermore, a Fonduer-based knowledge base combined with the distant supervision method is proposed to simplify event labeling and provide a high-quality labeled corpus for model training and evaluation. Experimental results on real-world Chinese financial announcements show that, compared with other models, TDJEE achieves competitive results and can effectively extract event arguments across multiple sentences.

Exploring Sentence Community for Document-Level Event Extraction

10.18653/v1/2021.findings-emnlp.32 ◽

2021 ◽

Author(s):

Yusheng Huang ◽

Weijia Jia

Keyword(s):

Event Extraction ◽

Document Level

DCFEE: A Document-level Chinese Financial Event Extraction System based on Automatically Labeled Training Data

10.18653/v1/p18-4009 ◽

2018 ◽

Cited By ~ 7

Author(s):

Hang Yang ◽

Yubo Chen ◽

Kang Liu ◽

Yang Xiao ◽

Jun Zhao

Keyword(s):

Event Extraction ◽

Training Data ◽

Extraction System ◽

Document Level

LSTM-Based End-to-End Framework for Biomedical Event Extraction

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2019.2916346 ◽

2020 ◽

Vol 17 (6) ◽

pp. 2029-2039

Author(s):

Xinyi Yu ◽

Wenge Rong ◽

Jingshuang Liu ◽

Deyu Zhou ◽

Yuanxin Ouyang ◽

...

Keyword(s):

Event Extraction ◽

Biomedical Event Extraction ◽

End To End

DeepEventMine: end-to-end neural nested event extraction from biomedical texts

Bioinformatics ◽

10.1093/bioinformatics/btaa540 ◽

2020 ◽

Vol 36 (19) ◽

pp. 4910-4917

Author(s):

Hai-Long Trieu ◽

Thy Thy Tran ◽

Khoa N A Duong ◽

Anh Nguyen ◽

Makoto Miwa ◽

...

Keyword(s):

Directed Acyclic Graph ◽

State Of The Art ◽

Event Extraction ◽

Supplementary Information ◽

Supplementary Data ◽

General Domain ◽

Acyclic Graph ◽

End To End ◽

Biomedical Texts ◽

Extraction Model

Abstract Motivation: Recent neural approaches to event extraction from text mainly focus on flat events in the general domain, while there are fewer attempts to detect nested and overlapping events. These existing systems are built on given entities and depend on external syntactic tools. Results: We propose an end-to-end neural nested event extraction model named DeepEventMine that extracts multiple overlapping directed acyclic graph structures from a raw sentence. On top of the bidirectional encoder representations from transformers model, our model detects nested entities and triggers, roles, nested events, and their modifications in an end-to-end manner without any syntactic tools. Our DeepEventMine model achieves new state-of-the-art performance on seven biomedical nested event extraction tasks. Even when gold entities are unavailable, our model can detect events from raw text with promising performance. Availability and implementation: Our codes and models to reproduce the results are available at: https://github.com/aistairc/DeepEventMine. Supplementary information: Supplementary data are available at Bioinformatics online.
