QA4IE: A Question Answering Based Framework for Information Extraction

Entity and relation recognition, i.e., assigning semantic classes (e.g., person, organization, and location) to entities in a given sentence and determining the relations (e.g., born-in and employee-of) that hold between the corresponding entities, is an important task in areas such as information extraction (IE) (Califf and Mooney, 1999; Chinchor, 1998; Freitag, 2000; Roth and Yih, 2001), question answering (QA) (Voorhees, 2000; Changki Lee et al., 2007), and story comprehension (Hirschman et al., 1999). In a QA system, many questions ask for the specific entities involved in some relation. For example, the question “Where was Poe born?” in TREC-9 asks for the location entity in which Poe was born. In a typical IE task such as constructing a jobs database from unstructured text, the system has to extract many meaningful entities, such as title and salary, and ideally determine whether the entities are associated with the same position.

A question answering system supported by information extraction

10.3115/974147.974170 ◽

2000 ◽

Cited By ~ 21

Author(s):

Rohini Srihari ◽

Wei Li

Keyword(s):

Information Extraction ◽

Question Answering ◽

Question Answering System

Building Graph for Events and Time in Natural Language Text

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8419.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 581-586

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Information Extraction ◽

Language Processing ◽

Question Answering ◽

Relation Extraction ◽

Event Extraction ◽

Event Time ◽

Time Graph ◽

Question Answering Systems

Events and time are two key concepts in natural language processing; because of the many event-oriented tasks, they have become essential in information extraction. In natural language processing and information extraction or retrieval, events and time underpin several applications such as text summarization, document summarization, and question answering systems. In this paper, we present the event-time graph as a new way of constructing event-time information from text. In this event-time graph, nodes are events, whereas edges represent the temporal and co-reference relations between events. Much previous research in natural language processing focused on individual extraction tasks in a domain-specific way; in this work we present the extraction and representation of the relationships between events and time through event-time graph construction. Our overall system is a three-step process that performs event extraction, time extraction, and relation representation. Each step is at a performance level comparable with the state of the art. We present event extraction on the MUC data corpus annotated with event mentions, on which we train and evaluate our model. Next, we present time extraction, with the model tested on several news articles from a Wikipedia corpus. We then represent event-time relations by constructing event-time graphs. Finally, we evaluate the overall quality of the event graphs with evaluation metrics and conclude with observations on the entire work.
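The graph structure described in this abstract (events as nodes, temporal and co-reference relations as edges) can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the class names `Event` and `EventTimeGraph`, the relation labels `BEFORE` and `COREF`, and the sample events and dates are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    """A graph node: one event mention (hypothetical schema)."""
    event_id: str
    trigger: str                 # word or phrase that signals the event
    time: Optional[str] = None   # extracted time expression, if any


@dataclass
class EventTimeGraph:
    """Events as nodes; labeled edges for temporal/co-reference relations."""
    events: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source_id, relation, target_id)

    def add_event(self, event: Event) -> None:
        self.events[event.event_id] = event

    def add_relation(self, source_id: str, relation: str, target_id: str) -> None:
        # Both endpoints must already be nodes in the graph.
        if source_id not in self.events or target_id not in self.events:
            raise KeyError("both events must be added before relating them")
        self.edges.append((source_id, relation, target_id))

    def related(self, event_id: str, relation: str) -> list:
        """Return ids of events reachable from event_id via one labeled edge."""
        return [t for s, r, t in self.edges if s == event_id and r == relation]


# Invented sample data, for illustration only.
graph = EventTimeGraph()
graph.add_event(Event("e1", "announced", "2020-01-10"))
graph.add_event(Event("e2", "launched", "2020-03-01"))
graph.add_event(Event("e3", "the announcement"))
graph.add_relation("e1", "BEFORE", "e2")   # temporal edge
graph.add_relation("e3", "COREF", "e1")    # co-reference edge
```

An edge list keeps the sketch short; a real system would likely need indexed adjacency and a richer inventory of temporal relations.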

Question Answering and Information Extraction from Texts

Advances in Intelligent Systems ◽

10.1007/978-94-011-4840-5_11 ◽

1999 ◽

pp. 121-130

Author(s):

J. Kontos ◽

I. Malagardi

Keyword(s):

Information Extraction ◽

Question Answering

RPT

Proceedings of the VLDB Endowment ◽

10.14778/3457390.3457391 ◽

2021 ◽

Vol 14 (8) ◽

pp. 1254-1261

Author(s):

Nan Tang ◽

Ju Fan ◽

Fangyi Li ◽

Jianhong Tu ◽

Xiaoyong Du ◽

...

Keyword(s):

Information Extraction ◽

Question Answering ◽

Data Cleaning ◽

Schema Matching ◽

Data Preparation ◽

Denoising Autoencoder ◽

Data Annotation ◽

Hard Data ◽

Wide Range ◽

Collaborative Training

Can AI help automate human-easy but computer-hard data preparation tasks that burden data scientists, practitioners, and crowd workers? We answer this question by presenting RPT, a denoising autoencoder for tuple-to-X models ("X" could be tuple, token, label, JSON, and so on). RPT is pre-trained for a tuple-to-tuple model by corrupting the input tuple and then learning a model to reconstruct the original tuple. It adopts a Transformer-based neural translation architecture that consists of a bidirectional encoder (similar to BERT) and a left-to-right autoregressive decoder (similar to GPT), leading to a generalization of both BERT and GPT. The pre-trained RPT can already support several common data preparation tasks such as data cleaning, auto-completion and schema matching. Better still, RPT can be fine-tuned on a wide range of data preparation tasks, such as value normalization, data transformation, data annotation, etc. To complement RPT, we also discuss several appealing techniques such as collaborative training and few-shot learning for entity resolution, and few-shot learning and NLP question-answering for information extraction. In addition, we identify a series of research opportunities to advance the field of data preparation.
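The pre-training idea in this abstract (corrupt an input tuple, then learn to reconstruct the original) can be illustrated by the data-preparation half alone. The sketch below only builds (corrupted, original) training pairs by masking attribute values; it does not implement the Transformer model, and the mask token `[M]`, the function name `corrupt_tuple`, and the sample row are assumptions for illustration rather than details taken from the paper.

```python
import random

MASK = "[M]"  # hypothetical mask token


def corrupt_tuple(row, rng, num_masked=1):
    """Build one denoising training pair from a tuple.

    Masks `num_masked` randomly chosen attribute values and returns
    (corrupted_row, original_row). A model would be trained to map
    the first element back to the second.
    """
    masked_keys = set(rng.sample(sorted(row), num_masked))
    corrupted = {k: (MASK if k in masked_keys else v) for k, v in row.items()}
    return corrupted, dict(row)


# Invented sample tuple, for illustration only.
row = {"name": "Ada Example", "expertise": "Databases", "city": "Springfield"}
corrupted, original = corrupt_tuple(row, random.Random(0), num_masked=1)
```

Passing an explicit `random.Random` instance keeps the corruption reproducible, which helps when comparing runs.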

Building an Information Extraction and Question Answering Model for Text Based on the Human Brain Process

Polibits ◽

10.17562/pb-57-10 ◽

2018 ◽

Vol 57 ◽

pp. 89-92

Author(s):

F. A. K. Hemant

Keyword(s):

Human Brain ◽

Information Extraction ◽

Question Answering ◽

Brain Process

Adapting Open Information Extraction to Domain-Specific Relations

AI Magazine ◽

10.1609/aimag.v31i3.2305 ◽

2010 ◽

Vol 31 (3) ◽

pp. 93 ◽

Cited By ~ 23

Author(s):

Stephen Soderland ◽

Brendan Roof ◽

Bo Qin ◽

Shi Xu ◽

Mausam ◽

...

Keyword(s):

Information Extraction ◽

Question Answering ◽

Free Text ◽

New Paradigm ◽

Target Domain ◽

Domain Specific ◽

Text Corpora ◽

Open Information Extraction ◽

Training Examples ◽

Domain Independent

Information extraction (IE) can identify a set of relations from free text to support question answering (QA). Until recently, IE systems were domain-specific and needed a combination of manual engineering and supervised learning to adapt to each target domain. A new paradigm, Open IE, operates on large text corpora without any manual tagging of relations, and indeed without any pre-specified relations. Due to its open-domain and open-relation nature, Open IE is purely textual and is unable to relate the surface forms to an ontology, even if one is known in advance. We explore the steps needed to adapt Open IE to a domain-specific ontology and demonstrate our approach of mapping domain-independent tuples to an ontology using domains from DARPA’s Machine Reading Project. Our system achieves precision over 0.90 from as few as 8 training examples for an NFL-scoring domain.

Information Extraction from Web as Knowledge Resources for Indonesian Question Answering System

Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019) ◽

10.2991/aisr.k.200424.064 ◽

2020 ◽

Author(s):

Abdiansah ABDIANSAH ◽

Alvi Syahrini UTAMI

Keyword(s):

Information Extraction ◽

Question Answering ◽

Knowledge Resources ◽

Question Answering System

Temporal Expressions in Polish Corpus KPWr

Cognitive Studies | Études cognitives ◽

10.11649/cs.2015.020 ◽

2015 ◽

pp. 293-317

Author(s):

Jan Kocoń ◽

Michał Marcińczuk ◽

Marcin Oleksy ◽

Tomasz Bernaś ◽

Michał Wolski

Keyword(s):

Discourse Analysis ◽

Information Extraction ◽

Question Answering ◽

State Of The Art ◽

Event Recognition ◽

Event Identification ◽

Event Description ◽

University Of Technology ◽

Temporal Expressions ◽

Source Of Information

This article presents the results of recent research on the interpretation of Polish expressions that refer to time. These expressions are a source of information about when something happens, how often something occurs, or how long something lasts. Temporal information, which can be extracted from text automatically, plays a significant role in many information extraction systems, such as question answering, discourse analysis, event recognition, and many more. We prepared PLIMEX, a broad description of Polish temporal expressions with annotation guidelines, based on the state-of-the-art solutions for English, mainly the TimeML specification. We also adapted the solution to capture the local semantics of temporal expressions, called LTIMEX. Temporal description also supports further event identification and extends the event description model, focusing on anchoring events in time, ordering events, and reasoning about the persistence of events. We prepared a specification designed to address these issues, and we annotated all documents in the Polish Corpus of Wroclaw University of Technology (KPWr) using our annotation guidelines.
