scriptLattes: an open-source knowledge extraction system from the Lattes platform

2009 ◽  
Vol 15 (4) ◽  
pp. 31-39 ◽  
Author(s):  
Jesús Pascual Mena-Chalco ◽  
Roberto Marcondes Cesar Junior

2017 ◽  
Vol 24 (6) ◽  
pp. 1062-1071 ◽  
Author(s):  
Tian Kang ◽  
Shaodian Zhang ◽  
Youlan Tang ◽  
Gregory W Hruby ◽  
Alexander Rusanov ◽  
...  

Abstract
Objective: To develop an open-source information extraction system called Eligibility Criteria Information Extraction (EliIE) for parsing and formalizing free-text clinical research eligibility criteria (EC) following Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) version 5.0.
Materials and Methods: EliIE parses EC in 4 steps: (1) clinical entity and attribute recognition, (2) negation detection, (3) relation extraction, and (4) concept normalization and output structuring. Informaticians and domain experts were recruited to design an annotation guideline and generate a training corpus of annotated EC for 230 Alzheimer’s clinical trials, which were represented as queries against the OMOP CDM and included 8008 entities, 3550 attributes, and 3529 relations. A sequence labeling–based method was developed for automatic entity and attribute recognition. Negation detection was supported by NegEx and a set of predefined rules. Relation extraction was achieved by a support vector machine classifier. We further performed terminology-based concept normalization and output structuring.
Results: In task-specific evaluations, the best F1 score for entity recognition was 0.79, and for relation extraction was 0.89. The accuracy of negation detection was 0.94. The overall accuracy for query formalization was 0.71 in an end-to-end evaluation.
Conclusions: This study presents EliIE, an OMOP CDM–based information extraction system for automatic structuring and formalization of free-text EC. According to our evaluation, machine learning-based EliIE outperforms existing systems and shows promise to improve.
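
The abstract states that negation detection was supported by NegEx and a set of predefined rules, but does not list them. As an illustration only, a NegEx-style check can be sketched in Python as a search for a negation trigger within a fixed window of tokens before the entity. The trigger list, the window size, and the names `tokenize` and `is_negated` are assumptions made for this example; they are not EliIE's rules or the actual NegEx lexicon.

```python
import re

# Assumed, deliberately tiny trigger list; the real NegEx lexicon is far larger.
NEGATION_TRIGGERS = ("no", "without", "absence of", "free of", "denies")
# Assumed scope: a trigger negates entities that start within this many tokens after it.
WINDOW = 5


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def is_negated(sentence, entity):
    """Return True if any occurrence of `entity` is preceded by a trigger
    within WINDOW tokens. This is a simplified stand-in for NegEx."""
    tokens = tokenize(sentence)
    entity_tokens = tokenize(entity)
    n = len(entity_tokens)
    if n == 0:
        return False
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] != entity_tokens:
            continue
        # Text of the tokens immediately preceding this occurrence.
        preceding = " ".join(tokens[max(0, start - WINDOW):start])
        for trigger in NEGATION_TRIGGERS:
            if re.search(r"\b" + re.escape(trigger) + r"\b", preceding):
                return True
    return False


if __name__ == "__main__":
    print(is_negated("No history of stroke or seizure", "stroke"))
    print(is_negated("History of stroke", "stroke"))
```

A fuller rule set would also handle triggers that follow the entity and terms that end a negation scope (such as "but"); this sketch leaves both out.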


Author(s):  
Mouhcine El Hassani ◽  
Noureddine Falih ◽  
Belaid Bouikhalene

As information becomes increasingly abundant and accessible on the web, researchers no longer need to dig through books in libraries. Instead, they need a knowledge extraction system from text (KEST). The goal of the authors in this chapter is to identify what a person needs when searching a text, which may be unstructured, to retrieve the terms related to the research subject, and then to structure those terms into classes of useful information. These needs can then be used to identify the general architecture of a system for retrieving information from text documents so that it can be developed, and finally to identify the parameters for evaluating its performance and the results it retrieves.


2017 ◽  
Vol 10 ◽  
pp. 829-840
Author(s):  
Taniana Rodriguez ◽  
Jose Aguilar ◽  
Alexandra Gonzalez
