Integrating natural language processing with image document analysis: what we learned from two real-world applications

Author(s):  
Jinying Chen ◽  
Huaigu Cao ◽  
Premkumar Natarajan
Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1243-P
Author(s):  
Jianmin Wu ◽  
Fritha J. Morrison ◽  
Zhenxiang Zhao ◽  
Xuanyao He ◽  
Maria Shubina ◽  
...  

Author(s):  
John Carroll

This article introduces the concepts and techniques of natural language (NL) parsing, that is, using a grammar to assign a syntactic analysis to a string of words or to a lattice of word hypotheses output by a speech recognizer or similar system. The level of detail required depends on the language processing task being performed and on the particular approach to the task being pursued. The article first describes approaches that produce ‘shallow’ analyses, then outlines approaches that analyse the input in terms of labelled dependencies between words. Producing hierarchical phrase structure requires grammars with at least context-free (CF) power, and the CF algorithms widely used in NL parsing are described. To support detailed semantic interpretation, more powerful grammar formalisms are required; these are usually parsed using extensions of CF parsing algorithms. The article also describes unification-based parsing. Finally, it discusses three important issues that have to be tackled in real-world applications of parsing: evaluation of parser accuracy, parser efficiency, and measurement of grammar/parser coverage.
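Among the widely used CF parsing algorithms the article refers to are chart-based methods such as CYK. As an illustrative sketch (the toy grammar and lexicon below are my own, not taken from the article), a minimal CYK recognizer for a grammar in Chomsky normal form:

```python
import re

# Toy grammar in Chomsky normal form: each rule rewrites to exactly
# two nonterminals. Grammar and lexicon are illustrative only.
GRAMMAR = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("Det", "N")],
}
LEXICON = {
    "the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"},
}

def cyk_recognize(words):
    """Return True if the grammar derives the word string."""
    n = len(words)
    # table[i][j] holds the nonterminals that derive words[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(LEXICON.get(re.sub(r"\W", "", w.lower()), ()))
    for span in range(2, n + 1):          # widen the span bottom-up
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):         # try every split point
                for lhs, rules in GRAMMAR.items():
                    for b, c in rules:
                        if b in table[i][k] and c in table[k + 1][j]:
                            table[i][j].add(lhs)
    return "S" in table[0][n - 1]
```

The same chart can be extended with back-pointers to recover parse trees, or with probabilities for the statistical variants the article alludes to.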


Author(s):  
Xiaoyu Lin ◽  
Yingxu Wang

Concept algebra (CA) is a denotational mathematics for formal knowledge manipulation and natural language processing. To explicitly demonstrate the mathematical models of formal concepts and their algebraic operations in CA, a simulation and visualization tool known as the Visual Simulator of Concept Algebra (VSCA) has been developed in the MATLAB environment. This paper presents the design and implementation of VSCA and the theories underpinning its development. Visual simulations of the sets of reproductive and compositional operations of CA are demonstrated through real-world examples throughout the elaborations of CA and VSCA.


2020 ◽  
Vol 58 (7) ◽  
pp. 1227-1255
Author(s):  
Glenn Gordon Smith ◽  
Robert Haworth ◽  
Slavko Žitnik

We investigated how Natural Language Processing (NLP) algorithms could automatically grade answers to open-ended inference questions in web-based eBooks. This work is one component of research on making reading more motivating to children and on increasing their comprehension. We obtained and graded a set of answers to open-ended questions embedded in a fiction novel written in English. Computer science students used a subset of the graded answers to develop algorithms designed to grade new answers to the questions. The algorithms used the story text, the existing graded answers for a given question, and publicly accessible databases to grade new responses. A computer science professor used another subset of the graded answers to evaluate the students’ NLP algorithms and to select the best one. The results showed that the best algorithm correctly graded approximately 85% of the real-world answers as correct, partly correct, or wrong. The best NLP algorithm was then trained with questions and graded answers from a series of new text narratives in another language, Slovenian. The resulting model was successfully used in fourth-grade language arts classes to provide feedback on students’ answers to open-ended questions in eBooks.
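At its core, this kind of grading can be framed as similarity matching against already-graded reference answers. The Jaccard-overlap scoring and the `hi`/`lo` thresholds below are illustrative assumptions of mine, not the study’s actual algorithm (which also drew on the story text and external databases):

```python
import re

def tokenize(text):
    """Lowercase word tokens as a set, for overlap comparison."""
    return set(re.findall(r"[a-z']+", text.lower()))

def grade_answer(answer, reference_answers, hi=0.6, lo=0.25):
    """Grade a new answer by its best Jaccard overlap with any
    reference answer already graded as correct. The thresholds are
    hypothetical and would need tuning on held-out graded answers."""
    ans = tokenize(answer)
    if not ans:
        return "wrong"
    best = max(
        (len(ans & tokenize(r)) / len(ans | tokenize(r))
         for r in reference_answers),
        default=0.0,
    )
    if best >= hi:
        return "correct"
    if best >= lo:
        return "partly correct"
    return "wrong"
```

A real system would replace the bag-of-words overlap with semantic similarity so that paraphrased answers are not penalized, which is presumably why the study’s algorithms also consulted external databases.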


2020 ◽  
Vol 23 (1) ◽  
pp. 21-26 ◽  
Author(s):  
Nemanja Vaci ◽  
Qiang Liu ◽  
Andrey Kormilitzin ◽  
Franco De Crescenzo ◽  
Ayse Kurtulmus ◽  
...  

Background: Utilisation of routinely collected electronic health records from secondary care offers unprecedented possibilities for medical science research but can also present difficulties. One key issue is that medical information is presented as free-form text and therefore requires a time commitment from clinicians to manually extract salient information. Natural language processing (NLP) methods can be used to extract clinically relevant information automatically.

Objective: Our aim is to use NLP to capture real-world data on individuals with depression from Clinical Record Interactive Search (CRIS) clinical text, fostering the use of electronic healthcare data in mental health research.

Methods: We used a combination of methods to extract salient information from electronic health records. First, clinical experts defined the information of interest and built the training and testing corpora for statistical models. Second, we built and fine-tuned the statistical models using active learning procedures.

Findings: The results show a high degree of accuracy in the extraction of drug-related information but a much lower degree of accuracy for auxiliary variables. In combination with state-of-the-art active learning paradigms, the performance of the models increases considerably.

Conclusions: This study illustrates the feasibility of using NLP models and proposes a research pipeline for accurately extracting information from electronic health records.

Clinical implications: Real-world, individual patient data are an invaluable source of information that can be used to better personalise treatment.
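The active learning procedure mentioned in the Methods can be sketched as uncertainty sampling: at each round, the unlabelled records the current model is least sure about are the ones sent to clinical annotators. The function below is a hypothetical illustration (`predict_proba` stands in for any binary classifier’s probability estimate), not the CRIS pipeline itself:

```python
def select_for_annotation(pool, predict_proba, batch_size=5):
    """Uncertainty sampling: return the unlabelled examples whose
    predicted probability of the positive class is closest to 0.5,
    i.e. where the model is least certain. Annotating these examples
    tends to improve the model faster than random sampling."""
    scored = sorted(pool, key=lambda x: abs(predict_proba(x) - 0.5))
    return scored[:batch_size]
```

After each batch is annotated, the model is retrained and the selection repeated, which matches the iterative build-and-fine-tune loop the abstract describes.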


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Julián Ramírez Sánchez ◽  
Alejandra Campo-Archbold ◽  
Andrés Zapata Rozo ◽  
Daniel Díaz-López ◽  
Javier Pastor-Galindo ◽  
...  

Among the myriad applications of natural language processing (NLP), assisting law enforcement agencies (LEAs) in detecting and preventing cybercrimes is one of the most recent and promising. The promotion of violence or hate by digital means is considered a cybercrime because it leverages cyberspace to support illegal activities in the real world. This paper proposes a solution that uses neural network (NN) based NLP to monitor suspicious activities in social networks, making it possible to identify and prevent related cybercrimes. An LEA can find similar posts grouped into clusters, determine their level of polarity, and identify a subset of user accounts that promote violent activities, to be reviewed extensively as part of an effort to prevent crimes and, specifically, hostile social manipulation (HSM). Several experiments were conducted to demonstrate the feasibility of the proposal.
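The grouping and polarity steps can be sketched as follows. The greedy cosine clustering and the tiny violence lexicon are illustrative assumptions of mine; the paper’s solution uses NN-based NLP rather than this bag-of-words simplification:

```python
import re
from collections import Counter

# Illustrative lexicon only; a real system would use a trained model.
VIOLENT_LEXICON = {"attack", "destroy", "kill", "hate"}

def vectorize(post):
    """Bag-of-words term frequencies for one post."""
    return Counter(re.findall(r"[a-z]+", post.lower()))

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = lambda v: sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(a) * norm(b)) if a and b else 0.0

def cluster_posts(posts, threshold=0.5):
    """Greedy single-pass clustering: a post joins the first cluster
    whose seed post is similar enough, otherwise starts a new one."""
    clusters = []
    for post in posts:
        vec = vectorize(post)
        for cl in clusters:
            if cosine(vec, cl["seed"]) >= threshold:
                cl["posts"].append(post)
                break
        else:
            clusters.append({"seed": vec, "posts": [post]})
    return clusters

def polarity(post):
    """Crude polarity score: fraction of tokens from the lexicon."""
    toks = re.findall(r"[a-z]+", post.lower())
    return sum(t in VIOLENT_LEXICON for t in toks) / len(toks) if toks else 0.0
```

An analyst could then rank clusters by their mean polarity and flag the accounts that dominate the highest-scoring clusters for further review.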

