Integrating natural language processing with image document analysis: what we learned from two real-world applications

Author(s):  
Jinying Chen ◽  
Huaigu Cao ◽  
Premkumar Natarajan
Diabetes ◽  
2019 ◽  
Vol 68 (Supplement 1) ◽  
pp. 1243-P
Author(s):  
Jianmin Wu ◽  
Fritha J. Morrison ◽  
Zhenxiang Zhao ◽  
Xuanyao He ◽  
Maria Shubina ◽  
...  

Author(s):  
John Carroll

This article introduces the concepts and techniques of natural language (NL) parsing, that is, using a grammar to assign a syntactic analysis to a string of words or to a lattice of word hypotheses output by a speech recognizer or similar system. The level of detail required depends on the language processing task being performed and on the particular approach to the task being pursued. The article first describes approaches that produce ‘shallow’ analyses, then outlines approaches that analyse the input in terms of labelled dependencies between words. Producing hierarchical phrase structure requires grammars with at least context-free (CF) power, and the CF algorithms widely used in NL parsing are described. To support detailed semantic interpretation, more powerful grammar formalisms are required; these are usually parsed using extensions of CF parsing algorithms. The article also describes unification-based parsing. Finally, it discusses three important issues that have to be tackled in real-world applications of parsing: evaluation of parser accuracy, parser efficiency, and measurement of grammar/parser coverage.
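Among the widely used CF parsing algorithms the article refers to are chart-based methods such as CYK. As an illustrative sketch (the toy grammar and lexicon below are my own, not taken from the article), a minimal CYK recognizer for a grammar in Chomsky normal form:

```python
import re

# Toy grammar in Chomsky normal form: each rule rewrites to exactly
# two nonterminals. Grammar and lexicon are illustrative only.
GRAMMAR = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP")],
    "NP": [("Det", "N")],
}
LEXICON = {
    "the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"},
}

def cyk_recognize(words):
    """Return True if the grammar derives the word string."""
    n = len(words)
    # table[i][j] holds the nonterminals that derive words[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(LEXICON.get(re.sub(r"\W", "", w.lower()), ()))
    for span in range(2, n + 1):          # widen the span bottom-up
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):         # try every split point
                for lhs, rules in GRAMMAR.items():
                    for b, c in rules:
                        if b in table[i][k] and c in table[k + 1][j]:
                            table[i][j].add(lhs)
    return "S" in table[0][n - 1]
```

The same chart can be extended with back-pointers to recover parse trees, or with probabilities for the statistical variants the article alludes to.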


Author(s):  
Xiaoyu Lin ◽  
Yingxu Wang

Concept algebra (CA) is a denotational mathematics for formal knowledge manipulation and natural language processing. To explicitly demonstrate the mathematical models of formal concepts and their algebraic operations in CA, a simulation and visualization tool known as the Visual Simulator of Concept Algebra (VSCA) has been developed in the MATLAB environment. This paper presents the design and implementation of VSCA and the theories underpinning its development. Visual simulations of the sets of reproductive and compositional operations of CA are demonstrated through real-world examples throughout the elaborations of CA and VSCA.


2020 ◽  
Vol 58 (7) ◽  
pp. 1227-1255
Author(s):  
Glenn Gordon Smith ◽  
Robert Haworth ◽  
Slavko Žitnik

We investigated how Natural Language Processing (NLP) algorithms could automatically grade answers to open-ended inference questions in web-based eBooks. This work is one component of research on making reading more motivating to children and on increasing their comprehension. We obtained and graded a set of answers to open-ended questions embedded in a fiction novel written in English. Computer science students used a subset of the graded answers to develop algorithms designed to grade new answers to the questions. The algorithms used the story text, the existing graded answers for a given question, and publicly accessible databases to grade new responses. A computer science professor used another subset of the graded answers to evaluate the students’ NLP algorithms and to select the best one. The results showed that the best algorithm correctly graded approximately 85% of the real-world answers as correct, partly correct, or wrong. The best NLP algorithm was then trained with questions and graded answers from a series of new text narratives in another language, Slovenian. The resulting model was successfully used in fourth-grade language arts classes to provide feedback on students’ answers to open-ended questions in eBooks.
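At its core, this kind of grading can be framed as similarity matching against already-graded reference answers. The Jaccard-overlap scoring and the `hi`/`lo` thresholds below are illustrative assumptions of mine, not the study’s actual algorithm (which also drew on the story text and external databases):

```python
import re

def tokenize(text):
    """Lowercase word tokens as a set, for overlap comparison."""
    return set(re.findall(r"[a-z']+", text.lower()))

def grade_answer(answer, reference_answers, hi=0.6, lo=0.25):
    """Grade a new answer by its best Jaccard overlap with any
    reference answer already graded as correct. The thresholds are
    hypothetical and would need tuning on held-out graded answers."""
    ans = tokenize(answer)
    if not ans:
        return "wrong"
    best = max(
        (len(ans & tokenize(r)) / len(ans | tokenize(r))
         for r in reference_answers),
        default=0.0,
    )
    if best >= hi:
        return "correct"
    if best >= lo:
        return "partly correct"
    return "wrong"
```

A real system would replace the bag-of-words overlap with semantic similarity so that paraphrased answers are not penalized, which is presumably why the study’s algorithms also consulted external databases.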


2020 ◽  
Vol 23 (1) ◽  
pp. 21-26 ◽  
Author(s):  
Nemanja Vaci ◽  
Qiang Liu ◽  
Andrey Kormilitzin ◽  
Franco De Crescenzo ◽  
Ayse Kurtulmus ◽  
...  

Background: Utilisation of routinely collected electronic health records from secondary care offers unprecedented possibilities for medical science research but can also present difficulties. One key issue is that medical information is presented as free-form text and therefore requires a time commitment from clinicians to manually extract salient information. Natural language processing (NLP) methods can be used to extract clinically relevant information automatically.

Objective: Our aim is to use NLP to capture real-world data on individuals with depression from Clinical Record Interactive Search (CRIS) clinical text, fostering the use of electronic healthcare data in mental health research.

Methods: We used a combination of methods to extract salient information from electronic health records. First, clinical experts defined the information of interest and built the training and testing corpora for statistical models. Second, we built and fine-tuned the statistical models using active learning procedures.

Findings: The results show a high degree of accuracy in the extraction of drug-related information but a much lower degree of accuracy for auxiliary variables. In combination with state-of-the-art active learning paradigms, the performance of the models increases considerably.

Conclusions: This study illustrates the feasibility of using NLP models and proposes a research pipeline for accurately extracting information from electronic health records.

Clinical implications: Real-world, individual patient data are an invaluable source of information that can be used to better personalise treatment.
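The active learning procedure mentioned in the Methods can be sketched as uncertainty sampling: at each round, the unlabelled records the current model is least sure about are the ones sent to clinical annotators. The function below is a hypothetical illustration (`predict_proba` stands in for any binary classifier’s probability estimate), not the CRIS pipeline itself:

```python
def select_for_annotation(pool, predict_proba, batch_size=5):
    """Uncertainty sampling: return the unlabelled examples whose
    predicted probability of the positive class is closest to 0.5,
    i.e. where the model is least certain. Annotating these examples
    tends to improve the model faster than random sampling."""
    scored = sorted(pool, key=lambda x: abs(predict_proba(x) - 0.5))
    return scored[:batch_size]
```

After each batch is annotated, the model is retrained and the selection repeated, which matches the iterative build-and-fine-tune loop the abstract describes.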


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Julián Ramírez Sánchez ◽  
Alejandra Campo-Archbold ◽  
Andrés Zapata Rozo ◽  
Daniel Díaz-López ◽  
Javier Pastor-Galindo ◽  
...  

Among the myriad applications of natural language processing (NLP), assisting law enforcement agencies (LEAs) in detecting and preventing cybercrimes is one of the most recent and promising. The promotion of violence or hate by digital means is considered a cybercrime because it leverages cyberspace to support illegal activities in the real world. This paper proposes a solution that uses neural network (NN) based NLP to monitor suspicious activities in social networks, making it possible to identify and prevent related cybercrimes. An LEA can find similar posts grouped into clusters, determine their level of polarity, and identify a subset of user accounts that promote violent activities, to be reviewed extensively as part of an effort to prevent crimes and, specifically, hostile social manipulation (HSM). Several experiments were conducted to demonstrate the feasibility of the proposal.
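The grouping and polarity steps can be sketched as follows. The greedy cosine clustering and the tiny violence lexicon are illustrative assumptions of mine; the paper’s solution uses NN-based NLP rather than this bag-of-words simplification:

```python
import re
from collections import Counter

# Illustrative lexicon only; a real system would use a trained model.
VIOLENT_LEXICON = {"attack", "destroy", "kill", "hate"}

def vectorize(post):
    """Bag-of-words term frequencies for one post."""
    return Counter(re.findall(r"[a-z]+", post.lower()))

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a)
    norm = lambda v: sum(x * x for x in v.values()) ** 0.5
    return dot / (norm(a) * norm(b)) if a and b else 0.0

def cluster_posts(posts, threshold=0.5):
    """Greedy single-pass clustering: a post joins the first cluster
    whose seed post is similar enough, otherwise starts a new one."""
    clusters = []
    for post in posts:
        vec = vectorize(post)
        for cl in clusters:
            if cosine(vec, cl["seed"]) >= threshold:
                cl["posts"].append(post)
                break
        else:
            clusters.append({"seed": vec, "posts": [post]})
    return clusters

def polarity(post):
    """Crude polarity score: fraction of tokens from the lexicon."""
    toks = re.findall(r"[a-z]+", post.lower())
    return sum(t in VIOLENT_LEXICON for t in toks) / len(toks) if toks else 0.0
```

An analyst could then rank clusters by their mean polarity and flag the accounts that dominate the highest-scoring clusters for further review.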

