Patient-oriented natural language processing: Defining a new paradigm for research and development to facilitate adoption and utilization by medical experts (Preprint)

The capabilities of natural language processing (NLP) methods have expanded significantly in recent years, driven particularly by advances in data science and machine learning. However, the utilization of NLP for patient-oriented clinical research and care (POCRC) is still limited. A primary reason is perhaps that clinical NLP methods are developed, optimized, and evaluated on narrow-focus datasets and tasks (e.g., for the detection of specific symptoms from free text). Such research and development (R&D) approaches may be described as problem-oriented, and the developed systems only perform well for a given specialized task. As standalone systems, they are also typically not suitable for addressing the needs of POCRC, leaving a gap between the capabilities of clinical NLP methods and the needs of patient-facing medical experts. We believe that to make clinical NLP systems more valuable, future R&D efforts need to follow a new research paradigm, one that explicitly incorporates characteristics that are crucial for POCRC. We present our viewpoint about four interrelated characteristics that are relevant for NLP systems suitable for POCRC, three representing NLP system properties and one associated with the R&D process: (i) generalizability (capability to characterize patients, not clinical problems), (ii) interpretability (ability to explain system decisions), (iii) customizability (flexibility for adaptation to distinct settings, problems, and cohorts), and (iv) cross-evaluation (validated performance on heterogeneous datasets). Using the NLP task of clinical concept detection as an example, we detail these characteristics and discuss how they may lead to increased uptake of NLP systems for POCRC.

The Rise of Big Data Science: A Survey of Techniques, Methods and Approaches in the Field of Natural Language Processing and Network Theory

Big Data and Cognitive Computing ◽

10.3390/bdcc2030022 ◽

2018 ◽

Vol 2 (3) ◽

pp. 22 ◽

Cited By ~ 3

Author(s):

Jeffrey Ray ◽

Olayinka Johnny ◽

Marcello Trovati ◽

Stelios Sotiriadis ◽

Nik Bessis

Keyword(s):

Big Data ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Network Theory ◽

Data Science ◽

Theoretical Foundation ◽

Scientific Field ◽

Research Challenges ◽

New Research

The continuous creation of data has posed new research challenges due to its complexity, diversity and volume. Consequently, Big Data has increasingly become a fully recognised scientific field. This article provides an overview of the current research efforts in Big Data science, with particular emphasis on its applications, as well as theoretical foundation.

Content Analysis of Textbooks via Natural Language Processing: Findings on Gender, Race, and Ethnicity in Texas U.S. History Textbooks

AERA Open ◽

10.1177/2332858420940312 ◽

2020 ◽

Vol 6 (3) ◽

pp. 233285842094031

Author(s):

Li Lucy ◽

Dorottya Demszky ◽

Patricia Bromley ◽

Dan Jurafsky

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Race And Ethnicity ◽

Data Science ◽

Black People ◽

History Textbooks ◽

Word Embeddings ◽

Representation Of Women ◽

New Research

Cutting-edge data science techniques can shed new light on fundamental questions in educational research. We apply techniques from natural language processing (lexicons, word embeddings, topic models) to 15 U.S. history textbooks widely used in Texas between 2015 and 2017, studying their depiction of historically marginalized groups. We find that Latinx people are rarely discussed, and the most common famous figures are nearly all White men. Lexicon-based approaches show that Black people are described as performing actions associated with low agency and power. Word embeddings reveal that women tend to be discussed in the contexts of work and the home. Topic modeling highlights the higher prominence of political topics compared with social ones. We also find that more conservative counties tend to purchase textbooks with less representation of women and Black people. Building on a rich tradition of textbook analysis, we release our computational toolkit to support new research directions.

Patient-oriented natural language processing: Defining a new paradigm for research and development to facilitate adoption and utilization by medical experts (Preprint)

JMIR Medical Informatics ◽

10.2196/18471 ◽

2020 ◽

Author(s):

Abeed Sarker ◽

Mohammed Ali Al-Garadi ◽

Yuan-Chi Yang ◽

Jinho Choi ◽

Arshed A Quyyumi ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing ◽

New Paradigm

Report on the 4th Joint Workshop on Bibliometric-Enhanced Information Retrieval and Natural Language Processing for Digital Libraries at SIGIR 2019

ACM SIGIR Forum ◽

10.1145/3458553.3458554 ◽

2019 ◽

Vol 53 (2) ◽

pp. 3-10

Author(s):

Muthu Kumar Chandrasekaran ◽

Philipp Mayr

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing ◽

Digital Libraries ◽

State Of The Art ◽

Shared Task ◽

Processing Information ◽

Joint Workshop

The 4th joint BIRNDL workshop was held at the 42nd ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2019) in Paris, France. BIRNDL 2019 intended to stimulate IR researchers and digital library professionals to elaborate on new approaches in natural language processing, information retrieval, scientometrics, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. The workshop incorporated different paper sessions and the 5th edition of the CL-SciSumm Shared Task.

Advanced Well Planning Using Natural Language Processing NLP and Data Science Models: Maximizing the Value of Data to Mitigate Costs and Risks in New Wells

10.2118/203280-ms ◽

2020 ◽

Author(s):

John Cumming ◽

Valentina Riggins ◽

Paul Hodson ◽

Barry Walker

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Science ◽

Science Models

Business Sentiment Quotient Analysis using Natural Language Processing

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.d8721.049420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 1350-1352

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Embedding Technique ◽

Computer Scientists ◽

Online Business ◽

New Research ◽

Python Programming ◽

Market Requirement

Online business has opened up several avenues for researchers and computer scientists to initiate new research models. The business activities that customers carry out produce abundant data. Analysis of this data can yield useful inferences and declarations, which may help a system improve its quality of service and understand current market requirements, business trends, the future needs of society, and so on. In this connection, the current paper proposes a feature extraction technique named Business Sentiment Quotient (BSQ). BSQ uses the word2vec [1] word embedding technique from natural language processing. A number of business-related tweets are collected from Twitter and processed to estimate BSQ using the Python programming language. BSQ may be utilized for further machine learning activities.

Research and development in natural language processing at BBN laboratories in the strategic computing program

10.3115/1077146.1077148 ◽

1986 ◽

Author(s):

Ralph Weischedel ◽

David Stallard ◽

Remko Scha ◽

Edward Walker ◽

Damaris Ayuso ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Research And Development ◽

Language Processing

Data Science and Natural Language Processing to Extract Information in Clinical Domain

10.1145/3493700.3493773 ◽

2022 ◽

Author(s):

V.G.Vinod Vydiswaran ◽

Xinyan Zhao ◽

Deahan Yu

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Science ◽

Clinical Domain ◽

Extract Information

Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text

Yearbook of Medical Informatics ◽

10.15265/iy-2017-029 ◽

2017 ◽

Vol 26 (01) ◽

pp. 214-227 ◽

Cited By ~ 29

Author(s):

G. Gonzalez-Hernandez ◽

A. Sarker ◽

K. O’Connor ◽

G. Savova

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Science ◽

Open Systems ◽

Text Processing ◽

Research Progress ◽

Research Problems ◽

Health Related

Summary. Background: Natural Language Processing (NLP) methods are increasingly being utilized to mine knowledge from unstructured health-related texts. Recent advances in noisy text processing techniques are enabling researchers and medical domain experts to go beyond the information encapsulated in published texts (e.g., clinical trials and systematic reviews) and structured questionnaires, and obtain perspectives from other unstructured sources such as Electronic Health Records (EHRs) and social media posts. Objectives: To review the recently published literature discussing the application of NLP techniques for mining health-related information from EHRs and social media posts. Methods: The literature review included research published over the last five years, based on searches of PubMed, conference proceedings, and the ACM Digital Library, as well as on relevant publications referenced in papers. We particularly focused on the techniques employed on EHRs and social media data. Results: A set of 62 studies involving EHRs and 87 studies involving social media matched our criteria and were included in this paper. We present the purposes of these studies, outline the key NLP contributions, and discuss the general trends observed in the field, the current state of research, and important outstanding problems. Conclusions: In recent years, there has been a continuing transition from lexical and rule-based systems to learning-based approaches, because of the growth of annotated data sets and advances in data science. For EHRs, publicly available annotated data is still scarce, and this acts as an obstacle to research progress. In contrast, research on social media mining has seen rapid growth, particularly because the large amount of unlabeled data available via this resource compensates for the uncertainty inherent in the data. Effective mechanisms for filtering out noise and for mapping social media expressions to standard medical concepts are crucial and latent research problems. Shared tasks and other competitive challenges have been driving factors behind the implementation of open systems, and they are likely to play an imperative role in the development of future systems.

Data Science and Natural Language Processing to Extract Information from Clinical Narratives

8th ACM IKDD CODS and 26th COMAD ◽

10.1145/3430984.3431967 ◽

2020 ◽

Author(s):

V.G.Vinod Vydiswaran ◽

Xinyan Zhao ◽

Deahan Yu

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Science ◽

Extract Information
