Trends and Features of the Applications of Natural Language Processing Techniques for Clinical Trials Text Analysis

Xieling Chen; Haoran Xie; Gary Cheng; Leonard K. M. Poon; Mingming Leng; Fu Lee Wang

doi:10.3390/app10062157

Applied Sciences ◽

10.3390/app10062157 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2157 ◽

Cited By ~ 6

Author(s):

Xieling Chen ◽

Haoran Xie ◽

Gary Cheng ◽

Leonard K. M. Poon ◽

Mingming Leng ◽

...

Keyword(s):

Clinical Trial ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Analysis ◽

Research Field ◽

Test Analysis ◽

Clinical Trial Research ◽

Scientific Outputs ◽

Automatic Text Analysis

Natural language processing (NLP) is an effective tool for generating structured information from unstructured data of the kind commonly found in clinical trial texts. This interdisciplinary research has gradually grown into a flourishing field with an accumulating body of scientific output. In this study, bibliographical data collected from the Web of Science, PubMed, and Scopus databases for 2001 to 2018 were investigated using three prominent methods: performance analysis, science mapping, and, in particular, an automatic text analysis approach named structural topic modeling. Topical trend visualization and test analysis were further employed to quantify the effects of the year of publication on topic proportions. The distributions of topics across prolific countries/regions and institutions were also visualized and compared. In addition, scientific collaborations between countries/regions, institutions, and authors were explored using social network analysis. The findings are essential for facilitating the development of NLP-enhanced clinical trial text processing, boosting scientific and technological NLP-enhanced clinical trial research, and facilitating inter-country/region and inter-institution collaborations.

Application of Word Net for Text Analysis in Different Domains

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.e9824.069520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 774-786

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Analysis ◽

The Other ◽

Web Browser ◽

Automatic Text Analysis ◽

Automatic Text

This paper examines and illustrates various problems that occur in the field of natural language processing. The solutions in the papers reviewed use Word Net in one way or another to enhance or improve the efficiency of the projects. Word Net can therefore be viewed as a combination and an augmentation of a dictionary and a thesaurus. While developers and programmers can use it via a web browser, its prime use is in automatic text analysis and AI-based applications.

Towards Accurate Deceptive Opinions Detection Based on Word Order-Preserving CNN

Mathematical Problems in Engineering ◽

10.1155/2018/2410206 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 4

Author(s):

Siyuan Zhao ◽

Zhiwei Xu ◽

Limin Liu ◽

Mengjie Guo ◽

Jing Yun

Keyword(s):

Neural Network ◽

Natural Language Processing ◽

Natural Language ◽

Convolutional Neural Network ◽

Language Processing ◽

Word Order ◽

Text Analysis ◽

Important Application ◽

Detection Mechanism ◽

Short Text

Convolutional neural networks (CNNs) have revolutionized the field of natural language processing and are considerably efficient at the semantic analysis that underlies difficult natural language processing problems in a variety of domains. Deceptive opinion detection is an important application of existing CNN models. Detection mechanisms based on CNN models have better self-adaptability and can effectively identify all kinds of deceptive opinions. Online opinions are quite short and vary in type and content. To identify deceptive opinions effectively, we need to study their characteristics comprehensively and explore novel characteristics beyond the textual semantics and emotional polarity that have been widely used in text analysis. In this paper, we optimize the convolutional neural network model by embedding word order characteristics in its convolution layer and pooling layer, which makes the network more suitable for short text classification and deceptive opinion detection. TensorFlow-based experiments demonstrate that the proposed detection mechanism achieves more accurate deceptive opinion detection results.
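
The abstract does not say how word order is embedded in the pooling layer. One widely used order-preserving choice is k-max pooling, which keeps the k strongest activations in their original sequence order instead of collapsing them to a single maximum. The sketch below illustrates that idea only; it is an assumption, not the authors' model, and the function names `conv1d` and `k_max_pool_ordered` are hypothetical.

```python
from typing import List


def conv1d(seq: List[float], kernel: List[float]) -> List[float]:
    """Valid 1-D convolution (cross-correlation) over a sequence of scalar features."""
    width = len(kernel)
    return [
        sum(seq[i + j] * kernel[j] for j in range(width))
        for i in range(len(seq) - width + 1)
    ]


def k_max_pool_ordered(features: List[float], k: int) -> List[float]:
    """Keep the k largest activations, returned in their original left-to-right order."""
    # Indices of the k largest values (ties resolved in favour of earlier positions).
    top = sorted(range(len(features)), key=lambda i: features[i], reverse=True)[:k]
    # Re-sort the selected indices by position so word order survives pooling.
    return [features[i] for i in sorted(top)]


# Toy per-word scores for a short review (made-up numbers, for illustration only).
word_scores = [3.0, 1.0, 5.0, 2.0, 4.0]
feature_map = conv1d(word_scores, [1.0, 1.0])   # bigram-style summing filter
pooled = k_max_pool_ordered(feature_map, k=2)   # two strongest bigrams, in order
```

Plain max pooling would keep only the single strongest bigram and discard where it occurred; keeping several activations in sequence order retains a coarse trace of word order, which is the property the abstract argues matters for short texts.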

Natural language processing versus rule-based text analysis: Comparing BERT score and readability indices to predict crowdfunding outcomes

Journal of Business Venturing Insights ◽

10.1016/j.jbvi.2021.e00276 ◽

2021 ◽

Vol 16 ◽

pp. e00276

Author(s):

C.S. Richard Chan ◽

Charuta Pethe ◽

Steven Skiena

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Analysis ◽

Rule Based

An approach to neural network analysis of text information in the economic assessment of companies

Economic Analysis Theory and Practice ◽

10.24891/ea.20.8.1574 ◽

2021 ◽

Vol 20 (8) ◽

pp. 1574-1594

Author(s):

Aleksandr R. NEVREDINOV

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Analysis ◽

Economic Assessment ◽

Management Decision ◽

Textual Information ◽

Financial Condition ◽

Analysis And Synthesis ◽

Management Decision Making

Subject. When evaluating enterprises, maximum accuracy and comprehensiveness of analysis are important, although the use of various indicators of an organization’s financial condition and external factors provides a sufficiently high accuracy of forecasting. Many researchers are increasingly focusing on natural language processing to analyze various text sources. This subject is extremely relevant given companies’ need to analyze their activities quickly and extensively. Objectives. The study aims at exploring the natural language processing methods and sources of textual information about companies that can be used in the analysis, and developing an approach to the analysis of textual information. Methods. The study draws on methods of analysis and synthesis, systematization, formalization, comparative analysis, and theoretical and methodological provisions contained in domestic and foreign scientific works on text analysis, including for purposes of company evaluation. Results. I offer and test an approach to using non-numeric indicators for company analysis. The paper presents a unique model, which is created on the basis of existing developments that have shown their effectiveness. I also substantiate the use of this approach to analyze a company’s condition and to include the analysis results in models for overall assessment of the state of companies. Conclusions. The findings improve scientific and practical understanding of techniques for the analysis of companies and of the ways of applying text analysis using machine learning. They can be used to support management decision-making by automating the analysis of a company and of the other companies in the market with which it interacts.

Text Classification for Clinical Trial Operations: Evaluation and Comparison of Natural Language Processing Techniques

Therapeutic Innovation & Regulatory Science ◽

10.1007/s43441-020-00236-x ◽

2020 ◽

Author(s):

Emma Richard ◽

Bhargava Reddy

Keyword(s):

Clinical Trial ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Classification ◽

Processing Techniques

Leveraging Python to Process Cross-Cultural Temperament Interviews: A Novel Platform for Text Analysis

Journal of Cross-Cultural Psychology ◽

10.1177/0022022120906478 ◽

2020 ◽

Vol 51 (2) ◽

pp. 168-181 ◽

Cited By ~ 1

Author(s):

Joshua J. Underwood ◽

Cornelia Kirchhoff ◽

Haven Warwick ◽

Maria A. Gartstein

Keyword(s):

Early Childhood ◽

Natural Language Processing ◽

Individual Differences ◽

Natural Language ◽

Language Processing ◽

Data Reduction ◽

Text Analysis ◽

Cross Cultural ◽

Two Samples ◽

Do So

During childhood, parents represent the most commonly used source of their child’s temperament information and, typically, do so by responding to questionnaires. Despite their wide-ranging applications, interviews present notorious data reduction challenges, as quantification of narratives has proven to be a labor-intensive process. However, for the purposes of this study, the labor-intensive nature may have conferred distinct advantages. The present study represents a demonstration project aimed at leveraging emerging technologies for this purpose. Specifically, we used Python natural language processing capabilities to analyze semistructured temperament interviews conducted with U.S. and German mothers of toddlers, expecting to identify differences between these two samples in the frequency of words used to describe individual differences, along with some similarities. Two different word lists were used: (a) a set of German personality words and (b) temperament-related words extracted from the Early Childhood Behavior Questionnaire (ECBQ). Analyses using the German trait words demonstrated that mothers from Germany described their toddlers as significantly more “cheerful” and “careful” compared with U.S. caregivers. According to U.S. mothers, their children were more “independent,” “emotional,” and “timid.” For the ECBQ analysis, German mothers described their children as “calm” and “careful” more often than U.S. mothers. U.S. mothers, however, referred to their children as “upset,” “happy,” and “frustrated” more frequently than German caregivers. The Python code developed herein illustrates this software as a viable research tool for cross-cultural investigations.
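
The word-list comparison described in this abstract can be sketched with the Python standard library. The code below is a minimal illustration, not the authors' published code: the function names, the tokenizer, and the per-1,000-token normalization are all assumptions, and the study's significance testing is not reproduced.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens (German letters included)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def word_list_rates(transcripts: Iterable[str], word_list: Iterable[str]) -> Dict[str, float]:
    """Occurrences of each listed word per 1,000 tokens, pooled over all transcripts."""
    targets = set(word_list)
    counts: Counter = Counter()
    total_tokens = 0
    for text in transcripts:
        tokens = tokenize(text)
        total_tokens += len(tokens)
        counts.update(token for token in tokens if token in targets)
    if total_tokens == 0:
        return {word: 0.0 for word in targets}
    return {word: 1000.0 * counts[word] / total_tokens for word in targets}


# Hypothetical interview snippets, for illustration only.
sample_a = ["calm calm happy", "happy"]
rates_a = word_list_rates(sample_a, ["calm", "happy", "timid"])
```

Normalizing by total token count makes rates comparable between samples whose interviews differ in length; comparing two samples then amounts to calling `word_list_rates` once per sample with the same word list.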

How You Say It Matters: Text Analysis of FOMC Statements Using Natural Language Processing

The Federal Reserve Bank of Kansas City Economic Review ◽

10.18651/er/v106n1dohkimyang ◽

2021 ◽

Author(s):

Taeyoung Doh ◽

Sungil Kim ◽

Shu-Kuei X. Yang

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Analysis

Latent Dirichlet Allocation in predicting clinical trial terminations

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0973-y ◽

2019 ◽

Vol 19 (1) ◽

Author(s):

Simon Geletta ◽

Lendie Follett ◽

Marcia Laugerman

Keyword(s):

Clinical Trial ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Latent Dirichlet Allocation ◽

Structured Data ◽

Unstructured Data ◽

Future Research ◽

Funding Agencies ◽

Dirichlet Allocation

Background: This study used natural language processing (NLP) and machine learning (ML) techniques to identify reliable patterns within research narrative documents that distinguish studies that complete successfully from those that terminate. Recent research findings have reported that at least 10% of all studies funded by major research funding agencies terminate without yielding useful results. Since it is well known that scientific studies that receive funding from major funding agencies are carefully planned and rigorously vetted through the peer-review process, it was somewhat daunting to us that study terminations are this prevalent. Moreover, our review of the literature about study terminations suggested that the reasons for study terminations are not well understood. We therefore aimed to address that knowledge gap by seeking to identify the factors that contribute to study failures. Methods: We used data from the ClinicalTrials.gov repository, from which we extracted both structured data (study characteristics) and unstructured data (the narrative description of the studies). We applied natural language processing techniques to the unstructured data to quantify the risk of termination by identifying distinctive topics that are more frequently associated with trials that are terminated and trials that are completed. We used the Latent Dirichlet Allocation (LDA) technique to derive 25 “topics” with corresponding sets of probabilities, which we then used to predict study termination by utilizing random forest modeling. We fit two distinct models: one using only structured data as predictors and another using both structured data and the 25 text topics derived from the unstructured data. Results: In this paper, we demonstrate the interpretive and predictive value of LDA as it relates to predicting clinical trial failure. The results also demonstrate that the combined modeling approach yields robust predictive probabilities in terms of both sensitivity and specificity, relative to a model that utilizes the structured data alone. Conclusions: Our study demonstrated that topic modeling using LDA significantly raises the utility of unstructured data in better predicting the completion vs. termination of studies. This study sets the direction for future research to evaluate the viability of the designs of health studies.
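
The two measures used to compare the models, sensitivity and specificity, can be computed directly from predicted and observed outcomes. The sketch below assumes a label coding of 1 = terminated and 0 = completed; that coding, the function name, and the example labels are illustrative assumptions, not values from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels, with 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # terminated, predicted terminated
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # terminated, predicted completed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # completed, predicted completed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # completed, predicted terminated
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Made-up labels for eight trials (1 = terminated, 0 = completed).
observed = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(observed, predicted)
```

Reporting both values matters when terminated trials are the minority class: a model that predicts "completed" for every trial scores high on specificity but has zero sensitivity.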

Deep learning approach to text analysis for human emotion detection from big data

Journal of Intelligent Systems ◽

10.1515/jisys-2022-0001 ◽

2022 ◽

Vol 31 (1) ◽

pp. 113-126

Author(s):

Jia Guo

Keyword(s):

Big Data ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Text Analysis ◽

Question Answering ◽

Word Embeddings ◽

Emotion Detection ◽

Human Emotion

Emotion recognition has arisen as an essential field of study that can expose a variety of valuable inputs. Emotion can be articulated in several observable ways, such as speech, facial expressions, written text, and gestures. Emotion recognition in a text document is fundamentally a content-based classification issue, drawing on notions from the natural language processing (NLP) and deep learning fields. Hence, in this study, deep learning assisted semantic text analysis (DLSTA) has been proposed for human emotion detection using big data. Emotion detection from textual sources can be done utilizing notions of NLP. Word embeddings are extensively utilized for several NLP tasks, like machine translation, sentiment analysis, and question answering. NLP techniques improve the performance of learning-based methods by incorporating the semantic and syntactic features of the text. The numerical outcomes demonstrate that the suggested method achieves a markedly superior human emotion detection rate of 97.22% and a classification accuracy rate of 98.02% compared with different state-of-the-art methods, and that it can be enhanced by other emotional word embeddings.

A natural language processing tool for automatic identification of new disease and disease progression: Parsing text in multi-institutional radiology reports to facilitate clinical trial eligibility screening.

Journal of Clinical Oncology ◽

10.1200/jco.2021.39.15_suppl.1555 ◽

2021 ◽

Vol 39 (15_suppl) ◽

pp. 1555-1555

Author(s):

Eric J. Clayton ◽

Imon Banerjee ◽

Patrick J. Ward ◽

Maggie D Howell ◽

Beth Lohmueller ◽

...

Keyword(s):

Clinical Trial ◽

Clinical Trials ◽

Natural Language Processing ◽

Natural Language ◽

Disease Progression ◽

Language Processing ◽

Free Text ◽

New Disease ◽

Radiology Reports ◽

Precision And Accuracy

Background: Screening every patient for clinical trials is time-consuming, costly and inefficient. Developing an automated method for identifying patients who have potential disease progression, at the point where the practice first receives their radiology reports, but prior to the patient’s office visit, would greatly increase the efficiency of clinical trial operations and likely result in more patients being offered trial opportunities. Methods: Using Natural Language Processing (NLP) methodology, we developed a text parsing algorithm to automatically extract information about potential new disease or disease progression from multi-institutional, free-text radiology reports (CT, PET, bone scan, MRI or x-ray). We combined semantic dictionary mapping and machine learning techniques to normalize the linguistic and formatting variations in the text, training the XGBoost model particularly to achieve a high precision and accuracy to satisfy clinical trial screening requirements. In order to be comprehensive, we enhanced the model vocabulary using a multi-institutional dataset which includes reports from two academic institutions. Results: A dataset of 732 de-identified radiology reports was curated (two MDs agreed on potential new disease/disease progression vs stable) and the model was repeatedly re-trained for each fold, where the folds were randomly selected. The final model achieved consistent precision (>0.87) and accuracy (>0.87). See the table for a summary of the results, by radiology report type. We are continuing work on the model to validate accuracy and precision using a new and unique set of reports. Conclusions: NLP systems can be used to identify patients who potentially have suffered new disease or disease progression and reduce the human effort in screening for clinical trials. Efforts are ongoing to integrate the NLP process into existing EHR reporting. New imaging reports sent via interface to the EHR will be extracted daily using a database query and will be provided via secure electronic transport to the NLP system. Patients with higher likelihood of disease progression will be automatically identified, and their reports routed to the clinical trials office for clinical trial screening parallel to physician EHR mailbox reporting. The over-arching goal of the project is to increase clinical trial enrollment. Table caption: 5-fold cross-validation performance of the NLP model in terms of accuracy, precision and recall averaged across all the folds. [Table: see text]
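
The evaluation protocol named in the abstract (randomly selected folds, with accuracy, precision and recall averaged across folds) can be sketched as follows. This is a generic illustration using only the standard library: the helper names and the fixed random seed are assumptions, and the XGBoost training step that would run on each fold is deliberately left out.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence


def kfold_indices(n_items: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n_items-1 and deal them into k folds of near-equal size."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fold_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for one held-out fold (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def average_over_folds(per_fold: List[Dict[str, float]]) -> Dict[str, float]:
    """Average each metric across folds, as in a cross-validation summary table."""
    return {name: mean(m[name] for m in per_fold) for name in per_fold[0]}


# 732 reports split into 5 folds, matching the dataset size given in the abstract.
folds = kfold_indices(732, 5)
```

In a full run, each fold would serve once as the held-out set while a model is trained on the remaining four; `fold_metrics` would be applied to that fold's predictions and `average_over_folds` to the five resulting dictionaries.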
