Identification of Adverse Drug Event–Related Japanese Articles: Natural Language Processing Analysis (Preprint)

Mapping Intimacies ◽

10.2196/preprints.22661 ◽

2020 ◽

Author(s):

Shogo Ujiie ◽

Shuntaro Yada ◽

Shoko Wakamiya ◽

Eiji Aramaki

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Automated System ◽

Pharmaceutical Companies ◽

Manual Labor ◽

Sentence Level ◽

Medical Articles ◽

Document Level

BACKGROUND: Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as precision or recall based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false negatives or substantial amounts of noise, a problem that is difficult to address using limited manual labor. OBJECTIVE: Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies. METHODS: Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system. RESULTS: Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. These were averages of fivefold cross-validations. CONCLUSIONS: A simple automated system may alleviate the manual labor involved in screening drug safety–related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.

Identification of Adverse Drug Event–Related Japanese Articles: Natural Language Processing Analysis

JMIR Medical Informatics ◽

10.2196/22661 ◽

2020 ◽

Vol 8 (11) ◽

pp. e22661

Author(s):

Shogo Ujiie ◽

Shuntaro Yada ◽

Shoko Wakamiya ◽

Eiji Aramaki

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Automated System ◽

Pharmaceutical Companies ◽

Manual Labor ◽

Sentence Level ◽

Medical Articles ◽

Document Level

Background: Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as precision or recall based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false negatives or substantial amounts of noise, a problem that is difficult to address using limited manual labor. Objective: Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies. Methods: Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system. Results: Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. These were averages of fivefold cross-validations. Conclusions: A simple automated system may alleviate the manual labor involved in screening drug safety–related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.
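
The scores in this abstract are F1 values averaged over five cross-validation folds. The sketch below illustrates only that arithmetic: F1 as the harmonic mean of precision and recall, then a plain mean over folds. The function names and the per-fold counts are hypothetical and are not taken from the paper, which may have averaged differently.

```python
from statistics import mean


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1, the harmonic mean of precision and recall."""
    if tp == 0:
        # No true positives: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cross_validated_f1(fold_counts) -> float:
    """Average the per-fold F1 scores (simple mean over folds)."""
    return mean(f1_score(tp, fp, fn) for tp, fp, fn in fold_counts)


# Hypothetical (true positive, false positive, false negative) counts
# for five folds; illustrative values only, not the study's data.
folds = [(45, 5, 5), (40, 6, 4), (42, 4, 6), (44, 5, 3), (41, 7, 5)]
print(round(cross_validated_f1(folds), 3))
```

Averaging per-fold scores is one common convention; pooling the counts across folds before computing a single F1 is another, and the two can give slightly different numbers.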

Sentiment and Opinion Analysis

10.12681/eadd/44752 ◽

2018 ◽

Author(s):

Αγγελική-Σπυριδούλα Βλαχοστέργιου

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Context Aware ◽

Sentence Level ◽

Document Level

In recent years there has been a growing number of efforts toward the automatic recognition and classification of human emotion using physiological signals, signals from the face and voice, and personal interpretations drawn from texts in large-scale social data. Several research areas could benefit from such systems: interactive tutoring systems that let educators know about students' stress; accident prevention (e.g., detecting driver fatigue); military team tasks characterized by long periods of stress and pressure; and health applications for the early diagnosis of neurodegenerative diseases (e.g., Parkinson's disease), where symptoms manifest many years after neurodegeneration begins. However, despite the research efforts to date, the long-term goal of creating a robust recognition framework for this research area, based on its analysis and interpretation, has not been achieved. There is no doubt that affect production is influenced by the context in which it takes place at a given moment, such as the task the user is performing, the people interacting with the user, and their identity and expressiveness. Any complementary form of contextual information about the research area under study therefore helps us answer the question of what is most likely to happen, thereby steering the classifier toward the most probable or relevant categories. Without context, even humans may misinterpret the observed expressions.
Thus, by addressing these challenges from the perspective of context-aware affect analysis, that is, by better studying contextual information, interpreting it in specific application domains, representing it, and modeling it, we can better approach real-time emotion recognition. Correspondingly, in the field of personal interpretations from text (Sentiment Analysis) and, more generally, in Natural Language Processing (NLP), the contribution of context lies in better recognizing, interpreting, and processing the opinions and sentiments in texts, which are examined at the document level, the sentence level, and the aspect level, respectively. In this case, the semantics and the cognitive and affective information of individuals' subjective responses are taken into account. In particular, our contribution in this area lies in training robust feature representations from unlabeled data using neural networks, specifically Generative Adversarial Networks (GANs), which have shown impressive results in computer vision. The novelty of this method lies in how the model is implemented, the choice of hyperparameters, the use of unsupervised learning, and the experimental validation of the proposed model on text corpora drawn from sources that differ in genre and length.

Natural Language Processing of Social Media as Screening for Suicide Risk

Biomedical Informatics Insights ◽

10.1177/1178222618792860 ◽

2018 ◽

Vol 10 ◽

pp. 117822261879286 ◽

Cited By ~ 51

Author(s):

Glen Coppersmith ◽

Ryan Leary ◽

Patrick Crutchley ◽

Alex Fine

Keyword(s):

At Risk ◽

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Suicide Risk ◽

Automated System ◽

World Health ◽

Primary Care Doctor ◽

Trade Off

Suicide is among the 10 most common causes of death, as assessed by the World Health Organization. For every death by suicide, an estimated 138 people’s lives are meaningfully affected, and almost any other statistic around suicide deaths is equally alarming. The pervasiveness of social media—and the near-ubiquity of mobile devices used to access social media networks—offers new types of data for understanding the behavior of those who (attempt to) take their own lives and suggests new possibilities for preventive intervention. We demonstrate the feasibility of using social media data to detect those at risk for suicide. Specifically, we use natural language processing and machine learning (specifically deep learning) techniques to detect quantifiable signals around suicide attempts, and describe designs for an automated system for estimating suicide risk, usable by those without specialized mental health training (eg, a primary care doctor). We also discuss the ethical use of such technology and examine privacy implications. Currently, this technology is only used for intervention for individuals who have “opted in” for the analysis and intervention, but the technology enables scalable screening for suicide risk, potentially identifying many people who are at risk preventively and prior to any engagement with a health care system. This raises a significant cultural question about the trade-off between privacy and prevention—we have potentially life-saving technology that is currently reaching only a fraction of the possible people at risk because of respect for their privacy. Is the current trade-off between privacy and prevention the right one?

A Composite Natural Language Processing and Information Retrieval Approach to Question Answering Using a Structured Knowledge Base

International Journal of Semantic Computing ◽

10.1142/s1793351x17400141 ◽

2017 ◽

Vol 11 (03) ◽

pp. 345-371

Author(s):

Avani Chandurkar ◽

Ajay Bansal

Keyword(s):

Information Retrieval ◽

Natural Language Processing ◽

Natural Language ◽

Knowledge Base ◽

Language Processing ◽

Question Answering ◽

Automated System ◽

Free Form ◽

Question Answering System ◽

Novel Approach

Since the inception of the World Wide Web, the amount of data on the Internet has become tremendous, which makes navigating it quite difficult for the user. As users struggle to navigate this wealth of information, the need for an automated system that can extract the required information becomes urgent. This paper presents a Question Answering system to ease the process of information retrieval. Question Answering systems have been around for quite some time and are a sub-field of information retrieval and natural language processing. The task of any Question Answering system is to seek an answer to a free-form factual question. The difficulty of pinpointing and verifying the precise answer makes question answering more challenging than the simple information retrieval done by search engines. The research objective of this paper is to develop a novel approach to Question Answering based on a composition of conventional approaches from Information Retrieval (IR) and Natural Language Processing (NLP). The focus is on using a structured and annotated knowledge base instead of an unstructured one. The knowledge base used here is DBpedia, and the final system is evaluated on the Text REtrieval Conference (TREC) 2004 questions dataset.

Sentence embeddings in NLI with iterative refinement encoders

Natural Language Engineering ◽

10.1017/s1351324919000202 ◽

2019 ◽

Vol 25 (4) ◽

pp. 467-482 ◽

Cited By ~ 3

Author(s):

Aarne Talman ◽

Anssi Yli-Jyrä ◽

Jörg Tiedemann

Keyword(s):

Neural Networks ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Recurrent Neural Networks ◽

State Of The Art ◽

Iterative Refinement ◽

Learning Tasks ◽

Sentence Level ◽

Refinement Strategy

Sentence-level representations are necessary for various natural language processing tasks. Recurrent neural networks have proven to be very effective in learning distributed representations and can be trained efficiently on natural language inference tasks. We build on top of one such model and propose a hierarchy of bidirectional LSTM and max pooling layers that implements an iterative refinement strategy and yields state-of-the-art results on the SciTail dataset as well as strong results for Stanford Natural Language Inference and Multi-Genre Natural Language Inference. We can show that the sentence embeddings learned in this way can be utilized in a wide variety of transfer learning tasks, outperforming InferSent on 7 out of 10 and SkipThought on 8 out of 9 SentEval sentence embedding evaluation tasks. Furthermore, our model beats the InferSent model in 8 out of 10 recently published SentEval probing tasks designed to evaluate sentence embeddings’ ability to capture some of the important linguistic properties of sentences.
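
The abstract mentions max pooling layers applied on top of bidirectional LSTM outputs. As a minimal sketch of that one operation only, the snippet below takes the element-wise maximum over time steps to turn a variable-length sequence of hidden states into a fixed-size sentence vector. The function name and the toy hidden states are hypothetical; the LSTM itself and the paper's iterative refinement hierarchy are not reproduced here.

```python
def max_pool_over_time(hidden_states):
    """Element-wise max across time steps.

    hidden_states: list of equal-length vectors, one per token.
    Returns a single vector with the same dimensionality.
    """
    # zip(*...) groups the values of each dimension across all time steps.
    return [max(dimension) for dimension in zip(*hidden_states)]


# Hypothetical hidden states for a three-token sentence, two dimensions each.
states = [[1.0, 5.0], [3.0, 2.0], [0.0, 4.0]]
sentence_vector = max_pool_over_time(states)
print(sentence_vector)
```

Because the maximum is taken per dimension, the resulting vector has the same size regardless of sentence length.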

An Analysis of the Applications of Natural Language Processing in Various Sectors

10.3233/apc210109 ◽

2021 ◽

Author(s):

Priya B ◽

Nandhini J.M ◽

Gnanasekaran T

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Health Sector ◽

Automated System ◽

Human Language ◽

Agriculture Sector ◽

Prediction Of Diabetes

Natural language processing (NLP), a subfield of computer science closely tied to artificial intelligence, enables computers to understand and process human language. As a part of artificial intelligence, NLP lets computers interpret human language in order to extract information or insights and create meaningful responses. It involves creating algorithms that transform text into word labels. With emerging advances in machine learning and deep learning, NLP can contribute substantially to the health sector, education, agriculture, and other areas. This paper summarizes various aspects of NLP along with case studies associated with the health sector (a voice-automated system and prediction of diabetes mellitus) and the agriculture sector (a crop detection technique).

Completely Automated Captcha Solver

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.36710 ◽

2021 ◽

Vol 9 (VII) ◽

pp. 1728-1732

Author(s):

Prof. P. Y. Pawar

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Success Rate ◽

Language Processing ◽

Turing Test ◽

Automated System ◽

High Success Rate ◽

First Line

This project primarily aimed to create an automated system for solving CAPTCHAs. CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) are the Internet’s first line of defence against automated account creation and service abuse. This paper presents unCaptcha, an automated system that can solve the most difficult auditory CAPTCHA challenges with a high success rate using deep learning and natural language processing. There are four types of CAPTCHA: audio, text-based, image, and maths-solver.

Automated document-level classification of surveillance and diagnostic colonoscopy for Inflammatory Bowel Disease: An application of natural language processing

Inflammatory Bowel Diseases ◽

10.1097/00054725-201112002-00116 ◽

2011 ◽

Vol 17 ◽

pp. S37-S38

Author(s):

Jason Hou ◽

Mimi Chang ◽

Thien Nguyen ◽

Jennifer Kramer ◽

Peter Richardson ◽

...

Keyword(s):

Inflammatory Bowel Disease ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Bowel Disease ◽

Diagnostic Colonoscopy ◽

Inflammatory Bowel ◽

Document Level

Prediction of Stroke Outcome Using Natural Language Processing-Based Machine Learning of Radiology Report of Brain MRI

Journal of Personalized Medicine ◽

10.3390/jpm10040286 ◽

2020 ◽

Vol 10 (4) ◽

pp. 286

Author(s):

Tak Sung Heo ◽

Yu Seop Kim ◽

Jeong Myeong Choi ◽

Yeong Seok Jeong ◽

Soo Young Seo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Brain Mri ◽

Free Text ◽

Word Level ◽

Poor Outcomes ◽

Document Level

Brain magnetic resonance imaging (MRI) is useful for predicting the outcome of patients with acute ischemic stroke (AIS). Although deep learning (DL) using brain MRI with certain image biomarkers has shown satisfactory results in predicting poor outcomes, no study has assessed the usefulness of natural language processing (NLP)-based machine learning (ML) algorithms using brain MRI free-text reports of AIS patients. Therefore, we aimed to assess whether NLP-based ML algorithms using brain MRI text reports could predict poor outcomes in AIS patients. This study included only English text reports of brain MRIs examined during admission of AIS patients. Poor outcome was defined as a modified Rankin Scale score of 3–6, and the data were captured by trained nurses and physicians. We included only the text report of the first MRI scan during the admission. The text dataset was randomly divided into training and test datasets at a 7:3 ratio. Text was vectorized at the word, sentence, and document levels. In the word-level approach, which did not consider the sequence of words, the “bag-of-words” model was used to reflect the number of repetitions of each text token. The “sent2vec” method was used in the sentence-level approach, which considered the sequence of words, and word embedding was used in the document-level approach. In addition to conventional ML algorithms, DL algorithms such as the convolutional neural network (CNN), long short-term memory, and multilayer perceptron were used to predict poor outcomes using 5-fold cross-validation and grid search techniques. The performance of each ML classifier was compared using the area under the receiver operating characteristic (AUROC) curve. Among 1840 subjects with AIS, 645 patients (35.1%) had a poor outcome 3 months after the stroke onset. Random forest was the best classifier (AUROC of 0.782) using a word-level approach. Overall, the document-level approach exhibited better performance than did the word- or sentence-level approaches. Among all the ML classifiers, the multi-CNN algorithm demonstrated the best classification performance (0.805), followed by the CNN (0.799) algorithm. When predicting future clinical outcomes using NLP-based ML of radiology free-text reports of brain MRI, DL algorithms showed superior performance over the other ML algorithms. In particular, the prediction of poor outcomes in document-level NLP DL was improved more by multi-CNN and CNN than by recurrent neural network-based algorithms. NLP-based DL algorithms can be used as an important digital marker for unstructured electronic health record data DL prediction.
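
The word-level approach described above uses a bag-of-words model, which counts how often each token occurs and ignores word order. A minimal sketch of that vectorization follows, assuming simple lowercase whitespace tokenization; the function name, the tokenization, and the example report snippets are hypothetical and are not the study's pipeline or data.

```python
from collections import Counter


def bag_of_words(text, vocabulary):
    """Return token counts for `text`, ordered by `vocabulary`.

    Word order is discarded; only the number of repetitions is kept.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


# Hypothetical report snippets, for illustration only.
reports = [
    "acute infarct in left MCA territory",
    "no acute infarct no hemorrhage",
]

# Build a shared vocabulary so every report maps to a vector of equal length.
vocabulary = sorted({token for report in reports for token in report.lower().split()})
vectors = [bag_of_words(report, vocabulary) for report in reports]

for report, vector in zip(reports, vectors):
    print(vector, report)
```

Such count vectors could then be passed to a conventional classifier; reordering the words of a report leaves its vector unchanged, which is the limitation the sentence- and document-level approaches try to address.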

Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders

npj Schizophrenia ◽

10.1038/s41537-021-00154-3 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Sunny X. Tang ◽

Reno Kriz ◽

Sunghye Cho ◽

Suh Jung Park ◽

Jenna Harowitz ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Small Sample ◽

First Person ◽

Schizophrenia Spectrum Disorders ◽

Speech Disturbance ◽

Schizophrenia Spectrum ◽

Sentence Level ◽

Spectrum Disorders

Computerized natural language processing (NLP) allows for objective and sensitive detection of speech disturbance, a hallmark of schizophrenia spectrum disorders (SSD). We explored several methods for characterizing speech changes in SSD (n = 20) compared to healthy control (HC) participants (n = 11) and approached linguistic phenotyping on three levels: individual words, parts-of-speech (POS), and sentence-level coherence. NLP features were compared with a clinical gold standard, the Scale for the Assessment of Thought, Language and Communication (TLC). We utilized Bidirectional Encoder Representations from Transformers (BERT), a state-of-the-art embedding algorithm incorporating bidirectional context. Through the POS approach, we found that SSD used more pronouns but fewer adverbs, adjectives, and determiners (e.g., “the,” “a”). Analysis of individual word usage was notable for more frequent use of first-person singular pronouns among individuals with SSD and first-person plural pronouns among HC. There was a striking increase in incomplete words among SSD. Sentence-level analysis using BERT reflected increased tangentiality among SSD with greater sentence embedding distances. The SSD sample had low speech disturbance on average and there was no difference in group means for TLC scores. However, NLP measures of language disturbance appear to be sensitive to these subclinical differences and showed greater ability to discriminate between HC and SSD than a model based on clinical ratings alone. These intriguing exploratory results from a small sample prompt further inquiry into NLP methods for characterizing language disturbance in SSD and suggest that NLP measures may yield clinically relevant and informative biomarkers.
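
The sentence-level analysis relates tangentiality to distances between sentence embeddings. The sketch below shows one way such a measure could be computed, assuming cosine distance between consecutive sentence vectors; the abstract does not state the exact metric, so that choice, the function names, and the toy vectors (stand-ins for BERT embeddings) are all assumptions.

```python
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_consecutive_distance(embeddings):
    """Average distance between each sentence vector and the next one."""
    pairs = list(zip(embeddings, embeddings[1:]))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy two-dimensional "sentence embeddings"; real ones would come from a model.
speech_sample = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
print(mean_consecutive_distance(speech_sample))
```

Under this assumption, a larger average distance would indicate that successive sentences drift further apart in embedding space.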
