Text Messaging-Based Medical Diagnosis Using Natural Language Processing and Fuzzy Logic

2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Nicholas A. I. Omoregbe ◽  
Israel O. Ndaman ◽  
Sanjay Misra ◽  
Olusola O. Abayomi-Alli ◽  
Robertas Damaševičius

The use of natural language processing (NLP) methods and their application to developing conversational systems for health diagnosis increases patients’ access to medical knowledge. In this study, a chatbot service was developed for the Covenant University Doctor (CUDoctor) telehealth system based on fuzzy logic rules and fuzzy inference. The service focuses on assessing the symptoms of tropical diseases in Nigeria. The Telegram Bot Application Programming Interface (API) was used to connect the chatbot to the system, while the Twilio API was used for interconnectivity between the system and a short messaging service (SMS) subscriber. The service uses a knowledge base consisting of known facts about diseases and symptoms acquired from medical ontologies. A fuzzy support vector machine (SVM) is used to predict the disease from the symptoms entered. User inputs are recognized by NLP and forwarded to CUDoctor for decision support. Finally, a notification message is sent to the user indicating the end of the diagnosis process. The result is a medical diagnosis system that provides a personalized diagnosis based on users’ self-reported symptoms. The usability of the developed system was evaluated using the System Usability Scale (SUS), yielding a mean SUS score of 80.4, which indicates an overall positive evaluation.
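The abstract describes fuzzy rules and fuzzy inference over user-reported symptoms. The following is a minimal sketch of that idea only; the symptom names, membership breakpoints, and rules are hypothetical illustrations, not the CUDoctor rule base or its fuzzy SVM.

```python
# Minimal sketch of fuzzy symptom scoring. Symptom names, membership breakpoints,
# and the two example rules are hypothetical, not the CUDoctor knowledge base.

def triangular(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def severity_memberships(level):
    """Map a 0-10 self-reported symptom level to fuzzy severity sets."""
    return {
        "mild": triangular(level, 0, 2, 5),
        "moderate": triangular(level, 3, 5, 8),
        "severe": triangular(level, 6, 10, 10.01),  # right shoulder
    }

def malaria_score(symptoms):
    """Aggregate two illustrative rules with Mamdani-style AND = min, OR = max."""
    fever = severity_memberships(symptoms.get("fever", 0))
    headache = severity_memberships(symptoms.get("headache", 0))
    rule1 = min(fever["severe"], headache["moderate"])
    rule2 = min(fever["moderate"], headache["severe"])
    return max(rule1, rule2)

if __name__ == "__main__":
    print(malaria_score({"fever": 8, "headache": 5}))  # -> 0.5
```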

2019 ◽  
Author(s):  
Auss Abbood ◽  
Alexander Ullrich ◽  
Rüdiger Busche ◽  
Stéphane Ghozzi

Abstract According to the World Health Organization (WHO), around 60% of all outbreaks are detected using informal sources. In many public health institutes, including the WHO and the Robert Koch Institute (RKI), dedicated groups of epidemiologists sift through numerous articles and newsletters to detect relevant events. This media screening is one important part of event-based surveillance (EBS). Reading the articles, discussing their relevance, and putting key information into a database is a time-consuming process. To support EBS, but also to gain insights into what makes an article and the event it describes relevant, we developed a natural-language-processing framework for automated information extraction and relevance scoring. First, we scraped relevant sources for EBS as done at RKI (WHO Disease Outbreak News and ProMED) and automatically extracted each article’s key data: disease, country, date, and confirmed-case count. For this, we performed named entity recognition in two steps: EpiTator, an open-source epidemiological annotation tool, suggested many different candidates for each field, and a naive Bayes classifier, trained using RKI’s EBS database as labels, selected the single most likely one. Then, for relevance scoring, we defined two classes to which any article might belong: the article is relevant if it is in the EBS database and irrelevant otherwise. We compared the performance of different classifiers, using document and word embeddings. Two of the tested algorithms stood out: the multilayer perceptron performed best overall, with a precision of 0.19, recall of 0.50, specificity of 0.89, F1 of 0.28, and the highest tested index balanced accuracy of 0.46. The support-vector machine, on the other hand, had the highest recall (0.88), which can be of greater interest to epidemiologists. Finally, we integrated these functionalities into a web application called EventEpi, where relevant sources are automatically analyzed and put into a database. The user can also provide any URL or text, which will be analyzed in the same way and added to the database. Each of these steps could be improved, in particular with larger labeled datasets and fine-tuning of the learning algorithms. The overall framework, however, already works well and can be used in production, promising improvements in EBS. The source code is publicly available at https://github.com/aauss/EventEpi.
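A rough sketch of the relevance-scoring step follows: classify articles as relevant or irrelevant and compare a multilayer perceptron with a support-vector machine. The paper uses document and word embeddings; here TF-IDF features stand in for them, and the articles and labels are placeholders rather than entries from the RKI EBS database.

```python
# Sketch only: TF-IDF stand-in features and placeholder data, not the EventEpi pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

articles = [
    "Cholera outbreak reported in coastal district, 40 confirmed cases.",
    "Ministry publishes annual vaccination statistics.",
    "Cluster of unexplained febrile illness under investigation.",
    "New hospital wing opened by local authorities.",
]
labels = [1, 0, 1, 0]  # 1 = would appear in the EBS database (relevant), 0 = not

X = TfidfVectorizer().fit_transform(articles)
X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.5,
                                          random_state=0, stratify=labels)

for clf in (MLPClassifier(max_iter=500, random_state=0), LinearSVC()):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__)
    print(classification_report(y_te, clf.predict(X_te), zero_division=0))
```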


2017 ◽  
Vol 25 (1) ◽  
pp. 81-87 ◽  
Author(s):  
Gaurav Trivedi ◽  
Phuong Pham ◽  
Wendy W Chapman ◽  
Rebecca Hwa ◽  
Janyce Wiebe ◽  
...  

Abstract The gap between domain experts and natural language processing expertise is a barrier to extracting understanding from clinical text. We describe a prototype tool for interactive review and revision of natural language processing models of binary concepts extracted from clinical notes. We evaluated our prototype in a user study involving 9 physicians, who used our tool to build and revise models for 2 colonoscopy quality variables. We report changes in performance relative to the quantity of feedback. Using initial training sets as small as 10 documents, expert review led to final F1 scores for the “appendiceal-orifice” variable between 0.78 and 0.91 (with improvements ranging from 13.26% to 29.90%). F1 for “biopsy” ranged between 0.88 and 0.94 (−1.52% to 11.74% improvements). The average System Usability Scale score was 70.56. Subjective feedback also suggests possible design improvements.
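One way to picture the feedback loop is an incremental classifier whose F1 is re-measured as each batch of expert-reviewed notes arrives. The sketch below illustrates that idea only; the note snippets, labels, and batch size are hypothetical, and the prototype's actual models and interface are not reproduced.

```python
# Illustrative sketch: incremental updates with expert feedback, tracked by F1.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import f1_score

vectorizer = HashingVectorizer(n_features=2**16)
model = SGDClassifier(loss="log_loss", random_state=0)

def add_feedback(batch_docs, batch_labels, test_docs, test_labels):
    """Fold one batch of reviewed notes into the model; report F1 on a held-out set."""
    model.partial_fit(vectorizer.transform(batch_docs), batch_labels, classes=[0, 1])
    preds = model.predict(vectorizer.transform(test_docs))
    return f1_score(test_labels, preds, zero_division=0)

# Placeholder feedback round: two reviewed notes, evaluated on two held-out notes.
print(add_feedback(
    ["appendiceal orifice visualized", "no mention of landmarks"], [1, 0],
    ["orifice clearly seen", "procedure aborted early"], [1, 0],
))
```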


Author(s):  
PASCUAL JULIÁN-IRANZO ◽  
FERNANDO SÁENZ-PÉREZ

Abstract This paper introduces techniques to integrate WordNet into a Fuzzy Logic Programming system. Since WordNet relates words but does not give graded information on the relation between them, we have implemented standard similarity measures and new directives allowing the proximity equations linking two words to be generated with an approximation degree. Proximity equations are the key syntactic structures which, in addition to a weak unification algorithm, make a flexible query-answering process possible in this kind of programming language. This addition widens the scope of Fuzzy Logic Programming, allowing certain forms of lexical reasoning, and reinforcing Natural Language Processing (NLP) applications.
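A small sketch of how a graded proximity equation between two words might be derived from WordNet with a standard similarity measure (Wu-Palmer here, via NLTK). The Prolog-style output format is an illustrative guess, not the system's actual directive syntax.

```python
# Sketch: derive a proximity degree for a word pair from WordNet similarity.
# Requires the WordNet corpus: nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def proximity_equation(word1, word2):
    """Return 'word1 ~ word2 = degree' using the best Wu-Palmer similarity over all synsets."""
    best = 0.0
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            sim = s1.wup_similarity(s2)  # None for incomparable synsets (e.g. different POS)
            if sim is not None and sim > best:
                best = sim
    return f"{word1} ~ {word2} = {best:.2f}"

print(proximity_equation("dog", "wolf"))  # e.g. dog ~ wolf = 0.96
```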


PLoS ONE ◽  
2021 ◽  
Vol 16 (10) ◽  
pp. e0257832
Author(s):  
Franziska Burger ◽  
Mark A. Neerincx ◽  
Willem-Paul Brinkman

The cognitive approach to psychotherapy aims to change patients’ maladaptive schemas, that is, overly negative views on themselves, the world, or the future. To obtain awareness of these views, they record their thought processes in situations that caused pathogenic emotional responses. The schemas underlying such thought records have, thus far, been largely manually identified. Using recent advances in natural language processing, we take this one step further by automatically extracting schemas from thought records. To this end, we asked 320 healthy participants on Amazon Mechanical Turk to each complete five thought records consisting of several utterances reflecting cognitive processes. Agreement between two raters on manually scoring the utterances with respect to how much they reflect each schema was substantial (Cohen’s κ = 0.79). Natural language processing software pretrained on all English Wikipedia articles from 2014 (GloVe embeddings) was used to represent words and utterances, which were then mapped to schemas using k-nearest neighbors algorithms, support vector machines, and recurrent neural networks. For the more frequently occurring schemas, all algorithms were able to leverage linguistic patterns. For example, the scores assigned to the Competence schema by the algorithms correlated with the manually assigned scores with Spearman correlations ranging between 0.64 and 0.76. For six of the nine schemas, a set of recurrent neural networks trained separately for each of the schemas outperformed the other algorithms. We present our results here as a benchmark solution, since we conducted this research to explore the possibility of automatically processing qualitative mental health data and did not aim to achieve optimal performance with any of the explored models. The dataset of 1600 thought records comprising 5747 utterances is published together with this article for researchers and machine learning enthusiasts to improve upon our outcomes. Based on our promising results, we see further opportunities for using free-text input and subsequent natural language processing in other common therapeutic tools, such as ecological momentary assessments, automated case conceptualizations, and, more generally, as an alternative to mental health scales.
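A minimal sketch of one of the simpler approaches described: represent an utterance as the mean of its GloVe word vectors and predict a schema score with k-nearest neighbors. The file name, k, and the training utterances and ratings are assumptions for illustration; glove.6B.300d.txt is the commonly distributed vector set trained on 2014 Wikipedia plus Gigaword, which may not match the paper's exact choice.

```python
# Sketch: mean GloVe embedding of an utterance + k-nearest-neighbor schema scoring.
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def load_glove(path):
    """Parse the standard GloVe text format: one word followed by its vector per line."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            vectors[parts[0]] = np.asarray(parts[1:], dtype=float)
    return vectors

def embed(utterance, vectors, dim=300):
    """Average the vectors of known words; fall back to a zero vector if none are known."""
    words = [w for w in utterance.lower().split() if w in vectors]
    return np.mean([vectors[w] for w in words], axis=0) if words else np.zeros(dim)

glove = load_glove("glove.6B.300d.txt")  # assumed pretrained file
train_utterances = ["i always fail at everything i try", "people will probably laugh at me"]  # placeholders
train_scores = [3.0, 0.0]  # placeholder manual ratings for the Competence schema

knn = KNeighborsRegressor(n_neighbors=1).fit(
    np.vstack([embed(u, glove) for u in train_utterances]), train_scores)
print(knn.predict(embed("i am not good at my job", glove).reshape(1, -1)))
```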


2021 ◽  
Author(s):  
Akuwan Saleh

Natural language processing is the process of building a computational model of language so that humans and computers can interact through natural language. Such a computational model can serve scientific purposes, such as studying the properties of a particular form of natural language, as well as everyday applications. The fields of knowledge involved in natural language processing include phonetics and phonology, morphology, syntax, semantics, pragmatics, discourse knowledge, and world knowledge. Semantics is defined as the mapping of syntactic structures, using each word, into a more fundamental form that does not depend on sentence structure. Semantics studies the meaning of individual words and how those word meanings combine to form the meaning of a complete sentence. Semantic analysis is used to recognize words that precede and relate to the words in the domain. This is done by linking syntactic structures from words and phrases up to sentences and paragraphs. In previous research on semantic mapping, the mapping was based on physical appearance and, subsequently, on the role of a model/character in a story. The essence of a game object need not emerge from a character's physical appearance alone; it can also be linked to other important parameters such as the clothing, tools, objects, and weapons carried by each character.


2021 ◽  
Vol 8 (6) ◽  
pp. 1265
Author(s):  
Muhammad Alkaff ◽  
Andreyan Rizky Baskara ◽  
Irham Maulani

Lapor! is a service system through which the Indonesian public can submit aspirations and complaints about government services. The government has long used this system to address bureaucratic problems faced by Indonesian citizens. However, as the volume of reports grows and operators sort reports by reading every complaint that enters the system, errors frequently occur in which an operator forwards a report to the wrong agency. A solution is therefore needed that can determine the context of a report automatically using Natural Language Processing techniques. This study aims to build an automatic classifier that routes reports to the authorized agency based on their topics by combining Latent Dirichlet Allocation (LDA) and a Support Vector Machine (SVM). Topic modeling for each report is carried out with LDA, which extracts reports to find specific patterns in documents and outputs topic distribution values. The classification step that determines a report's destination agency is then carried out with an SVM using the topic values extracted by LDA. The performance of the LDA-SVM model is measured with a confusion matrix by computing accuracy, precision, recall, and F1 score. Test results using a 70:30 train-test split show that the model performs well, with 79.85% accuracy, 79.98% precision, 72.37% recall, and an F1 score of 74.67%.
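A sketch of the LDA-to-SVM idea described above: topic distributions produced by Latent Dirichlet Allocation become the feature vectors for an SVM that predicts the destination agency. The complaint texts, agency labels, and hyperparameters below are placeholders, not Lapor! data or the study's settings.

```python
# Sketch: LDA topic distributions as SVM features for routing complaint reports.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

reports = [
    "jalan rusak parah belum diperbaiki",         # damaged road not yet repaired
    "antrean pelayanan rumah sakit sangat lama",  # very long hospital queue
    "lampu jalan mati di beberapa ruas",          # street lights out on several segments
    "obat di puskesmas sering kosong",            # clinic frequently out of medicine
]
agencies = ["public_works", "health", "public_works", "health"]  # placeholder target agencies

counts = CountVectorizer().fit_transform(reports)
topics = LatentDirichletAllocation(n_components=5, random_state=0).fit_transform(counts)

X_tr, X_te, y_tr, y_te = train_test_split(topics, agencies, test_size=0.5,
                                          random_state=0, stratify=agencies)
svm = SVC(kernel="linear").fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, svm.predict(X_te)))
```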


Detecting the author of each sentence in a collective document can be done by choosing a suitable set of features and applying Natural Language Processing within a machine learning workflow. The basic idea is to train the machine to identify the author of a specific sentence. This is done with eight different NLP steps, such as applying a stemming algorithm, finding stop-list words, and preprocessing the data, before passing the result to a machine learning classifier, a support vector machine (SVM), which assigns the dataset to classes corresponding to authors and labels every sentence with its author's name at an accuracy of 82%. This paper is aimed at readers who are interested in knowing which authors wrote specific passages.
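A hedged sketch of the kind of pipeline described: stem, drop stop words, vectorize, then train an SVM to label each sentence with its author. The sentences and author names are placeholders, and the paper's exact eight preprocessing steps are not reproduced.

```python
# Sketch: sentence-level author attribution with stemming, stop-word removal, and an SVM.
# Requires the stop-word corpus: nltk.download("stopwords")
from nltk.stem import PorterStemmer
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

stemmer = PorterStemmer()
stop_words = set(stopwords.words("english"))

def preprocess(sentence):
    """Lowercase, remove stop-list words, and stem the remaining tokens."""
    tokens = [w.lower() for w in sentence.split() if w.lower() not in stop_words]
    return " ".join(stemmer.stem(w) for w in tokens)

sentences = ["The committee shall convene at dawn.",
             "Honestly, the results surprised everyone."]   # placeholder sentences
authors = ["author_A", "author_B"]                          # placeholder labels

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform([preprocess(s) for s in sentences])
clf = LinearSVC().fit(X, authors)
print(clf.predict(vectorizer.transform([preprocess("The committee convened early.")])))
```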


Author(s):  
Kaushika Pal ◽  
Biraj V. Patel

A large section of the World Wide Web is filled with documents and content: data, big data, formatted and unformatted data, structured and unstructured data, and we need an information infrastructure that is useful and easily accessible whenever required. This research combines Natural Language Processing and Machine Learning for content-based classification of documents. Natural Language Processing is used to divide the problem of understanding an entire document at once into smaller chunks, yielding only the useful tokens responsible for feature extraction, the machine learning step that creates the feature set used to train a classifier to predict a label for a new document and place it in the appropriate location. Machine learning, a subset of Artificial Intelligence, offers sophisticated algorithms such as Support Vector Machine, K-Nearest Neighbor, and Naïve Bayes, which work well with content in many Indian languages and foreign languages. This model classifies documents with more than 70% accuracy for major Indian languages and more than 80% accuracy for English.
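An illustrative sketch of the content-based classification approach: extract features from documents and compare the classifiers named above. The documents, category labels, and hyperparameters are placeholders, and language-specific tokenization for Indian languages is omitted.

```python
# Sketch: feature extraction plus a comparison of Naive Bayes, k-NN, and SVM classifiers.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC

docs = ["cricket score update from the final",     # placeholder documents
        "stock market falls on weak earnings",
        "new vaccine trial begins next month",
        "home team wins the championship"]
labels = ["sports", "finance", "health", "sports"]  # placeholder categories

for clf in (MultinomialNB(), KNeighborsClassifier(n_neighbors=1), LinearSVC()):
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(docs, labels)
    print(type(clf).__name__, model.predict(["final match score announced"]))
```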


Stroke ◽  
2021 ◽  
Vol 52 (Suppl_1) ◽  
Author(s):  
Anoop Mayampurath ◽  
Zahra Parnianpour ◽  
William J Meurer ◽  
Jungwha Lee ◽  
Bruce Ankenman ◽  
...  

Introduction: Early identification of stroke by emergency medical services (EMS) providers in the prehospital setting is associated with increased treatment rates, improved functional outcomes, and reduced mortality. We hypothesize that a predictive model utilizing machine learning and natural language processing (NLP) techniques can be developed to analyze EMS run reports to identify stroke patients accurately. Methods: We analyzed EMS data from the Chicago Fire Department matched with inpatient data on confirmed and suspected strokes from 17 Chicago hospitals in the Get With The Guidelines-Stroke (GWTG-Stroke) registry from 11/28/2018 to 5/31/2019. Using features derived from paramedic notes, we developed a support vector machine classifier to predict the following categories: any stroke, AIS-LVO, severe stroke (NIHSS>5), and CSC-eligible stroke (AIS-LVO or ICH/SAH). Individuals were randomly assigned into model derivation (70%) and validation cohorts (30%). C-statistics were used to evaluate discrimination of the classifier for stroke categories. Results: A total of 965 patients were included for analysis. In a validation cohort of 289 patients, the text-based model predicted stroke better than models trained using the Cincinnati Prehospital Stroke Scale (CPSS, c-statistic: 0.73 vs. 0.67, P=0.165) and the 3-Item Stroke Scale (3I-SS, c-statistic: 0.73 vs. 0.53, P <0.001) scores. The text-based model also demonstrated improved performance over the CPSS and 3I-SS models in discriminating patients with other stroke categories (Table 1). Conclusion: We derived a predictive model using clinical text from paramedic reports that has superior performance to existing prehospital clinical screening tools to identify stroke in the prehospital setting. Future studies can evaluate the implementation of an NLP-based decision tool to assist in prehospital stroke evaluation and destination decision-making.
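A sketch of evaluating a text-based stroke classifier by its c-statistic (area under the ROC curve), as reported in the abstract. The note snippets and outcome labels are fabricated placeholders; the registry data and the study's actual feature engineering are not reproduced.

```python
# Sketch: SVM over paramedic-note text, evaluated with the c-statistic (ROC AUC).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

notes = [
    "pt with facial droop and slurred speech, onset 30 min ago",
    "ground level fall, no focal neurological deficit noted",
    "sudden left arm weakness and gaze deviation",
    "chest pain radiating to jaw, diaphoretic",
]  # placeholder paramedic note snippets
stroke = [1, 0, 1, 0]  # placeholder confirmed-stroke labels

X = TfidfVectorizer().fit_transform(notes)
X_tr, X_te, y_tr, y_te = train_test_split(X, stroke, test_size=0.5,
                                          random_state=0, stratify=stroke)
clf = SVC(kernel="linear").fit(X_tr, y_tr)
print("c-statistic:", roc_auc_score(y_te, clf.decision_function(X_te)))
```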

