Natural Language–based Machine Learning Models for the Annotation of Clinical Radiology Reports

BACKGROUND: The COVID-19 pandemic has created a pressing need to integrate information from disparate sources in order to assist decision makers. Social media is important in this respect; however, natural language processing methods are needed to make sense of the textual information it provides and to automate the processing of large amounts of data. Social media posts are often noisy, yet they may provide valuable insights regarding the severity and prevalence of the disease in the population. In particular, machine learning techniques for triage and diagnosis could allow for a better understanding of what social media may offer in this respect.
OBJECTIVE: This study aims to develop an end-to-end natural language processing pipeline for triage and diagnosis of COVID-19 from patient-authored social media posts, in order to provide researchers and other interested parties with additional information on the symptoms, severity, and prevalence of the disease.
METHODS: The text processing pipeline first extracts COVID-19 symptoms and related concepts such as severity, duration, negations, and body parts from patients' posts using conditional random fields. In the next step, an unsupervised rule-based algorithm establishes relations between the extracted concepts. The concepts and relations are then used to construct two different vector representations of each post. These vectors are used separately to build support vector machine models that triage patients into three categories and diagnose them for COVID-19.
RESULTS: We report macro- and micro-averaged F1 scores in the range of 71-96% and 61-87%, respectively, for the triage and diagnosis of COVID-19 when the models are trained on human-labelled data. Our experimental results indicate that similar performance can be achieved when the models are trained using predicted labels from the concept extraction and rule-based classifiers, thus yielding end-to-end machine learning. We also highlight important features uncovered by our diagnostic machine learning models and compare them with the most frequent symptoms revealed in another COVID-19 dataset. In particular, we found that the most important features are not always the most frequent ones.
CONCLUSIONS: Our preliminary results show that it is possible to automatically triage and diagnose patients for COVID-19 from natural language narratives using a machine learning pipeline, in order to provide additional information on the severity and prevalence of the disease through the eyes of social media.
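
The macro- and micro-averaged F1 scores mentioned in the results can differ noticeably on imbalanced classes. The sketch below is not the authors' evaluation code; it is a minimal pure-Python illustration of the two averages for single-label, multi-class predictions. The function name `f1_scores` and the three triage labels are hypothetical and chosen only for the example.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, so every post counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical triage labels; the category names are assumptions.
y_true = ["low", "low", "low", "low", "medium", "high"]
y_pred = ["low", "low", "low", "low", "low", "high"]
macro, micro = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro:.3f}, micro F1 = {micro:.3f}")
```

In this toy case the single missed "medium" post pulls the macro average well below the micro average, because the rare class weighs as much as the frequent one.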

On the Influence of Contextual Features for the Identification of Complex Words

International Journal of Semantic Computing ◽

10.1142/s1793351x17400207 ◽

2017 ◽

Vol 11 (04) ◽

pp. 497-511

Author(s):

Elnaz Davoodi ◽

Leila Kosseim ◽

Matthew Mongrain

Keyword(s):

Machine Learning ◽

Natural Language ◽

Target Word ◽

Supervised Machine Learning ◽

Learning Models ◽

Data Set ◽

Contextual Features ◽

Complex Words ◽

Machine Learning Models

This paper evaluates the effect of the context of a target word on the identification of complex words in natural language texts. The approach automatically tags words as either complex or not, based on two sets of features: base features that only pertain to the target word, and contextual features that take the context of the target word into account. We experimented with several supervised machine learning models, and trained and tested the approach with the 2016 SemEval Word Complexity Data Set. Results show that when discriminating base features are used, the words around the target word can supplement those features and improve the recognition of complex words.
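
The split between base and contextual features can be illustrated with a small sketch. The abstract does not list the actual features, so the ones below (word length, vowel count, average neighbour length) and the function names are assumptions chosen only to show the structure: one feature set computed from the target word alone, and one computed from a window of surrounding words.

```python
def base_features(tokens, i):
    """Features that depend only on the target word (hypothetical choices)."""
    word = tokens[i]
    return {
        "length": len(word),
        "vowels": sum(ch in "aeiou" for ch in word.lower()),
    }


def contextual_features(tokens, i, window=2):
    """Features computed from the words around the target (hypothetical choices)."""
    left = tokens[max(0, i - window):i]
    right = tokens[i + 1:i + 1 + window]
    neighbours = left + right
    avg_len = sum(len(w) for w in neighbours) / len(neighbours) if neighbours else 0.0
    return {"ctx_avg_length": avg_len, "ctx_size": len(neighbours)}


def features(tokens, i, window=2):
    """Concatenate both feature sets into one dictionary for a classifier."""
    return {**base_features(tokens, i), **contextual_features(tokens, i, window)}


tokens = ["the", "cat", "sat", "on", "the", "ubiquitous", "mat"]
print(features(tokens, 5))
```

A supervised classifier would then be trained on such dictionaries, once with the base features only and once with both sets, to measure what the context adds.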

Estimating Nonfatal Gunshot Injury Locations With Natural Language Processing and Machine Learning Models

JAMA Network Open ◽

10.1001/jamanetworkopen.2020.20664 ◽

2020 ◽

Vol 3 (10) ◽

pp. e2020664

Author(s):

Susan T. Parker

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Gunshot Injury ◽

Learning Models ◽

Machine Learning Models

Comparison of Various Machine Learning Models to Identify Peripherally Inserted Central Catheter (PICC) Tip Position from Radiology Reports

10.1542/peds.147.3_meetingabstract.6-a ◽

2021 ◽

Author(s):

Manan Shah ◽

Kevin Dufendach ◽

Andrew Schapiro ◽

Yizhao Ni ◽

Surya Prasath

Keyword(s):

Machine Learning ◽

Peripherally Inserted Central Catheter ◽

Central Catheter ◽

Learning Models ◽

Radiology Reports ◽

Tip Position ◽

Machine Learning Models

Standardization of Featureless Variables for Machine Learning Models Using Natural Language Processing

Lecture Notes in Computer Science - Computational Science – ICCS 2018 ◽

10.1007/978-3-319-93701-4_18 ◽

2018 ◽

pp. 234-246

Author(s):

Kourosh Modarresi ◽

Abdurrahman Munir

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Machine Learning Models

Citation Classification Using Natural Language Processing and Machine Learning Models

Advances in Smart Technologies Applications and Case Studies - Lecture Notes in Electrical Engineering ◽

10.1007/978-3-030-53187-4_39 ◽

2020 ◽

pp. 357-365

Author(s):

Syyab Rahi ◽

Iqra Safder ◽

Sehrish Iqbal ◽

Saeed-Ul Hassan ◽

Iain Reid ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Learning Models ◽

Machine Learning Models

Key Technology Considerations in Developing and Deploying Machine Learning Models in Clinical Radiology Practice (Preprint)

JMIR Medical Informatics ◽

10.2196/28776 ◽

2021 ◽

Author(s):

Viraj Kulkarni ◽

Manish Gawali ◽

Amit Kharat

Keyword(s):

Machine Learning ◽

Learning Models ◽

Key Technology ◽

Radiology Practice ◽

Clinical Radiology ◽

Machine Learning Models

Expanding WordNet with Gloss and Polysemy Links for Evocation Strength Recognition

Cognitive Studies | Études cognitives ◽

10.11649/cs.2325 ◽

2020 ◽

Author(s):

Marek Maziarz ◽

Ewa Rudnicka

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Semantic Relations ◽

New Method ◽

Learning Models ◽

Dijkstra's Algorithm ◽

Vector Representations ◽

Machine Learning Models

Evocation, a phenomenon of sense associations going beyond standard (lexico-)semantic relations, is difficult for natural language processing systems to recognise. Machine learning models give predictions that are only moderately correlated with evocation strength. It is believed that ordinary graph measures are not as good at this task as methods based on vector representations. The paper proposes a new method of enriching the WordNet structure with weighted polysemy and gloss links, and proves that Dijkstra's algorithm performs as well as other, more sophisticated measures when combined with such expanded structures.
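
For readers unfamiliar with the graph measure named above, the following is a minimal sketch of Dijkstra's algorithm on a small weighted graph. It is not the paper's implementation: the nodes, the link weights, and the reading of a shorter path as a stronger association are all assumptions made for illustration.

```python
import heapq


def dijkstra(graph, source):
    """Return shortest-path distances from source in a graph with non-negative weights."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for neighbour, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(heap, (candidate, neighbour))
    return dist


# Hypothetical toy graph: nodes stand in for word senses, and the weights
# stand in for link costs (relation, gloss, or polysemy links).
graph = {
    "doctor": {"hospital": 2.0, "nurse": 1.0},
    "nurse": {"hospital": 0.5},
    "hospital": {"patient": 1.0},
}
print(dijkstra(graph, "doctor"))
```

Adding extra weighted links to such a graph can shorten paths between senses that were previously far apart, which is the kind of effect an enriched structure would have on a distance-based measure.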

Improving XGBoost with Imagination Sampling

Communications of the Blyth Institute ◽

10.33014/issn.2640-5652.2.1.holloway.1 ◽

2020 ◽

Vol 2 (1) ◽

pp. 3-6

Author(s):

Eric Holloway

Keyword(s):

Machine Learning ◽

General System ◽

Learning Models ◽

Starting Point ◽

Machine Learning Models

Imagination Sampling is the usage of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling for obtaining multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.

Development of Machine Learning Models to Predict Student Performance in Computer Literacy Courses

International Review on Computers and Software (IRECOS) ◽

10.15866/irecos.v13i1.16863 ◽

2018 ◽

Vol 13 (1) ◽

pp. 21

Author(s):

George Anderson ◽

Oduronke T. Eyitayo

Keyword(s):

Machine Learning ◽

Student Performance ◽

Computer Literacy ◽

Learning Models ◽

Machine Learning Models
