Machine learning application for the prediction of SARS-CoV-2 infection using blood tests and chest radiograph

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Richard Du ◽  
Efstratios D. Tsougenis ◽  
Joshua W. K. Ho ◽  
Joyce K. Y. Chan ◽  
Keith W. H. Chiu ◽  
...  

Abstract Triaging and prioritising patients for RT-PCR testing has been essential in the management of COVID-19 in resource-scarce countries. In this study, we applied machine learning (ML) to the task of detecting SARS-CoV-2 infection using basic laboratory markers. We performed statistical analysis and trained an ML model on a retrospective cohort of 5148 patients from 24 hospitals in Hong Kong to distinguish COVID-19 from other aetiologies of pneumonia. We validated the model on three temporal validation sets from different waves of infection in Hong Kong. For predicting SARS-CoV-2 infection, the ML model achieved high AUCs and specificity but low sensitivity in all three validation sets (AUC: 89.9–95.8%; Sensitivity: 55.5–77.8%; Specificity: 91.5–98.3%). When used in conjunction with radiologist interpretations of chest radiographs, the sensitivity was over 90% while keeping moderate specificity. Our study showed that a machine learning model based on readily available laboratory markers could achieve high accuracy in predicting SARS-CoV-2 infection.
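The gain in sensitivity from pairing the model with radiologist reads can be illustrated with a simple "positive if either test is positive" rule. The sketch below is not the authors' method: it assumes the two tests err independently, the function name `combine_or` is made up for illustration, and the radiologist figures are hypothetical placeholders because the abstract does not report them. The ML values are taken from the low end of each reported range, which need not come from the same validation set.

```python
def combine_or(sens_a, spec_a, sens_b, spec_b):
    """Combine two binary tests with an OR rule (positive if either is positive).

    Assumes the two tests are conditionally independent given disease status.
    This is a simplification, not something the abstract states.
    """
    # A true case is missed only if both tests miss it.
    sens = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A non-case is cleared only if both tests clear it.
    spec = spec_a * spec_b
    return sens, spec


# ML model: sensitivity 55.5%, specificity 91.5% (low ends of the reported ranges).
# Radiologist: 0.80 and 0.85 are hypothetical values, for illustration only.
sens, spec = combine_or(0.555, 0.915, 0.80, 0.85)
print(f"combined sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Under these assumptions the combined sensitivity rises above either test alone while the combined specificity falls below both, which is the direction of the trade-off the abstract describes.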

2020 ◽  
Author(s):  
Thomas Tschoellitsch ◽  
Martin Dünser ◽  
Carl Böck ◽  
Karin Schwarzbauer ◽  
Jens Meier

Abstract Objective: The diagnosis of COVID-19 is based on the detection of SARS-CoV-2 in respiratory secretions, blood, or stool. Currently, reverse transcription polymerase chain reaction (RT-PCR) is the most commonly used method to test for SARS-CoV-2. Methods: In this retrospective cohort analysis, we evaluated whether machine learning could exclude SARS-CoV-2 infection using routinely available laboratory values. A Random Forests algorithm with 1353 unique features was trained to predict the RT-PCR results. Results: Out of 12,848 patients undergoing SARS-CoV-2 testing, routine blood tests were simultaneously performed in 1528 patients. The machine learning model could predict SARS-CoV-2 test results with an accuracy of 86% and an area under the receiver operating characteristic curve of 0.90. Conclusion: Machine learning methods can reliably predict a negative SARS-CoV-2 RT-PCR test result using standard blood tests.
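For readers unfamiliar with the area under the receiver operating characteristic curve reported here, it can be read as the probability that a randomly chosen positive patient receives a higher model score than a randomly chosen negative one. The sketch below computes it that way on made-up scores; the function name `auroc` and the sample data are assumptions for illustration and have nothing to do with the study's data or code.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model scores for three PCR-positive and two PCR-negative patients.
pos = [0.9, 0.8, 0.3]
neg = [0.7, 0.2]
print(f"AUROC = {auroc(pos, neg):.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a small illustration; rank-based formulas give the same value more efficiently on large cohorts.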


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Matjaž Kukar ◽  
Gregor Gunčar ◽  
Tomaž Vovko ◽  
Simon Podnar ◽  
Peter Černelč ◽  
...  

Abstract Physicians taking care of patients with COVID-19 have described different changes in routine blood parameters. However, these changes hinder them from performing COVID-19 diagnoses. We constructed a machine learning model for COVID-19 diagnosis that was based and cross-validated on the routine blood tests of 5333 patients with various bacterial and viral infections, and 160 COVID-19-positive patients. We selected the operational ROC point at a sensitivity of 81.9% and a specificity of 97.9%. The cross-validated AUC was 0.97. The five most useful routine blood parameters for COVID-19 diagnosis according to the feature importance scoring of the XGBoost algorithm were: MCHC, eosinophil count, albumin, INR, and prothrombin activity percentage. t-SNE visualization showed that the blood parameters of the patients with a severe COVID-19 course are more like the parameters of a bacterial than a viral infection. The reported diagnostic accuracy is at least comparable and probably complementary to RT-PCR and chest CT studies. Patients with fever, cough, myalgia, and other symptoms can now have initial routine blood tests assessed by our diagnostic tool. All patients with a positive COVID-19 prediction would then undergo standard RT-PCR studies to confirm the diagnosis. We believe that our results represent a significant contribution to improvements in COVID-19 diagnosis.


Author(s):  
Deepali R Deshpande ◽  
Raj L Shah ◽  
Anish N Shaha

The aim of this project is to build a machine learning model for the detection of COVID-19. Using this model, it is possible to classify chest X-ray images into normal patients, pneumonia patients, and COVID-19-positive patients. This CNN-based model can substantially reduce the time patients wait for a result. Instead of relying on limited RT-PCR kits, a simple chest X-ray can help determine the health of the patient. Not only do we get immediate results, but we can also practice social distancing norms more effectively.


2020 ◽  
Author(s):  
Timothy B Plante ◽  
Aaron M Blau ◽  
Adrian N Berg ◽  
Aaron S Weinberg ◽  
Ik C Jun ◽  
...  

BACKGROUND Conventional diagnosis of COVID-19 with reverse transcription polymerase chain reaction (RT-PCR) testing (hereafter, PCR) is associated with prolonged time to diagnosis and significant costs to run the test. The SARS-CoV-2 virus might lead to characteristic patterns in the results of widely available, routine blood tests that could be identified with machine learning methodologies. Machine learning modalities integrating findings from these common laboratory test results might accelerate ruling out COVID-19 in emergency department patients. OBJECTIVE We sought to develop (ie, train and internally validate with cross-validation techniques) and externally validate a machine learning model to rule out COVID-19 using only routine blood tests among adults in emergency departments. METHODS Using clinical data from emergency departments (EDs) from 66 US hospitals before the pandemic (before the end of December 2019) or during the pandemic (March-July 2020), we included patients aged ≥20 years in the study time frame. We excluded those with missing laboratory results. Model training used 2183 PCR-confirmed cases from 43 hospitals during the pandemic; negative controls were 10,000 prepandemic patients from the same hospitals. External validation used 23 hospitals with 1020 PCR-confirmed cases and 171,734 prepandemic negative controls. The main outcome was COVID-19 status predicted using same-day routine laboratory results. Model performance was assessed with area under the receiver operating characteristic (AUROC) curve as well as sensitivity, specificity, and negative predictive value (NPV). RESULTS Of 192,779 patients included in the training, external validation, and sensitivity data sets (median age decile 50 [IQR 30-60] years, 40.5% male [78,249/192,779]), AUROC for training and external validation was 0.91 (95% CI 0.90-0.92). 
Using a risk score cutoff of 1.0 (out of 100) in the external validation data set, the model achieved sensitivity of 95.9% and specificity of 41.7%; with a cutoff of 2.0, sensitivity was 92.6% and specificity was 59.9%. At the cutoff of 2.0, the NPVs at a prevalence of 1%, 10%, and 20% were 99.9%, 98.6%, and 97%, respectively. CONCLUSIONS A machine learning model developed with multicenter clinical data integrating commonly collected ED laboratory data demonstrated high rule-out accuracy for COVID-19 status, and might inform selective use of PCR-based testing.
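The dependence of NPV on prevalence follows from Bayes' rule: NPV = specificity × (1 − prevalence) / [specificity × (1 − prevalence) + (1 − sensitivity) × prevalence]. The sketch below applies that standard formula to the sensitivity and specificity reported at the cutoff of 2.0; the function name `npv` is an assumption for illustration, and this is not the authors' code.

```python
def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from sensitivity, specificity and prevalence."""
    # Fraction of the population that is disease-free and tests negative.
    true_neg = specificity * (1.0 - prevalence)
    # Fraction of the population that has the disease but tests negative.
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


# Sensitivity 92.6% and specificity 59.9% are the values reported at cutoff 2.0.
for prev in (0.01, 0.10, 0.20):
    print(f"prevalence {prev:.0%}: NPV = {npv(0.926, 0.599, prev):.1%}")
```

Evaluating the formula at prevalences of 1%, 10%, and 20% gives 99.9%, 98.6%, and 97.0%, in line with the figures quoted in the abstract.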


2020 ◽  
Vol 11 ◽  
Author(s):  
Chongying Wang ◽  
Hong Zhao ◽  
Haoran Zhang

The COVID-19 pandemic has caused tremendous loss starting from early this year. This article aims to investigate the change of anxiety severity and prevalence among non-graduating undergraduate students in the new semester of online learning during COVID-19 in China and also to evaluate a machine learning model based on the XGBoost model. A total of 1172 non-graduating undergraduate students aged between 18 and 22 from 34 provincial-level administrative units and 260 cities in China were enrolled in this study and asked to fill in a sociodemographic questionnaire and the Self-Rating Anxiety Scale (SAS) twice, respectively, during February 15 to 17, 2020, before the new semester started, and March 15 to 17, 2020, 1 month after the new semester based on online learning had started. SPSS 22.0 was used to conduct t-test and single factor analysis. XGBoost models were implemented to predict the anxiety level of students 1 month after the start of the new semester. There were 184 (15.7%, Mean = 58.45, SD = 7.81) and 221 (18.86%, Mean = 57.68, SD = 7.58) students who met the cut-off of 50 and were screened as positive for anxiety, respectively, in the two investigations. The mean SAS score in the second test was significantly higher than that in the first test (P < 0.05). Significant differences were also found among all males, females, and students majoring in arts and sciences between the two studies (P < 0.05). The results also showed students from Hubei province, where most cases of COVID-19 were confirmed, had a higher percentage of participants meeting the cut-off of being anxious. This article applied machine learning to establish XGBoost models to successfully predict the anxiety level and changes of anxiety levels 4 weeks later based on the SAS scores of the students in the first test. 
It was concluded that, during COVID-19, Chinese non-graduating undergraduate students showed higher anxiety in the new semester based on online learning than before the new semester started. A higher proportion of students from Hubei province than from other provinces met the cut-off for anxiety. Families, universities, and society as a whole should pay attention to the psychological health of non-graduating undergraduate students and take measures accordingly. The study also confirmed that the XGBoost model had better prediction accuracy compared to the traditional multiple stepwise regression model on the anxiety status of university students.


2020 ◽  
Vol 2 (1) ◽  
pp. 17-31
Author(s):  
Szde Yu

The present study compared three methods aimed at predicting the writer's gender based on writing features manifested in electronic discourse. The compared methods included qualitative content analysis, statistical analysis, and machine learning. These methods were further combined to create a mixed methods model. The findings showed that the machine learning model combined with qualitative content analysis produced the best prediction accuracy. Including qualitative content analysis improved accuracy rates even when the training set for machine learning was relatively small. Thus, this study presented a concise model that can be fairly reliable in predicting gender based on electronic discourse with high accuracy rates, and such accuracy was consistently found when the model was tested on two separate samples.


2007 ◽  
Vol 01 (04) ◽  
pp. 441-457 ◽  
Author(s):  
STEVEN BETHARD ◽  
JAMES H. MARTIN ◽  
SARA KLINGENSTEIN

This research proposes and evaluates a linguistically motivated approach to extracting temporal structure from text. Pairs of events in a verb-clause construction were considered, where the first event is a verb and the second event is the head of a clausal argument to that verb. All pairs of events in the TimeBank that participated in verb-clause constructions were selected and annotated with the labels BEFORE, OVERLAP and AFTER. The resulting corpus of 895 event-event temporal relations was then used to train a machine learning model. Using a combination of event-level features like tense and aspect with syntax-level features like the paths through the syntactic tree, support vector machine (SVM) models were trained which could identify new temporal relations with 89.2% accuracy. High accuracy models like these are a first step towards automatic extraction of temporal structure from text.


Micromachines ◽  
2020 ◽  
Vol 11 (12) ◽  
pp. 1084
Author(s):  
Shaobo Luo ◽  
Yi Zhang ◽  
Kim Truc Nguyen ◽  
Shilun Feng ◽  
Yuzhi Shi ◽  
...  

High accuracy measurement of size is essential in physical and biomedical sciences. Various sizing techniques have been widely used in sorting colloidal materials, analyzing bioparticles and monitoring the qualities of food and atmosphere. Most imaging-free methods such as light scattering measure the averaged size of particles and have difficulties in determining the size of non-spherical particles. Image acquisition using a camera is capable of observing individual nanoparticles in real time, but the accuracy is compromised by image defocusing and instrumental calibration. In this work, a machine learning-based pipeline is developed to facilitate high accuracy imaging-based particle sizing. The pipeline consists of an image segmentation module for cell identification and a machine learning model for accurate pixel-to-size conversion. The results show significantly improved accuracy, with great potential for a wide range of applications in environmental sensing, biomedical diagnostics, and material characterization.


2020 ◽  
Vol 21 (18) ◽  
pp. 6914
Author(s):  
Chin-Hsien Lin ◽  
Shu-I Chiu ◽  
Ta-Fu Chen ◽  
Jyh-Shing Roger Jang ◽  
Ming-Jang Chiu

Easily accessible biomarkers for Alzheimer’s disease (AD), Parkinson’s disease (PD), frontotemporal dementia (FTD), and related neurodegenerative disorders are urgently needed in an aging society to assist early-stage diagnoses. In this study, we aimed to develop machine learning algorithms using the multiplex blood-based biomarkers to identify patients with different neurodegenerative diseases. Plasma samples (n = 377) were obtained from healthy controls, patients with AD spectrum (including mild cognitive impairment (MCI)), PD spectrum with variable cognitive severity (including PD with dementia (PDD)), and FTD. We measured plasma levels of amyloid-beta 42 (Aβ42), Aβ40, total Tau, p-Tau181, and α-synuclein using an immunomagnetic reduction-based immunoassay. We observed increased levels of all biomarkers except Aβ40 in the AD group when compared to the MCI and controls. The plasma α-synuclein levels increased in PDD when compared to PD with normal cognition. We applied machine learning-based frameworks, including a linear discriminant analysis (LDA), for feature extraction and several classifiers, using features from these blood-based biomarkers to classify these neurodegenerative disorders. We found that the random forest (RF) was the best classifier to separate different dementia syndromes. Using RF, the established LDA model had an average accuracy of 76% when classifying AD, PD spectrum, and FTD. Moreover, we found 83% and 63% accuracies when differentiating the individual disease severity of subgroups in the AD and PD spectrum, respectively. The developed LDA model with the RF classifier can assist clinicians in distinguishing variable neurodegenerative disorders.

