Prediction of COVID-19 Severity Using Chest Computed Tomography and Laboratory Measurements: Evaluation Using a Machine Learning Approach

10.2196/21604
2020
Vol 8 (11)
pp. e21604
Author(s):  
Daowei Li ◽  
Qiang Zhang ◽  
Yue Tan ◽  
Xinghuo Feng ◽  
Yuanyi Yue ◽  
...  

Background: Most of the mortality resulting from COVID-19 has been associated with severe disease. Effective treatment of severe cases remains a challenge due to the lack of early detection of the infection. Objective: This study aimed to develop an effective prediction model for COVID-19 severity by combining radiological outcome with clinical biochemical indexes. Methods: A total of 46 patients with COVID-19 (10 severe, 36 nonsevere) were examined. To build the prediction model, a set of 27 severe and 151 nonsevere clinical laboratory records and computerized tomography (CT) records were collected from these patients. We extracted features from the patients’ CT images using a recently published convolutional neural network, and trained a machine learning model combining these features with clinical laboratory results. Results: We present a prediction model combining patients’ radiological outcomes with their clinical biochemical indexes to identify severe COVID-19 cases. The prediction model yielded a cross-validated area under the receiver operating characteristic curve (AUROC) score of 0.93 and an F1 score of 0.89, which showed a 6% and 15% improvement, respectively, compared to the models based on laboratory test features only. In addition, we developed a statistical model for forecasting COVID-19 severity based on the results of patients’ laboratory tests performed before they were classified as severe cases; this model yielded an AUROC score of 0.81. Conclusions: To our knowledge, this is the first report predicting the clinical progression of COVID-19, as well as forecasting severity, based on a combined analysis using laboratory tests and CT images.
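
A minimal sketch of the evaluation pattern described above: CNN-derived CT features are concatenated with clinical laboratory values and the combined model is scored by cross-validated AUROC and F1. The feature widths, the logistic-regression classifier, and the five-fold split are illustrative assumptions, not the authors' pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate

rng = np.random.default_rng(0)
ct_features = rng.normal(size=(178, 64))    # CNN-derived CT features; width of 64 is assumed
lab_features = rng.normal(size=(178, 12))   # clinical laboratory indexes; width of 12 is assumed
y = np.array([1] * 27 + [0] * 151)          # 27 severe and 151 nonsevere records, as in the abstract

X = np.hstack([ct_features, lab_features])  # combined radiological + biochemical feature view
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_validate(LogisticRegression(max_iter=1000), X, y, cv=cv, scoring=["roc_auc", "f1"])
print("cross-validated AUROC:", scores["test_roc_auc"].mean(), "F1:", scores["test_f1"].mean())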


Author(s):  
Muhammad Younus ◽  
Md Tahsir Ahmed Munna ◽  
Mirza Mohtashim Alam ◽  
Shaikh Muhammad Allayear ◽  
Sheikh Joly Ferdous Ara

2017
Author(s):  
Aymen A. Elfiky ◽  
Maximilian J. Pany ◽  
Ravi B. Parikh ◽  
Ziad Obermeyer

Background: Cancer patients who die soon after starting chemotherapy incur costs of treatment without benefits. Accurately predicting mortality risk from chemotherapy is important, but few patient data-driven tools exist. We sought to create and validate a machine learning model predicting mortality for patients starting new chemotherapy. Methods: We obtained electronic health records for patients treated at a large cancer center (26,946 patients; 51,774 new regimens) over 2004-14, linked to Social Security data for date of death. The model was derived using 2004-11 data, and performance was measured on non-overlapping 2012-14 data. Findings: 30-day mortality from chemotherapy start was 2.1%. Common cancers included breast (21.1%), colorectal (19.3%), and lung (18.0%). Model predictions were accurate for all patients (AUC 0.94). Predictions for patients starting palliative chemotherapy (46.6% of regimens), for whom prognosis is particularly important, remained highly accurate (AUC 0.92). To illustrate model discrimination, we ranked patients initiating palliative chemotherapy by model-predicted mortality risk, and calculated observed mortality by risk decile. 30-day mortality in the highest-risk decile was 22.6%; in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies, even for clinical trial regimens that first appeared in years after the model was trained (AUC 0.94). The model also performed well for prediction of 180-day mortality (AUC 0.87; mortality 74.8% in the highest-risk decile vs. 0.2% in the lowest). Predictions were more accurate than data from randomized trials of individual chemotherapies or SEER estimates. Interpretation: A machine learning algorithm accurately predicted short-term mortality in patients starting chemotherapy using EHR data. Further research is necessary to determine generalizability and the feasibility of applying this algorithm in clinical settings.
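
A minimal sketch of the evaluation described above, assuming simulated data and illustrative column names rather than the study's EHR extract: derive the model on earlier years, measure AUC on a non-overlapping later period, and tabulate observed 30-day mortality by predicted-risk decile.

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "year": rng.integers(2004, 2015, size=5000),      # regimen start year
    "age": rng.normal(65, 12, size=5000),             # illustrative predictors only
    "albumin": rng.normal(3.8, 0.5, size=5000),
    "died_30d": rng.binomial(1, 0.02, size=5000),     # 30-day mortality label
})
features = ["age", "albumin"]

train, test = df[df.year <= 2011], df[df.year >= 2012]   # derive on 2004-11, evaluate on 2012-14
model = GradientBoostingClassifier().fit(train[features], train["died_30d"])
risk = model.predict_proba(test[features])[:, 1]

print("held-out AUC:", roc_auc_score(test["died_30d"], risk))
decile = pd.qcut(risk, 10, labels=False, duplicates="drop")
print(test["died_30d"].groupby(decile).mean())            # observed mortality by predicted-risk decile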


2020
Vol 31 (6)
pp. 1348-1357
Author(s):  
Ibrahim Sandokji ◽  
Yu Yamamoto ◽  
Aditya Biswas ◽  
Tanima Arora ◽  
Ugochukwu Ugwuowo ◽  
...  

Background: Timely prediction of AKI in children can allow for targeted interventions, but the wealth of data in the electronic health record poses unique modeling challenges. Methods: We retrospectively reviewed the electronic medical records of all children younger than 18 years old who had at least two creatinine values measured during a hospital admission from January 2014 through January 2018. We divided the study population into derivation, internal validation, and external validation cohorts, and used five feature selection techniques to select 10 of 720 potentially predictive variables from the electronic health records. Model performance was assessed by the area under the receiver operating characteristic curve in the validation cohorts. The primary outcome was development of AKI (per the Kidney Disease Improving Global Outcomes creatinine definition) within a moving 48-hour window. Secondary outcomes included severe AKI (stage 2 or 3), inpatient mortality, and length of stay. Results: Among 8473 encounters studied, AKI occurred in 516 (10.2%), 207 (9%), and 27 (2.5%) encounters in the derivation, internal validation, and external validation cohorts, respectively. The highest-performing model used a machine learning-based genetic algorithm, with an overall area under the receiver operating characteristic curve in the internal validation cohort of 0.76 (95% confidence interval [CI], 0.72 to 0.79) for AKI, 0.79 (95% CI, 0.74 to 0.83) for severe AKI, and 0.81 (95% CI, 0.77 to 0.86) for neonatal AKI. To translate this prediction model into a clinical risk-stratification tool, we identified high- and low-risk threshold points. Conclusions: Using various machine learning algorithms, we identified and validated a time-updated prediction model of ten readily available electronic health record variables to accurately predict imminent AKI in hospitalized children.
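
A minimal sketch of the variable-selection step described above: narrow 720 candidate EHR variables to 10 and check discrimination by AUC on a held-out split. The study compared five selection techniques, including a machine learning-based genetic algorithm; the univariate filter and logistic model below are stand-ins for illustration only.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 720))       # 720 candidate EHR variables (simulated)
y = rng.binomial(1, 0.1, size=2000)    # AKI within the moving 48-hour window (simulated label)

X_dev, X_val, y_dev, y_val = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
model = make_pipeline(SelectKBest(f_classif, k=10), LogisticRegression(max_iter=1000))
model.fit(X_dev, y_dev)
print("validation AUC:", roc_auc_score(y_val, model.predict_proba(X_val)[:, 1]))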


2019
Vol 34 (4)
pp. 221-229
Author(s):  
Carlo M. Bertoncelli ◽  
Paola Altamura ◽  
Edgar Ramos Vieira ◽  
Domenico Bertoncelli ◽  
Susanne Thummler ◽  
...  

Background: Intellectual disability and impaired adaptive functioning are common in children with cerebral palsy, but there is a lack of studies assessing these issues in teenagers with cerebral palsy. Therefore, the aim of this study was to develop and test a predictive machine learning model to identify factors associated with intellectual disability in teenagers with cerebral palsy. Methods: This was a multicenter controlled cohort study of 91 teenagers with cerebral palsy (53 males, 38 females; mean age ± SD = 17 ± 1 y; range: 12-18 y). Data on etiology, diagnosis, spasticity, epilepsy, clinical history, communication abilities, behaviors, motor skills, eating, and drinking abilities were collected between 2005 and 2015. Intellectual disability was classified as “mild,” “moderate,” “severe,” or “profound” based on adaptive functioning, according to the DSM-5 after 2013 and the DSM-IV before 2013, using the Wechsler Intelligence Scale for Children for patients up to ages 16 years, 11 months, and the Wechsler Adult Intelligence Scale for patients ages 17-18. Statistical analysis included Fisher’s exact test and multiple logistic regressions to identify factors associated with intellectual disability. A predictive machine learning model was developed to identify factors associated with having profound intellectual disability. The guidelines of the “Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis” (TRIPOD) statement were followed. Results: Poor manual abilities (P ≤ .001), poor gross motor function (P ≤ .001), and type of epilepsy (intractable: P = .04; well controlled: P = .01) were significantly associated with profound intellectual disability. The average model accuracy, specificity, and sensitivity were 78%. Conclusion: Poor motor skills and epilepsy were associated with profound intellectual disability. The machine learning prediction model adequately identified a high likelihood of severe intellectual disability in teenagers with cerebral palsy.
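
A minimal sketch of the reported analysis style, assuming simulated data: a Fisher's exact test for one candidate factor and a single-predictor logistic model scored by accuracy, sensitivity, and specificity. The factor, effect size, and model below are illustrative assumptions, not the study's multivariable analysis.

import numpy as np
from scipy.stats import fisher_exact
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

rng = np.random.default_rng(3)
poor_manual = rng.binomial(1, 0.5, size=91)              # illustrative binary factor for 91 teenagers
profound_id = rng.binomial(1, 0.2 + 0.5 * poor_manual)   # outcome with an assumed association

# 2x2 table: rows = factor absent/present, columns = outcome absent/present
table = [[int(np.sum((poor_manual == a) & (profound_id == b))) for b in (0, 1)] for a in (0, 1)]
odds_ratio, p_value = fisher_exact(table)
print("Fisher exact p-value:", p_value)

X = poor_manual.reshape(-1, 1)
pred = LogisticRegression().fit(X, profound_id).predict(X)
tn, fp, fn, tp = confusion_matrix(profound_id, pred).ravel()
print("accuracy:", accuracy_score(profound_id, pred),
      "sensitivity:", tp / (tp + fn),
      "specificity:", tn / (tn + fp))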


DYNA
2020
Vol 87 (212)
pp. 63-72
Author(s):  
Jorge Iván Pérez Rave ◽  
Favián González Echavarría ◽  
Juan Carlos Correa Morales

The objective of this work is to develop a machine learning model for online pricing of apartments in a Colombian context. This article addresses three aspects: i) it compares the predictive capacity of linear regression, regression trees, random forest, and bagging; ii) it studies the effect of a group of text-derived attributes on the predictive capability of the models; and iii) it identifies the most stable and important attributes and interprets them from an inferential perspective to better understand the object of study. The sample consists of 15,177 real estate observations. The ensemble methods (random forest and bagging) showed predictive superiority over the other models. The attributes derived from the text had a significant relationship with the property price (on a log scale); however, their contribution to predictive capacity was almost nil, since four other attributes achieved highly accurate predictions and remained stable when the sample changed.
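
A minimal sketch of the model comparison described above, on simulated data: linear regression, a regression tree, and the ensemble methods (random forest and bagging) are scored by cross-validated RMSE with the price on a log scale. The attribute set and hyperparameters are illustrative assumptions, not the article's 15,177-observation dataset.

import numpy as np
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(1500, 8))                                           # structural and text-derived attributes (simulated)
log_price = X @ rng.normal(size=8) + rng.normal(scale=0.3, size=1500)    # apartment price on a log scale

models = {
    "linear regression": LinearRegression(),
    "regression tree": DecisionTreeRegressor(max_depth=6, random_state=0),
    "random forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "bagging": BaggingRegressor(n_estimators=200, random_state=0),
}
for name, model in models.items():
    rmse = -cross_val_score(model, X, log_price, cv=5, scoring="neg_root_mean_squared_error").mean()
    print(f"{name}: cross-validated RMSE on log price = {rmse:.3f}")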


2019
Vol 14 (3)
pp. 302-307
Author(s):  
Benjamin Q. Huynh ◽  
Sanjay Basu

Objectives: Armed conflict has contributed to an unprecedented number of internally displaced persons (IDPs), individuals who are forced out of their homes but remain within their country. IDPs often urgently require shelter, food, and healthcare, yet prediction of when IDPs will migrate to an area remains a major challenge for aid delivery organizations. We sought to develop an IDP migration forecasting framework that could empower humanitarian aid groups to more effectively allocate resources during conflicts. Methods: We modeled monthly IDP migration between provinces within Syria and within Yemen using data on food prices, fuel prices, wages, location, time, and conflict reports. We compared machine learning methods with baseline persistence methods of forecasting. Results: We found a machine learning approach that more accurately forecast migration trends than baseline persistence methods. A random forest model outperformed the best persistence model in terms of root mean square error of log migration by 26% and 17% for the Syria and Yemen datasets, respectively. Conclusions: Integrating diverse data sources into a machine learning model appears to improve IDP migration prediction. Further work should examine whether implementation of such models can enable proactive aid allocation for IDPs in anticipation of forecast arrivals.
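
A minimal sketch of the headline comparison, on simulated data: a random forest forecast of next-month log migration versus a persistence baseline that carries the last observation forward, both scored by root mean square error. The covariates and the temporal split are assumptions, not the Syria and Yemen datasets used in the study.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(5)
n = 1200
last_month = rng.normal(6.0, 1.0, size=n)                   # log migration observed in the previous month
covariates = rng.normal(size=(n, 3))                        # food price, fuel price, and wage signals (simulated)
y = 0.7 * last_month + covariates @ np.array([0.3, -0.2, 0.1]) + rng.normal(scale=0.4, size=n)

X = np.column_stack([last_month, covariates])
train, test = slice(0, 900), slice(900, n)                  # simple temporal split: train on earlier rows
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X[train], y[train])

rmse_rf = np.sqrt(mean_squared_error(y[test], rf.predict(X[test])))
rmse_persistence = np.sqrt(mean_squared_error(y[test], last_month[test]))  # carry-forward baseline
print("random forest RMSE:", rmse_rf, "persistence RMSE:", rmse_persistence)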

