Predicting COVID-19 Disease Progression with Chest CT Images

2020 ◽  
Author(s):  
Hongqin Liang ◽  
Xiaoming Qiu ◽  
Liqiang Zhu ◽  
Lihua Chen ◽  
Xiaofei Hu ◽  
...  

Abstract Background: With the natural progression of COVID-19, some mild patients can deteriorate to moderate or severe disease within a week, so it is crucial to identify those mild cases early and give timely treatment. Chest computed tomography (CT) has been shown to be useful in assisting the clinical diagnosis of COVID-19. In this study, machine learning was used to develop an early-warning CT feature model for identifying mild patients with potential malignant progression. Methods: A total of 140 mild COVID-19 patients were collected. All patients were divided at admission into two groups (alleviation group and exacerbation group) according to whether malignant progression occurred. The clinical and laboratory data at admission, the first CT, and the follow-up CT at the critical stage were compared between the two groups with the Chi-square test. The CT feature data (distribution, morphology, etc.) were used to establish prediction models with Fisher's linear discriminant method and an unconditional logistic regression algorithm. The models were validated with 40 held-out cases, and the area under the ROC curve (AUC) was used to evaluate them. Results: The models retained three CT features: distal air bronchogram, fibrosis, and reversed halo sign. Notably, distal air bronchograms were less common in the alleviation group, while fibrosis and the reversed halo sign were more common. The sensitivity, specificity, and Youden index of the unconditional logistic regression model were 86.1%, 92.6%, and 78.7%; for Fisher's linear discriminant analysis, they were 83.3%, 94.1%, and 77.4%. The generalization abilities of both models were consistent, with a sensitivity of 95.89%, a specificity of 100%, and a Youden index of 83.33%. Conclusions: The CT imaging feature-based machine learning model has high sensitivity for identifying the mild patients who are likely to deteriorate into severe/critical cases, so that timely treatment can be given to those patients, which also helps to relieve the pressure on medical resources.
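The reported operating points can be reproduced from a confusion matrix. Below is a minimal Python sketch of how sensitivity, specificity, and the Youden index relate; the counts are hypothetical, chosen only so the ratios roughly match the reported 86.1% / 92.6% (the true counts are not given in the abstract):

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and Youden index from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true-positive rate among exacerbation cases
    specificity = tn / (tn + fp)   # true-negative rate among alleviation cases
    youden = sensitivity + specificity - 1
    return sensitivity, specificity, youden

# Hypothetical counts roughly reproducing the reported 86.1% / 92.6% / 78.7%:
sens, spec, j = binary_metrics(tp=31, fn=5, tn=25, fp=2)
```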

Customers buy products based on many factors, and there is no single well-defined logic behind the decision. Customers must be satisfied with the product itself: they have to trust its quality, price, lifetime, absence of side effects, brand name, packaging, and finally its cost. These factors can vary from day to day, even second to second, and the competition among sellers is also increasing. This makes the customer's choice of product broader, more confusing, and riskier. Establishing a good relationship between seller and buyer increases the customer base, and retaining customers is a challenging task. To address this problem, a model is developed using the machine learning algorithms SVM, naïve Bayes, logistic regression, and Fisher's linear discriminant analysis. This model predicts the buying habits of a user/customer. Classification is performed on a product purchase dataset, and the performance of the algorithms is compared to find which performs best for this particular dataset. The work is implemented in R software.
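Of the four algorithms compared, logistic regression is the simplest to sketch from first principles. The following illustrative Python version (the study itself was implemented in R) trains by plain gradient descent; the toy purchase data and the single "price sensitivity" feature are invented for the example:

```python
import math

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Plain gradient-descent logistic regression: one feature plus a bias term."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))  # predicted purchase probability
            gw += (p - y) * x                          # gradient of log-loss w.r.t. w
            gb += (p - y)                              # gradient w.r.t. b
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def predict(w, b, x):
    """1 = will buy, 0 = will not buy, at the 0.5 probability threshold."""
    return 1 if 1.0 / (1.0 + math.exp(-(w * x + b))) >= 0.5 else 0

# Toy data: feature = a normalized purchase-intent score, label = bought or not
xs = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
```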


Author(s):  
Kazutaka Uchida ◽  
Junichi Kouno ◽  
Shinichi Yoshimura ◽  
Norito Kinjo ◽  
Fumihiro Sakakibara ◽  
...  

Abstract In conjunction with recent advancements in machine learning (ML), such technologies have been applied in various fields owing to their high predictive performance. We aimed to develop a prehospital stroke scale with ML. We conducted a multicenter retrospective and prospective cohort study. The training cohort comprised eight centers in Japan from June 2015 to March 2018, and the test cohort comprised 13 centers from April 2019 to March 2020. We used three different ML algorithms (logistic regression, random forests, XGBoost) to develop the models. The main outcomes were large vessel occlusion (LVO), intracranial hemorrhage (ICH), subarachnoid hemorrhage (SAH), and cerebral infarction (CI) other than LVO. The predictive abilities were validated in the test cohort with accuracy, positive predictive value, sensitivity, specificity, area under the receiver operating characteristic curve (AUC), and F score. The training cohort included 3178 patients with 337 LVO, 487 ICH, 131 SAH, and 676 CI cases, and the test cohort included 3127 patients with 183 LVO, 372 ICH, 90 SAH, and 577 CI cases. The overall accuracy was 0.65, and the positive predictive values, sensitivities, specificities, AUCs, and F scores were stable in the test cohort. The classification abilities were also fair for all ML models. The AUCs for LVO of logistic regression, random forests, and XGBoost were 0.89, 0.89, and 0.88, respectively, in the test cohort, and these values were higher than those of previously reported prediction models for LVO. The ML models developed to predict the probability and types of stroke at the prehospital stage had superior predictive abilities.
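The per-outcome metrics reported here (positive predictive value, sensitivity, specificity, F score) are computed one-vs-rest for each stroke type. A minimal sketch, with invented labels over the four outcome classes:

```python
def one_vs_rest_metrics(y_true, y_pred, cls):
    """PPV, sensitivity, specificity, and F1 for one class against the rest."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p != cls)
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    f1 = 2 * ppv * sens / (ppv + sens) if ppv + sens else 0.0
    return ppv, sens, spec, f1

# Toy true/predicted labels (not from the study data)
y_true = ["LVO", "ICH", "SAH", "CI", "LVO", "CI"]
y_pred = ["LVO", "ICH", "CI",  "CI", "ICH", "CI"]
ppv, sens, spec, f1 = one_vs_rest_metrics(y_true, y_pred, "LVO")
```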


2015 ◽  
Vol 23 (e1) ◽  
pp. e2-e10 ◽  
Author(s):  
Sean Barnes ◽  
Eric Hamrock ◽  
Matthew Toerper ◽  
Sauleh Siddiqui ◽  
Scott Levin

Abstract Objective Hospitals are challenged to provide timely patient care while maintaining high resource utilization. This has prompted hospital initiatives to increase patient flow and minimize nonvalue added care time. Real-time demand capacity management (RTDC) is one such initiative whereby clinicians convene each morning to predict patients able to leave the same day and prioritize their remaining tasks for early discharge. Our objective is to automate and improve these discharge predictions by applying supervised machine learning methods to readily available health information. Materials and Methods The authors use supervised machine learning methods to predict patients' likelihood of discharge by 2 p.m. and by midnight each day for an inpatient medical unit. Using data collected over 8000 patient stays and 20,000 patient days, the predictive performance of the model is compared to clinicians using sensitivity, specificity, Youden's Index (i.e., sensitivity + specificity - 1), and aggregate accuracy measures. Results The model compared to clinician predictions demonstrated significantly higher sensitivity (P < .01), lower specificity (P < .01), and a comparable Youden Index (P > .10). Early discharges were less predictable than midnight discharges. The model was more accurate than clinicians in predicting the total number of daily discharges and capable of ranking patients closest to future discharge. Conclusions There is potential to use readily available health information to predict daily patient discharges with accuracies comparable to clinician predictions. This approach may be used to automate and support daily RTDC predictions aimed at improving patient flow.


2020 ◽  
Author(s):  
Yanli Zhao ◽  
Jirong Yue ◽  
Taiping Lin ◽  
Xuchao Peng ◽  
Dongmei Xie ◽  
...  

Abstract Background: Delirium is a common neuropsychiatric syndrome in older hospitalized patients. Previous studies have suggested that inflammation and oxidative stress contribute to the pathophysiology of delirium. However, it remains unclear whether the neutrophil-lymphocyte ratio (NLR), an indicator of systemic inflammation, is associated with delirium. This study aimed to investigate the value of the NLR as a predictor of delirium among older hospitalized patients. Methods: We conducted a prospective study of 740 hospitalized patients aged 70 years at the West China Hospital of Sichuan University. Neutrophil and lymphocyte counts were collected within 24 hours after hospital admission. Delirium was assessed on admission and every 48 hours thereafter. We used receiver operating characteristic (ROC) analysis to assess the ability of the NLR to predict delirium. The optimal cut-point value of the NLR was determined as the one with the highest Youden index (sensitivity + specificity - 1). Patients were categorized according to the cut-point value and quartiles of NLR, respectively. We then used logistic regression to estimate the unadjusted and adjusted associations between NLR as a categorical variable and delirium. Results: The optimal cut-point value of NLR for predicting delirium was 3.626 (sensitivity: 75.2%; specificity: 63.4%; Youden index: 0.386). The incidence of delirium was significantly higher in patients with NLR >3.626 than with NLR ≤3.626 (24.5% vs 5.8%; P<0.001). Significantly fewer patients in the first quartile of NLR experienced delirium than in the third (4.3% vs 20.0%; P<0.001) and fourth quartiles (4.3% vs 24.9%; P<0.001). Multivariable logistic regression analysis showed that NLR was independently associated with delirium. Conclusions: NLR is a simple and practical marker that can predict the development of delirium in older hospitalized patients.
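The cut-point selection described in the Methods amounts to scanning candidate thresholds and keeping the one with the highest Youden index. A sketch with hypothetical NLR values and delirium labels (the study's individual-level data are not given):

```python
def best_cutpoint(values, labels):
    """Scan candidate cut-points; return the one maximizing the Youden index."""
    pos = [v for v, l in zip(values, labels) if l == 1]  # delirium
    neg = [v for v, l in zip(values, labels) if l == 0]  # no delirium
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        sens = sum(1 for v in pos if v > cut) / len(pos)   # positives above the cut
        spec = sum(1 for v in neg if v <= cut) / len(neg)  # negatives at or below it
        j = sens + spec - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical NLR values with delirium labels (1 = delirium)
nlr = [1.2, 2.0, 3.0, 3.6, 4.1, 5.5, 6.0, 2.5]
lab = [0,   0,   0,   1,   1,   1,   1,   0]
cut, j = best_cutpoint(nlr, lab)
```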


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 2971-2971
Author(s):  
David Kuo ◽  
Maggie Wei ◽  
Jared Knickelbein ◽  
Karen Armbrust ◽  
Ian Yeung ◽  
...  

Abstract Background/Aim: The aim of this study was to assess the utility of intraocular IL-10 and IL-6 analysis in classifying primary vitreoretinal lymphoma (PVRL) vs. uveitis, using a logistic regression model trained on a single-center retrospective cohort and the previously published ISOLD score, each compared against the IL-10/IL-6 ratio. Methods: Patient diagnoses of PVRL vs. uveitis and associated aqueous and/or vitreous IL-6 and IL-10 levels were retrospectively collected. From these data, cytokine levels were compared between diagnoses with the Mann-Whitney U test, and a logistic regression model was developed to classify PVRL vs. uveitis from aqueous and vitreous IL-6 and IL-10 by nested cross-validation. ROC curves were plotted and AUCs were calculated for the IL-10/IL-6 ratio, the ISOLD score, and our logistic regression model. Optimal cut-offs for each classifier were determined by the maximal Youden index, and sensitivity, specificity, PPV, and NPV were determined for each cut-off. Results: 79 lymphoma (10 aqueous, 69 vitreous) and 84 uveitis patients (19 aqueous, 65 vitreous) between 10/5/1999 and 9/16/2015 were included in the study. IL-6 was higher in uveitis vs. lymphoma patients, while IL-10 was higher in lymphoma vs. uveitis patients (p < 0.01 for all comparisons). For vitreous samples, our logistic regression model achieved an AUC of 98.3%, while ISOLD achieved an AUC of 97.8% and the IL-10/IL-6 ratio achieved an AUC of 96.3%. The optimal cut-offs for our logistic regression model, ISOLD, and the IL-10/IL-6 ratio achieved sensitivity/specificity of 92.7%/100%, 94.2%/96.9%, and 94.2%/95.3%, respectively, corresponding to PPV/NPV of 100%/92.9%, 97%/94%, and 95.6%/93.9%, respectively. For aqueous samples, all three classifiers achieved 100% AUC with 100% sensitivity/specificity. Odds ratios of PVRL vs. uveitis were 0.981 (aqueous) and 0.992 (vitreous) for IL-6 and 1.030 (aqueous) and 1.060 (vitreous) for IL-10 according to our logistic regression model. Conclusion: In this study, logistic regression, as demonstrated by our model and the ISOLD score, showed strong classification performance and generalizability with high sensitivity and specificity. These results, together with logistic regression's ability to improve further with more training data, suggest a promising step forward in intraocular cytokine analysis for the early diagnosis of primary vitreoretinal lymphoma. Additional validation studies, especially with cohorts that have proven challenging for the IL-10/IL-6 ratio, would further elucidate the strengths and weaknesses of this approach. Disclosures No relevant conflicts of interest to declare.
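The reported odds ratios are the exponentiated logistic-regression coefficients, so the direction of each cytokine's effect can be read off directly. A small sketch (the OR values are taken from the abstract; the helper function names are ours):

```python
import math

def odds_ratio(beta):
    """Per-unit odds ratio implied by a logistic-regression coefficient."""
    return math.exp(beta)

def coefficient(or_value):
    """Invert: the coefficient implied by a reported odds ratio."""
    return math.log(or_value)

# The reported vitreous ORs (0.992 for IL-6, 1.060 for IL-10) imply these signs:
beta_il6 = coefficient(0.992)   # negative: higher IL-6 favors uveitis
beta_il10 = coefficient(1.060)  # positive: higher IL-10 favors PVRL
```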


2021 ◽  
Vol 6 (4) ◽  
pp. 295-306
Author(s):  
Ananda B. W. Manage ◽  
Ram C. Kafle ◽  
Danush K. Wijekularathna

In cricket, all-rounders play an important role. A good all-rounder should be able to contribute to the team with both bat and ball as needed. However, these players still have a dominant role by which we categorize them as batting all-rounders or bowling all-rounders. Current practice is to do so mostly by subjective methods. In this study, the authors have explored different machine learning techniques to classify all-rounders into bowling all-rounders or batting all-rounders based on their observed performance statistics. In particular, logistic regression, linear discriminant function, quadratic discriminant function, naïve Bayes, support vector machine, and random forest classification methods were explored. The performance of the classification methods was evaluated using accuracy and area under the ROC curve. While all six methods performed well, logistic regression, linear discriminant function, quadratic discriminant function, and support vector machine showed outstanding performance, suggesting that these methods can be used to develop an automated classification rule for all-rounders in cricket. Given the rising popularity of cricket, and the increasing revenue generated by the sport, such a prediction tool could be of tremendous benefit to decision-makers in the game.
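The linear discriminant function mentioned above projects each player onto the direction w = Sw⁻¹(μₐ - μᵦ), where Sw is the pooled within-class scatter matrix. A self-contained two-feature sketch, with invented (batting average, bowling average) pairs standing in for real player statistics:

```python
def fisher_direction(class_a, class_b):
    """Two-feature Fisher discriminant direction: w = Sw^-1 (mu_a - mu_b)."""
    def mean(pts):
        n = len(pts)
        return [sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n]

    def scatter(pts, mu):
        s = [[0.0, 0.0], [0.0, 0.0]]
        for p in pts:
            d = [p[0] - mu[0], p[1] - mu[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
        return s

    ma, mb = mean(class_a), mean(class_b)
    sa, sb = scatter(class_a, ma), scatter(class_b, mb)
    sw = [[sa[i][j] + sb[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]          # 2x2 inverse by hand
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    return [inv[0][0] * diff[0] + inv[0][1] * diff[1],
            inv[1][0] * diff[0] + inv[1][1] * diff[1]]

# Invented (batting average, bowling average) pairs for each category
batting_ar = [(45.0, 38.0), (43.0, 40.0), (47.0, 39.0)]
bowling_ar = [(25.0, 24.0), (23.0, 25.0), (27.0, 26.0)]
w = fisher_direction(batting_ar, bowling_ar)
```

Players are then classified by which side of a threshold their projection w·x falls on.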


10.2196/20268 ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. e20268
Author(s):  
Adrienne Kline ◽  
Theresa Kline ◽  
Zahra Shakeri Hossein Abad ◽  
Joon Lee

Background Supervised machine learning (ML) is being featured in the health care literature with study results frequently reported using metrics such as accuracy, sensitivity, specificity, recall, or F1 score. Although each metric provides a different perspective on performance, they remain overall measures for the whole sample, discounting the uniqueness of each case or patient. Intuitively, we know that all cases are not equal, but the present evaluative approaches do not take case difficulty into account. Objective A more case-based, comprehensive approach is warranted to assess supervised ML outcomes and forms the rationale for this study. This study aims to demonstrate how item response theory (IRT) can be used to stratify the data based on how difficult each case is to classify, independent of the outcome measure of interest (eg, accuracy). This stratification allows the evaluation of ML classifiers to take the form of a distribution rather than a single scalar value. Methods Two large, public intensive care unit data sets, Medical Information Mart for Intensive Care III and electronic intensive care unit, were used to showcase this method in predicting mortality. For each data set, a balanced sample (n=8078 and n=21,940, respectively) and an imbalanced sample (n=12,117 and n=32,910, respectively) were drawn. A 2-parameter logistic model was used to provide difficulty scores for each case. Several ML algorithms were used in the demonstration to classify cases based on their health-related features: logistic regression, linear discriminant analysis, K-nearest neighbors, decision tree, naive Bayes, and a neural network. Generalized linear mixed model analyses were used to assess the effects of case difficulty strata, ML algorithm, and the interaction between them in predicting accuracy.
Results The results showed significant effects (P<.001) for case difficulty strata, ML algorithm, and their interaction in predicting accuracy and illustrated that all classifiers performed better with easier-to-classify cases and that overall the neural network performed best. Significant interactions suggest that cases that fall in the most arduous strata should be handled by logistic regression, linear discriminant analysis, decision tree, or neural network but not by naive Bayes or K-nearest neighbors. Conventional metrics for ML classification have been reported for methodological comparison. Conclusions This demonstration shows that using the IRT is a viable method for understanding the data that are provided to ML algorithms, independent of outcome measures, and highlights how well classifiers differentiate cases of varying difficulty. This method explains which features are indicative of healthy states and why. It enables end users to tailor the classifier that is appropriate to the difficulty level of the patient for personalized medicine.
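The 2-parameter logistic (2PL) model used above to score case difficulty gives the probability of a correct classification as a function of ability θ, item discrimination a, and item difficulty b. A sketch with illustrative parameter values (not fitted to the study's data):

```python
import math

def p_correct(theta, a, b):
    """2PL item response model: P(correct) = 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# At fixed ability, a harder case (larger b) yields a lower success probability:
easy = p_correct(theta=0.0, a=1.0, b=-1.0)
hard = p_correct(theta=0.0, a=1.0, b=1.5)
```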


Diagnostics ◽  
2020 ◽  
Vol 10 (6) ◽  
pp. 415 ◽  
Author(s):  
Bomi Jeong ◽  
Hyunjeong Cho ◽  
Jieun Kim ◽  
Soon Kil Kwon ◽  
SeungWoo Hong ◽  
...  

This study aims to compare the classification performance of statistical models on highly imbalanced kidney data. The health examination cohort database provided by the National Health Insurance Service in Korea is utilized to build models with various machine learning methods. The glomerular filtration rate (GFR) is used to diagnose chronic kidney disease (CKD). It is calculated using the Modification of Diet in Renal Disease method and classified into five stages (1, 2, 3A and 3B, 4, and 5). The different CKD stages based on the estimated GFR are treated as six classes of the response variable. This study utilizes two representative generalized linear models for classification, namely, multinomial logistic regression (multinomial LR) and ordinal logistic regression (ordinal LR), as well as two machine learning models, namely, random forest (RF) and autoencoder (AE). The classification performance of the four models is compared in terms of accuracy, sensitivity, specificity, precision, and F1-measure. To find the best model for classifying CKD stages correctly, the data are divided into 10 folds with the same proportion of each CKD stage in each fold. Results indicate that RF and AE show better accuracy than the multinomial and ordinal LR models when classifying the response variable. However, when a highly imbalanced dataset is modeled, accuracy alone can distort the actual performance, because accuracy is high even if a statistical model classifies a minority class into a majority class. To address this problem in performance interpretation, we consider not only accuracy from the confusion matrix but also sensitivity, specificity, precision, and F1-measure for each class. To present classification performance as a single value for each model, we calculate the macro-average and micro-weighted values. We conclude that AE is the best model for classifying CKD stages correctly across all performance indices.
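The macro-average and support-weighted summaries mentioned above diverge exactly when classes are imbalanced, which is the distortion the abstract warns about. A sketch with hypothetical per-stage F1 scores and class sizes (not the study's values):

```python
def macro_average(per_class_scores):
    """Unweighted mean over classes: rare CKD stages count as much as common ones."""
    return sum(per_class_scores) / len(per_class_scores)

def weighted_average(per_class_scores, class_counts):
    """Support-weighted mean: dominated by the majority stages."""
    total = sum(class_counts)
    return sum(s * n for s, n in zip(per_class_scores, class_counts)) / total

# Hypothetical per-stage F1 scores and class sizes for six CKD classes
f1 = [0.95, 0.90, 0.60, 0.55, 0.40, 0.30]
n = [5000, 3000, 800, 500, 150, 50]
macro = macro_average(f1)
weighted = weighted_average(f1, n)
```

Here the weighted value looks much better than the macro value only because the easy majority stages dominate it.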


Author(s):  
Peian Hu ◽  
Lei Chen ◽  
Zhengrong Zhou

Abstract Machine learning has been widely used recently in the characterization of tumors. This article aims to explore the feasibility of whole-tumor fat-suppressed (FS) T2WI and ADC features-based least absolute shrinkage and selection operator (LASSO)-logistic predictive models in the differentiation of soft tissue neoplasms (STN). The clinical and MR findings of 160 cases with 161 histologically proven STN were retrospectively reviewed; 75 had diffusion-weighted imaging (DWI, b values of 50, 400, and 800 s/mm2). Cases were divided into benign and malignant groups and further into training (70%) and validation (30%) cohorts. MR FS T2WI and ADC features-based LASSO-logistic models were built and compared. The AUC of the FS T2WI features-based LASSO-logistic regression model for predicting benign vs. malignant lesions was 0.65 and 0.75 for the training and validation cohorts, respectively; the model's sensitivity, specificity, and accuracy in the validation cohort were 55%, 96%, and 76.6%. The AUC of the ADC features-based model was 0.932 and 0.955 for the training and validation cohorts, respectively, with sensitivity, specificity, and accuracy of 83.3%, 100%, and 91.7%. The performance of these models was also validated by decision curve analysis (DCA). The AUC of the whole-tumor ADC features-based LASSO-logistic regression predictive model was larger than that of the FS T2WI features-based model (p = 0.017). Both whole-tumor FS T2WI and ADC features-based LASSO-logistic predictive models can serve as useful tools in the differentiation of STN, with the ADC features-based model performing better.
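Decision curve analysis, used above to validate the models, compares classifiers by net benefit at each threshold probability pt, where NB = TP/n - (FP/n) × pt/(1 - pt). A sketch with hypothetical counts (the study's DCA counts are not given in the abstract):

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit at threshold probability pt, as used in decision curve analysis:
    NB = TP/n - (FP/n) * pt / (1 - pt)."""
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Hypothetical counts at a 20% malignancy threshold for a cohort of 100 cases
nb_model = net_benefit(tp=20, fp=4, n=100, threshold=0.20)
nb_treat_all = net_benefit(tp=24, fp=76, n=100, threshold=0.20)  # everyone positive
```

A model is useful at a given threshold only if its net benefit exceeds both the treat-all and treat-none (NB = 0) strategies.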


1995 ◽  
Vol 3 ◽  
pp. 373-382 ◽  
Author(s):  
M. Buro

This article describes an application of three well-known statistical methods in the field of game-tree search: using a large number of classified Othello positions, feature weights for evaluation functions with a game-phase-independent meaning are estimated by means of logistic regression, Fisher's linear discriminant, and the quadratic discriminant function for normally distributed features. Thereafter, the playing strengths are compared by means of tournaments between the resulting versions of a world-class Othello program. In this application, logistic regression - which is used here for the first time in the context of game playing - leads to better results than the other approaches.

