Machine-learning algorithm that can improve the diagnostic accuracy of septic arthritis of the knee

Knee Surgery Sports Traumatology Arthroscopy ◽

10.1007/s00167-020-06418-2 ◽

2021 ◽

Author(s):

Eun-Seok Choi ◽

Jae Ang Sim ◽

Young Gon Na ◽

Jong-Keun Seon ◽

Hyun Dae Shin

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Confidence Interval ◽

Diagnostic Accuracy ◽

Septic Arthritis ◽

Inflammatory Arthritis ◽

Supervised Machine Learning ◽

Diagnostic Model ◽

Extreme Gradient Boosting ◽

Wbc Count

Abstract Purpose Prompt diagnosis and treatment of septic arthritis of the knee is crucial. Nevertheless, the quality of evidence for the diagnosis of septic arthritis is low. In this study, the authors developed a machine learning-based diagnostic algorithm for septic arthritis of the native knee using clinical data from an emergency department and validated its diagnostic accuracy. Methods Patients (n = 326) who underwent synovial fluid analysis at the emergency department for suspected septic arthritis of the knee were enrolled. Septic arthritis was diagnosed in 164 of the patients (50.3%) using modified Newman criteria. Clinical characteristics of septic and inflammatory arthritis were compared. The area under the receiver-operating characteristic (ROC) curve (AUC) was used to evaluate the efficacy of each variable for the diagnosis of septic arthritis. The dataset was divided into independent training and test sets (comprising 80% and 20%, respectively, of the data). Supervised machine-learning techniques (random forest and eXtreme Gradient Boosting: XGBoost) were applied to develop a diagnostic model using the training dataset. The test dataset was subsequently used to validate the developed model. The ROC curves of the machine-learning model and each variable were compared. Results Synovial white blood cell (WBC) count was significantly higher in septic arthritis than in inflammatory arthritis in the multivariate analysis (P = 0.001). In the ROC comparison analysis, synovial WBC count yielded a significantly higher AUC than all other single variables (P = 0.002). The diagnostic model using the XGBoost algorithm yielded a higher AUC (0.831, 95% confidence interval 0.751–0.923) than synovial WBC count (0.740, 95% confidence interval 0.684–0.791; P = 0.033). The developed algorithm was deployed as a free-access web-based application (www.septicknee.com).
Conclusion The diagnosis of septic arthritis of the knee might be improved using a machine learning-based prediction model. Level of evidence Diagnostic study Level III (Case–control study).
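
The abstract above compares a single variable (synovial WBC count) with a model output by AUC. As a rough illustration of what that statistic measures, the sketch below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The function name `roc_auc` and all data values are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data: 1 = septic arthritis, 0 = inflammatory arthritis.
labels = [1, 1, 1, 0, 0, 0]
wbc = [90, 60, 30, 50, 20, 10]                 # hypothetical single variable
model_prob = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]    # hypothetical model output

auc_wbc = roc_auc(labels, wbc)
auc_model = roc_auc(labels, model_prob)
print(f"single variable AUC = {auc_wbc:.3f}, model AUC = {auc_model:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients; a rank-based formula would be the usual choice for larger data.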


Machine learning associated with respiratory oscillometry: a computer-aided diagnosis system for the detection of respiratory abnormalities in systemic sclerosis

BioMedical Engineering OnLine ◽

10.1186/s12938-021-00865-9 ◽

2021 ◽

Vol 20 (1) ◽

Author(s):

Domingos S. M. Andrade ◽

Luigi Maciel Ribeiro ◽

Agnaldo J. Lopes ◽

Jorge L. M. Amaral ◽

Pedro L. Melo

Keyword(s):

Machine Learning ◽

Systemic Sclerosis ◽

Diagnostic Accuracy ◽

Group Versus ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Control Group ◽

Extreme Gradient Boosting

Abstract Introduction The use of machine learning (ML) methods would improve the diagnosis of respiratory changes in systemic sclerosis (SSc). This paper evaluates the performance of several ML algorithms associated with the respiratory oscillometry analysis to aid in the diagnosis of respiratory changes in SSc. We also identify the best configuration for this task. Methods Oscillometric and spirometric exams were performed in 82 individuals, including controls (n = 30) and patients with systemic sclerosis with normal (n = 22) and abnormal (n = 30) spirometry. Multiple instance classifiers and different supervised machine learning techniques were investigated, including k-Nearest Neighbors (KNN), Random Forests (RF), AdaBoost with decision trees (ADAB), and Extreme Gradient Boosting (XGB). Results and discussion The first experiment of this study showed that the best oscillometric parameter (BOP) was dynamic compliance, which provided moderate accuracy (AUC = 0.77) in the scenario control group versus patients with sclerosis and normal spirometry (CGvsPSNS). In the scenario control group versus patients with sclerosis and altered spirometry (CGvsPSAS), the BOP obtained high accuracy (AUC = 0.94). In the second experiment, the ML techniques were used. In CGvsPSNS, KNN achieved the best result (AUC = 0.90), significantly improving the accuracy in comparison with the BOP (p < 0.01), while in CGvsPSAS, RF obtained the best results (AUC = 0.97), also significantly improving the diagnostic accuracy (p < 0.05). In the third, fourth, fifth, and sixth experiments, different feature selection techniques allowed us to spot the best oscillometric parameters. They resulted in a small increase in diagnostic accuracy in CGvsPSNS (respectively, 0.87, 0.86, 0.82, and 0.84), while in the CGvsPSAS, the best classifier's performance remained the same (AUC = 0.97).
Conclusions Oscillometric principles combined with machine learning algorithms provide a new method for diagnosing respiratory changes in patients with systemic sclerosis. The present study's findings provide evidence that this combination may help in the early diagnosis of respiratory changes in these patients.


A Machine Learning Model for Predicting Unscheduled 72 h Return Visits to the Emergency Department by Patients with Abdominal Pain

Diagnostics ◽

10.3390/diagnostics12010082 ◽

2021 ◽

Vol 12 (1) ◽

pp. 82

Author(s):

Chun-Chuan Hsu ◽

Cheng-CJ Chu ◽

Ching-Heng Lin ◽

Chien-Hsiung Huang ◽

Chip-Jin Ng ◽

...

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Abdominal Pain ◽

Learning Model ◽

Supervised Machine Learning ◽

Support Vector ◽

Research Database ◽

Machine Learning Model ◽

Extreme Gradient Boosting ◽

Return Visits

Seventy-two-hour unscheduled return visits (URVs) by emergency department patients are a key clinical index for evaluating the quality of care in emergency departments (EDs). This study aimed to develop a machine learning model to predict 72 h URVs for ED patients with abdominal pain. Electronic health records data were collected from the Chang Gung Research Database (CGRD) for 25,151 ED visits by patients with abdominal pain and a total of 617 features were used for analysis. We used supervised machine learning models, namely logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGB), and voting classifier (VC), to predict URVs. The VC model achieved more favorable overall performance than other models (AUROC: 0.74; 95% confidence interval (CI), 0.69–0.76; sensitivity, 0.39; specificity, 0.89; F1 score, 0.25). The reduced VC model achieved comparable performance (AUROC: 0.72; 95% CI, 0.69–0.74) to the full models using all clinical features. The VC model exhibited the most favorable performance in predicting 72 h URVs for patients with abdominal pain, both for all-features and reduced-features models. Application of the VC model in the clinical setting after validation may help physicians to make accurate decisions and decrease URVs.
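
A voting classifier combines the outputs of several base models, and the abstract above reports its sensitivity, specificity, and F1 score. The sketch below shows one common variant, soft voting (averaging predicted probabilities), followed by those three metrics at a fixed threshold. Whether the study used soft or hard voting is not stated, so soft voting is an assumption here; the function names, the 0.5 threshold, and all probabilities are hypothetical.

```python
def soft_vote(prob_lists):
    """Average the per-case probabilities produced by several models."""
    n_models = len(prob_lists)
    return [sum(case_probs) / n_models for case_probs in zip(*prob_lists)]


def binary_metrics(labels, probs, threshold=0.5):
    """Sensitivity, specificity, and F1 after thresholding probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if tp + fp + fn else 0.0,
    }


# Made-up toy data: 1 = return visit within 72 h, 0 = no return visit.
labels = [1, 0, 0, 1, 0]
lr_probs = [0.6, 0.2, 0.7, 0.4, 0.1]    # hypothetical logistic regression
rf_probs = [0.8, 0.4, 0.6, 0.3, 0.2]    # hypothetical random forest
xgb_probs = [0.7, 0.3, 0.5, 0.5, 0.3]   # hypothetical gradient boosting

ensemble = soft_vote([lr_probs, rf_probs, xgb_probs])
metrics = binary_metrics(labels, ensemble)
print(metrics)
```

Because F1 depends only on true positives, false positives, and false negatives, it can be low even when specificity is high if the outcome is rare, which is one reason to report all three metrics together.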


Machine Learning Associated With Respiratory Oscillometry: A Computer-Aided Diagnosis System for the Detection of Respiratory Abnormalities in Systemic Sclerosis

10.21203/rs.3.rs-144194/v1 ◽

2021 ◽

Author(s):

Domingos Andrade ◽

Luigi Ribeiro ◽

Agnaldo Lopes ◽

Jorge Amaral ◽

Pedro Lopes de Melo

Keyword(s):

Machine Learning ◽

Systemic Sclerosis ◽

Diagnostic Accuracy ◽

Learning Algorithm ◽

Group Versus ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Control Group ◽

Extreme Gradient Boosting

Abstract Background The use of machine learning (ML) methods would improve the diagnosis of respiratory changes in systemic sclerosis (SSc). This paper evaluates the performance of several ML algorithms associated with the respiratory oscillometry analysis to aid in the diagnosis of respiratory changes in SSc. We also identify the best configuration for this task. Methods Oscillometric and spirometric exams were performed in 82 individuals, including controls (n = 30) and patients with systemic sclerosis with normal (n = 22) and abnormal (n = 30) spirometry. Multiple instance classifiers and different supervised machine learning techniques were investigated, including k-nearest neighbours (KNN), random forests (RF), AdaBoost with decision trees (ADAB), and Extreme Gradient Boosting (XGB). Results and discussion The first experiment of this study showed that the best oscillometric parameter (BOP) was dynamic compliance. In the scenario Control Group versus Patients with Sclerosis and normal spirometry (CGvsPSNS), it provided moderate accuracy (AUC = 0.77). In the scenario Control Group versus Patients with Sclerosis and Altered spirometry (CGvsPSAS), the BOP obtained high accuracy (AUC = 0.94). In the second experiment, the ML techniques were used. In CGvsPSNS, KNN achieved the best result (AUC = 0.90), significantly improving the accuracy in comparison with the BOP (p < 0.01), while in CGvsPSAS, RF obtained the best results (AUC = 0.97), also significantly improving the diagnostic accuracy (p < 0.05). In the third, fourth, fifth, and sixth experiments, the use of different feature selection techniques allowed us to spot the best oscillometric parameters. They all show a small increase in diagnostic accuracy in CGvsPSNS (respectively, 0.87, 0.86, 0.82, and 0.84), while in the CGvsPSAS, the performance of the best classifier remained the same (AUC = 0.97).
Conclusions Oscillometric principles combined with machine learning algorithms provide a new method for the diagnosis of respiratory changes in patients with systemic sclerosis. The findings of the present study provide evidence that this combination may play an important role in the early diagnosis of respiratory changes in these patients.


Development and validation of a difficult laryngoscopy prediction model using machine learning of neck circumference and thyromental height

BMC Anesthesiology ◽

10.1186/s12871-021-01343-4 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jong Ho Kim ◽

Haewon Kim ◽

Ji Su Jang ◽

Sung Mi Hwang ◽

So Young Lim ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Confidence Interval ◽

Neck Circumference ◽

Difficult Laryngoscopy ◽

Gradient Boosting ◽

Test Set ◽

Equal Distribution ◽

Light Gradient ◽

Extreme Gradient Boosting

Abstract Background Predicting a difficult airway is challenging in patients with limited airway evaluation. The aim of this study is to develop and validate a model that predicts difficult laryngoscopy by machine learning of neck circumference and thyromental height as predictors that can be used even for patients with limited airway evaluation. Methods Variables for prediction of difficult laryngoscopy included age, sex, height, weight, body mass index, neck circumference, and thyromental distance. Difficult laryngoscopy was defined as Grade 3 and 4 by the Cormack-Lehane classification. The preanesthesia and anesthesia data of 1677 patients who had undergone general anesthesia at a single center were collected. The data set was randomly split into a training set (80%) and a test set (20%), stratified so that both had an equal distribution of difficult laryngoscopy. Five algorithms (logistic regression, multilayer perceptron, random forest, extreme gradient boosting, and light gradient boosting machine) were trained on the training set. The prediction models were validated on the test set. Results The random forest model performed best (area under receiver operating characteristic curve = 0.79 [95% confidence interval: 0.72–0.86], area under precision-recall curve = 0.32 [95% confidence interval: 0.27–0.37]). Conclusions Machine learning can predict difficult laryngoscopy through a combination of several predictors including neck circumference and thyromental height. The performance of the model can be improved with more data, a new variable and combination of models.
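
The 80%/20% split described above keeps the proportion of difficult laryngoscopy equal in both sets. A minimal sketch of such a stratified split follows; it shuffles the indices of each class separately and takes the same fraction of each for the test set. The function name `stratified_split`, the seed, and the toy class counts are hypothetical and do not reflect the study's data or software.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with the class proportions of
    `labels` preserved, up to rounding, in both parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    train_idx, test_idx = [], []
    for class_indices in by_class.values():
        shuffled = class_indices[:]
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test_idx.extend(shuffled[:n_test])
        train_idx.extend(shuffled[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Made-up toy cohort: 20 difficult (1) and 80 non-difficult (0) cases.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)

test_rate = sum(labels[i] for i in test_idx) / len(test_idx)
train_rate = sum(labels[i] for i in train_idx) / len(train_idx)
print(len(train_idx), len(test_idx), train_rate, test_rate)
```

Stratifying matters most when the outcome is uncommon, since a purely random split could leave the test set with too few positive cases to estimate performance reliably.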


Emergency department admissions during COVID-19: explainable machine learning to characterise data drift and detect emergent health risks

10.1101/2021.05.27.21257713 ◽

2021 ◽

Author(s):

Christopher Duckworth ◽

Francis P Chmiel ◽

Dan K. Burns ◽

Zlatko D Zlatev ◽

Neil M White ◽

...

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Predictive Model ◽

Clinical Outcomes ◽

Health Risks ◽

Concept Drift ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Healthcare Settings ◽

Acute Healthcare

Supervised machine learning algorithms deployed in acute healthcare settings use data describing historical episodes to predict clinical outcomes. Clinical settings are dynamic environments and the underlying data distributions characterising episodes can change with time (a phenomenon known as data drift), and so can the relationship between episode characteristics and associated clinical outcomes (so-called concept drift). We demonstrate how explainable machine learning can be used to monitor data drift in a predictive model deployed within a hospital emergency department. We use the COVID-19 pandemic as an exemplar cause of data drift, which has brought a severe change in operational circumstances. We present a machine learning classifier, trained using pre-COVID-19 data, to identify patients at high risk of admission to hospital during an emergency department attendance. We evaluate our model's performance on attendances occurring pre-pandemic (AUROC 0.856, 95% CI [0.852, 0.859]) and during the COVID-19 pandemic (AUROC 0.826, 95% CI [0.814, 0.837]). We demonstrate two benefits of explainable machine learning (SHAP) for models deployed in healthcare settings: (1) By tracking the variation in a feature's SHAP value relative to its global importance, a complementary measure of data drift is found which highlights the need to retrain a predictive model. (2) By observing the relative changes in feature importance, emergent health risks can be identified.


XGBoost Algorithm Prediction of Critical Care Outcome For Adult Patients Presenting To Emergency Department Using Initial Triage Information (Preprint)

10.2196/preprints.30770 ◽

2021 ◽

Author(s):

Hyoungju Yun ◽

Jinwook Choi ◽

Jeong Ho Park

Keyword(s):

Machine Learning ◽

Blood Pressure ◽

Emergency Department ◽

Critical Care ◽

Adult Patients ◽

Baseline Model ◽

Discriminative Performance ◽

Extreme Gradient Boosting ◽

Lemeshow Test ◽

Care Outcome

BACKGROUND Emergency department (ED) triage, which must classify and prioritize high-risk patients over less urgent ones, continues to be a challenge. OBJECTIVE This study, comprising 80,433 patients, aims to develop a machine learning prediction model of critical care outcome for adult patients using information collected during ED triage and to compare its performance with that of a baseline model using the Korean Triage and Acuity Scale (KTAS). METHODS To predict the need for critical care, we used 13 predictors from triage information: age, gender, mode of ED arrival, time interval between onset and ED arrival, reason for ED visit, chief complaints, systolic blood pressure, diastolic blood pressure, pulse rate, respiratory rate, body temperature, oxygen saturation and level of consciousness. The baseline model with KTAS was developed using logistic regression, and the machine learning models with 13 variables were generated using extreme gradient boosting (XGB) and deep neural network (DNN) algorithms. Discrimination was measured by the area under the receiver operating characteristic curve (AUC). Calibration was evaluated with the Hosmer–Lemeshow test and reclassification with the net reclassification index (NRI). The calibration plot and partial dependence plot were used in the analysis. RESULTS The AUC of the models with the full set of variables (0.833–0.861) was better than that of the baseline model (0.796). The XGB model (AUC 0.861; 95% CI, 0.848–0.874) showed higher discriminative performance than the DNN model (AUC 0.833; 95% CI, 0.819–0.848). The XGB and DNN models showed better reclassification than the baseline model, with a positive net reclassification index. The XGB model was well calibrated (Hosmer–Lemeshow test p > 0.05); however, the DNN showed poor calibration (Hosmer–Lemeshow test p < 0.001). We further interpreted the non-linear association between variables and critical care prediction.
CONCLUSIONS Our study demonstrated that the XGB model, using initial information at ED triage to predict patients in need of critical care, outperformed the conventional model with KTAS.
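
The net reclassification index mentioned above summarises how often a new model moves cases into a more appropriate risk category than a baseline model. The sketch below computes a two-category NRI at a single risk threshold: the net proportion of events moved up plus the net proportion of non-events moved down. The abstract does not say which NRI variant or cut-off was used, so the two-category form, the 0.5 threshold, the function name, and all probabilities are assumptions for illustration only.

```python
def net_reclassification_index(labels, base_probs, new_probs, threshold=0.5):
    """Two-category NRI: (up - down) / events + (down - up) / non-events,
    where 'up' means moving from below to at-or-above the threshold."""
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = sum(1 for y in labels if y == 1)
    n_nonevent = len(labels) - n_event
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need both events and non-events")

    for y, base, new in zip(labels, base_probs, new_probs):
        base_high = base >= threshold
        new_high = new >= threshold
        if new_high and not base_high:      # reclassified upward
            if y == 1:
                up_event += 1
            else:
                up_nonevent += 1
        elif base_high and not new_high:    # reclassified downward
            if y == 1:
                down_event += 1
            else:
                down_nonevent += 1

    event_part = (up_event - down_event) / n_event
    nonevent_part = (down_nonevent - up_nonevent) / n_nonevent
    return event_part + nonevent_part


# Made-up toy data: 1 = needed critical care, 0 = did not.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
baseline = [0.60, 0.40, 0.30, 0.70, 0.60, 0.20, 0.55, 0.10]
new_model = [0.70, 0.60, 0.45, 0.80, 0.30, 0.10, 0.60, 0.20]

nri = net_reclassification_index(labels, baseline, new_model)
print(f"NRI = {nri:.2f}")
```

A positive value means the new model reclassifies more cases in the correct direction than in the wrong one; the index ranges from -2 to 2.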


Using Machine Learning to Predict Invasive Bacterial Infections in Young Febrile Infants Visiting the Emergency Department

Journal of Clinical Medicine ◽

10.3390/jcm10091875 ◽

2021 ◽

Vol 10 (9) ◽

pp. 1875

Author(s):

I-Min Chiu ◽

Chi-Yung Cheng ◽

Wun-Huei Zeng ◽

Ying-Hsien Huang ◽

Chun-Hung Richard Lin

Keyword(s):

Machine Learning ◽

Emergency Department ◽

Bacterial Infections ◽

Clinical Symptoms ◽

Learning Algorithm ◽

Gradient Boosting ◽

P Value ◽

Young Infants ◽

Extreme Gradient Boosting ◽

Sensitivity Level

Background: The aim of this study was to develop and evaluate a machine learning (ML) model to predict invasive bacterial infections (IBIs) in young febrile infants visiting the emergency department (ED). Methods: This retrospective study was conducted in the EDs of three medical centers across Taiwan from 2011 to 2018. We included patients aged 0–60 days who visited the ED with clinical symptoms of fever. We developed three different ML algorithms, including logistic regression (LR), support vector machine (SVM), and extreme gradient boosting (XGBoost), and compared their performance at predicting IBIs with that of a previously validated scoring system (IBI score). Results: During the study period, 4211 patients were included, of whom 126 (3.1%) had IBI. After feature selection, eight, five, and seven features were used in the LR, SVM, and XGBoost models, respectively. The ML models achieved a better AUROC value when predicting IBIs in young infants compared with the IBI score (LR: 0.85 vs. SVM: 0.84 vs. XGBoost: 0.85 vs. IBI score: 0.70, p-value < 0.001). Using a cost-sensitive learning algorithm, all ML models showed better specificity in predicting IBIs at a 90% sensitivity level compared to an IBI score > 2 (LR: 0.59 vs. SVM: 0.60 vs. XGBoost: 0.57 vs. IBI score > 2: 0.43, p-value < 0.001). Conclusions: All ML models developed in this study outperformed the traditional scoring system in stratifying low-risk febrile infants at the standardized sensitivity level.
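
Comparing models "at a 90% sensitivity level" means choosing, for each model, the score threshold that flags at least 90% of true cases and then reading off the specificity at that threshold. The sketch below shows one way to do this. It covers only the threshold selection, not the cost-sensitive training the abstract mentions, and the function name and all scores are hypothetical.

```python
import math


def specificity_at_sensitivity(labels, scores, target_sensitivity=0.90):
    """Pick the highest threshold whose sensitivity is at least the target
    (cases with score >= threshold are flagged) and return
    (threshold, specificity)."""
    pos = sorted((s for y, s in zip(labels, scores) if y == 1), reverse=True)
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    # Smallest number of positives that must be flagged to reach the target;
    # the small epsilon guards against floating-point overshoot in the product.
    k = max(1, math.ceil(target_sensitivity * len(pos) - 1e-9))
    threshold = pos[k - 1]

    true_negatives = sum(1 for s in neg if s < threshold)
    return threshold, true_negatives / len(neg)


# Made-up toy scores: 10 infants with IBI (1) and 10 without (0).
labels = [1] * 10 + [0] * 10
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.65, 0.60, 0.50, 0.40, 0.10,
          0.45, 0.35, 0.30, 0.30, 0.25, 0.20, 0.20, 0.15, 0.10, 0.05]

threshold, specificity = specificity_at_sensitivity(labels, scores)
print(f"threshold = {threshold}, specificity = {specificity}")
```

Fixing sensitivity first reflects the clinical priority in this setting: missing an invasive infection is costlier than over-investigating a low-risk infant, so models are compared on how many unnecessary work-ups they avoid at the same miss rate.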


Machine learning deciphers structural features of RNA duplexes measured with solution X-ray scattering

IUCrJ ◽

10.1107/s2052252520008830 ◽

2020 ◽

Vol 7 (5) ◽

pp. 870-880

Author(s):

Yen-Lin Chen ◽

Lois Pollack

Keyword(s):

Machine Learning ◽

Structural Information ◽

Structural Parameters ◽

Supervised Machine Learning ◽

Scattering Data ◽

Length Scales ◽

X Ray ◽

X Ray Scattering ◽

Extreme Gradient Boosting ◽

Ray Scattering

Macromolecular structures can be determined from solution X-ray scattering. Small-angle X-ray scattering (SAXS) provides global structural information on length scales of 10s to 100s of Ångstroms, and many algorithms are available to convert SAXS data into low-resolution structural envelopes. Extension of measurements to wider scattering angles (WAXS or wide-angle X-ray scattering) can sharpen the resolution to below 10 Å, filling in structural details that can be critical for biological function. These WAXS profiles are especially challenging to interpret because of the significant contribution of solvent in addition to solute on these smaller length scales. Based on training with molecular dynamics generated models, the application of extreme gradient boosting (XGBoost) is discussed, which is a supervised machine learning (ML) approach to interpret features in solution scattering profiles. These ML methods are applied to predict key structural parameters of double-stranded ribonucleic acid (dsRNA) duplexes. Duplex conformations vary with salt and sequence and directly impact the foldability of functional RNA molecules. The strong structural periodicities in these duplexes yield scattering profiles with rich sets of features at intermediate-to-wide scattering angles. In the ML models, these profiles are treated as 1D images or features. These ML models identify specific scattering angles, or regions of scattering angles, which correspond with and successfully predict distinct structural parameters. Thus, this work demonstrates that ML strategies can integrate theoretical molecular models with experimental solution scattering data, providing a new framework for extracting highly relevant structural information from solution experiments on biological macromolecules.


Identification of five important genes to predict GBM subtypes

Neuro-Oncology Advances ◽

10.1093/noajnl/vdab144 ◽

2021 ◽

Author(s):

Yang Tang ◽

Maleeha A Qazi ◽

Kevin R Brown ◽

Nicholas Mikolajewicz ◽

Jason Moffat ◽

...

Keyword(s):

Machine Learning ◽

Gene Signature ◽

Supervised Machine Learning ◽

Gradient Boosting ◽

Tissue Cell ◽

Learning Approach ◽

Gene Set ◽

Primary Brain Tumour ◽

Extreme Gradient Boosting ◽

Machine Learning Approach

Abstract Background Glioblastoma (GBM), the most common and aggressive primary brain tumour in adults, has been classified into three subtypes: classical, mesenchymal and proneural. While the original classification relied on an 840-gene set, further clarification on true GBM subtypes uses a 150-gene signature to accurately classify GBM into the three subtypes. We asked whether a machine learning approach could be used to identify a smaller gene set to accurately predict GBM subtype. Methods Using a supervised machine learning approach, extreme gradient boosting (XGBoost), we developed a classifier to predict the three subtypes of GBM: classical, mesenchymal and proneural. We tested the classifier on in-house GBM tissue, cell lines and xenograft samples to predict their subtype. Results We identified the five most important genes for characterizing the three subtypes based on genes that often exhibited high Importance Scores in our XGBoost analyses. On average, this approach achieved 80.12% accuracy in predicting these three subtypes of GBM. Furthermore, we applied our five-gene classifier to successfully predict the subtype of GBM samples at our centre. Conclusion Our five-gene classifier is the smallest classifier to date that can predict GBM subtypes with high accuracy, which could facilitate the future development of a five-gene subtype diagnostic biomarker for routine assays in GBM samples.


An efficient mixture of deep and machine learning models for COVID-19 diagnosis in chest X-ray images

PLoS ONE ◽

10.1371/journal.pone.0242535 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0242535

Author(s):

Dingding Wang ◽

Jiaqing Mo ◽

Gang Zhou ◽

Liang Xu ◽

Yajun Liu

Keyword(s):

Machine Learning ◽

Diagnostic Accuracy ◽

Human Life ◽

Diagnostic Model ◽

Learning Models ◽

X Ray ◽

Machine Learning Classification ◽

X Ray Imaging ◽

Chest X Ray ◽

Sensitivity Specificity

A newly emerged coronavirus (COVID-19) seriously threatens human life and health worldwide. In coping with and fighting against COVID-19, the most critical step is to effectively screen and diagnose infected patients. Among the available tools, chest X-ray imaging is a valuable diagnostic method. The use of computer-aided diagnosis to screen X-ray images of COVID-19 cases can provide experts with auxiliary diagnostic suggestions, which can reduce their burden to a certain extent. In this study, we first used conventional transfer learning with five pre-trained deep learning models, among which the Xception model performed relatively well, with diagnostic accuracy reaching 96.75%. To further improve the diagnostic accuracy, we propose an efficient diagnostic method that uses a combination of deep features and machine learning classification. It implements an end-to-end diagnostic model. The proposed method was tested on two datasets and performed exceptionally well on both of them. We first evaluated the model on 1102 chest X-ray images. The experimental results show that the diagnostic accuracy of Xception + SVM is as high as 99.33%. Compared with the baseline Xception model, the diagnostic accuracy is improved by 2.58 percentage points. The sensitivity, specificity and AUC of this model reached 99.27%, 99.38% and 99.32%, respectively. To further illustrate the robustness of our method, we also tested our proposed model on another dataset, where it also achieved good results. Compared with related research, our proposed method has higher classification accuracy and efficient diagnostic performance. Overall, the proposed method substantially advances the current radiology-based methodology and can be a very helpful tool for clinical practitioners and radiologists in the diagnosis and follow-up of COVID-19 cases.
