Model with the GBDT for colorectal adenoma risk diagnosis

2019 ◽  
Vol 14 ◽  
Author(s):  
Junbo Gao ◽  
Lifeng Zhang ◽  
Gaiqing Yu ◽  
Guoqiang Qu ◽  
Yanfeng Li ◽  
...  

Background and objective: Colorectal cancer (CRC) is a common malignant tumor of the digestive system and is associated with high morbidity and mortality. Early prediction of colorectal adenoma (CRA), a precancerous disease in most CRC patients, provides an opportunity to plan appropriate prevention, early diagnosis, and treatment. We aimed to build a machine learning model to predict CRA that could assist physicians in identifying high-risk patients and making informed choices to prevent CRC. Methods: Patients who had undergone a colonoscopy at the Sixth People Hospital of Shanghai, China, from July 2018 to November 2018 were asked to fill out a questionnaire. A classification model based on the gradient boosting decision tree (GBDT) was developed to predict CRA. This model was compared with three other models, namely random forest (RF), support vector machine (SVM), and logistic regression (LR). The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. Results: Among the 245 included patients, 65 had CRA. The AUCs of GBDT, RF, SVM, and LR with 10-fold cross-validation were 0.8131, 0.74, 0.769, and 0.763, respectively. We also built an online prediction service, the CRA Inference System, to put the proposed solution into practice for patients with CRA. Conclusion: We developed and compared four classification models for CRA prediction, and the GBDT model showed the highest performance. Implementing a GBDT model for screening can reduce time and monetary costs and help physicians identify high-risk groups for primary prevention.
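The evaluation described above rests on two building blocks: splitting the cohort into 10 folds and scoring each held-out fold by AUC. The following is a minimal standard-library sketch of both, not the authors' implementation. The function names `roc_auc` and `k_fold_indices` and the toy scores are illustrative assumptions; the AUC is computed in its rank (Mann-Whitney) form, and the folds are plain contiguous blocks, without the shuffling or stratification a real study would likely use.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy scores (hypothetical, not study data).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)

# Fold sizes for a cohort of 245 patients split 10 ways.
fold_sizes = [len(fold) for fold in k_fold_indices(245, 10)]
```

In a cross-validated comparison, each model would be fitted on nine folds, scored with `roc_auc` on the remaining fold, and the ten fold-level AUCs averaged.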

Stroke ◽  
2021 ◽  
Author(s):  
Murad Megjhani ◽  
Kalijah Terilli ◽  
Miriam Weiss ◽  
Jude Savarraj ◽  
Li Hui Chen ◽  
...  

Background and Purpose: Delayed cerebral ischemia (DCI) after aneurysmal subarachnoid hemorrhage negatively impacts long-term recovery but is often detected too late to prevent damage. We aim to develop hourly risk scores using routinely collected clinical data to detect DCI. Methods: A DCI classification model was trained using vital sign measurements (heart rate, blood pressure, respiratory rate, and oxygen saturation) and demographics routinely collected for clinical care. Twenty-two time-varying physiological measures were computed, including the mean, SD, and cross-correlation of the heart rate time series with each of the other vitals. Classification was achieved using an ensemble approach with L2-regularized logistic regression, random forest, and support vector machine models. Classifier performance was determined by area under the receiver operating characteristic curves and confusion matrices. Hourly DCI risk scores were generated as the posterior probability at time t using the ensemble classifier on cohorts recruited at 2 external institutions (n=38 and 40). Results: Three hundred ten patients were included in the training model (median, 54 years old [interquartile range, 45–65]; 80.2% women, 28.4% Hunt and Hess scale 4–5, 38.7% Modified Fisher Scale 3–4); 101 (33%) developed DCI with a median onset day 6 (interquartile range, 5–8). Classification performance before DCI onset was an area under the receiver operating characteristic curve of 0.83 (interquartile range, 0.76–0.83). Risk scores applied to external institution datasets correctly predicted 64% and 91% of DCI events as early as 12 hours before clinical detection, with 2.7 and 1.6 true alerts for every false alert. Conclusions: An hourly risk score for DCI derived from routine vital signs may have the potential to alert clinicians to DCI, which could reduce neurological injury.
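The physiological measures named above (mean, SD, and cross-correlation of heart rate with another vital sign) can be illustrated with a short standard-library sketch. This is an assumption-laden illustration, not the study's feature pipeline: the function names, the choice of sample SD, the use of the Pearson coefficient at a fixed lag as the "cross-correlation", and the toy hourly values are all hypothetical.

```python
import math


def mean(xs):
    return sum(xs) / len(xs)


def sample_sd(xs):
    """Sample standard deviation (n - 1 in the denominator)."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def cross_correlation(xs, ys, lag=0):
    """Pearson correlation between xs[t] and ys[t + lag] for equal-length series."""
    if lag >= 0:
        a, b = xs[:len(xs) - lag], ys[lag:]
    else:
        a, b = xs[-lag:], ys[:len(ys) + lag]
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    denom = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    if denom == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / denom


# Toy hourly values (hypothetical, not patient data).
heart_rate = [70.0, 72.0, 75.0, 80.0, 78.0]
resp_rate = [14.0, 15.0, 16.0, 18.0, 17.0]

features = {
    "hr_mean": mean(heart_rate),
    "hr_sd": sample_sd(heart_rate),
    "hr_rr_xcorr": cross_correlation(heart_rate, resp_rate),
}
```

Repeating the last computation for each remaining vital sign, and over a sliding time window, would yield a time-varying feature vector of the kind the abstract describes.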


2015 ◽  
Vol 28 (2) ◽  
pp. 189
Author(s):  
Ana Salselas ◽  
Inês Pestana ◽  
Francisco Bischoff ◽  
Mariana Guimarães ◽  
Joaquim Aguiar Andrade

Introduction: Pregnant women with thromboembolic diseases, previous thrombotic episodes, or a family history of thrombophilia were followed in a multidisciplinary Obstetrics/Hematology consultation at Centro Hospitalar São João EPE, Porto, Portugal. A risk stratification scale was used to evaluate and medicate these women. Purposes: The aim of this study was to validate a risk stratification scale and thromboprophylaxis protocol by comparing it with a similar scale developed and published by Sarig. Material and Methods: We compared the distribution by risk group obtained by applying the two scales to pregnant women followed in the Centro Hospitalar São João consultation; the sensitivity and specificity of each scale (DeLong test applied to receiver operating characteristic curves); and the outcomes of pregnancies followed at Hospital São João, Porto, Portugal. Results: According to our Hema-Obs risk stratification scale, 29% were allocated to the low-risk, 47% to the high-risk, and 24% to the very-high-risk group. According to the Galit Sarig risk stratification scale, 24% were considered low-risk, 53% moderate-risk, 16% high-risk, and 7% very-high-risk. In our study we observed 9% spontaneous abortions, compared with 18% in the Galit Sarig cohort. Applying receiver operating characteristic curves to both risk stratification scales gave areas under the curve of 58.8% for our Hema-Obs scale and 38.7% for the Galit Sarig scale, with a DeLong test significance of p = 0.0006. Conclusions: We concluded that the Hema-Obs risk stratification scale is an effective support for clinical monitoring of therapeutic strategies.


2020 ◽  
Vol 9 (2) ◽  
pp. 54-59
Author(s):  
Rajesh Pandey ◽  
Rahul Pathak ◽  
Arun Gnawali ◽  
Prem Krishna Khadga ◽  
Sashi Sharma ◽  
...  

Introduction: Non-invasive assessment of esophageal varices (EVs) may reduce endoscopic burden and cost. This study aimed to evaluate the diagnostic accuracy of non-invasive fibrosis scores (AAR, APRI, FIB-4, King, and Lok scores) for the prediction of varices in liver cirrhosis. Methods: This prospective study included 100 liver cirrhosis patients who underwent screening endoscopy for EVs. AAR, APRI, FIB-4, King, and Lok scores were assessed. Receiver operating characteristic (ROC) curves were plotted to measure and compare the performance of each score for predicting EVs and to obtain the corresponding optimal prediction value. Results: Of the 100 patients, 70 were male and 30 were female, with a mean age of 54.05±11.58 years. Esophageal varices were found in 77 patients, of whom 58.44% had high-risk varices. Platelet count and the non-invasive fibrosis scores APRI, FIB-4, Lok, and King were able to discriminate between patients with and without varices. Based on the area under the receiver operating characteristic curve (AUROC), these scores had low to moderate diagnostic accuracy for the presence of EVs and of high-risk EVs, with the APRI score having the highest AUROC (0.77 and 0.70, respectively). At a cutoff value > 1.4, the APRI score had 90.9% sensitivity, 60.9% specificity, and 84% diagnostic accuracy in predicting the presence of varices, while at a cutoff value > 2.02 it had 84.4% sensitivity, 45.5% specificity, and 63% diagnostic accuracy in predicting the presence of high-risk varices. Conclusion: APRI, AAR, FIB-4, King, and Lok scores had low to moderate diagnostic accuracy in predicting the presence of varices in liver cirrhosis. The APRI score can help select patients for endoscopy but cannot replace endoscopy for esophageal varices screening.
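Sensitivity, specificity, and diagnostic accuracy at a score cutoff, as reported above, all follow from counting the four cells of a confusion matrix. The sketch below shows that arithmetic under stated assumptions: `evaluate_cutoff` is a hypothetical helper, a score strictly greater than the cutoff is treated as a positive prediction (matching the "> 1.4" wording), and the toy scores are invented rather than taken from the study.

```python
def evaluate_cutoff(scores, has_condition, cutoff):
    """Count the confusion-matrix cells for the rule 'score > cutoff'
    and derive sensitivity, specificity, and overall accuracy."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, has_condition):
        predicted = score > cutoff
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),   # detected among those with the condition
        "specificity": tn / (tn + fp),   # cleared among those without it
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy scores and endoscopy findings (hypothetical, not study data).
toy_scores = [0.6, 1.2, 1.6, 0.9, 1.8, 2.5, 3.1, 2.2]
toy_varices = [False, False, False, True, True, True, True, True]

result = evaluate_cutoff(toy_scores, toy_varices, cutoff=1.4)
```

Sweeping `cutoff` over all observed score values and plotting sensitivity against 1 − specificity traces the ROC curve from which an optimal cutoff is usually chosen.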


2021 ◽  
Vol 62 (03) ◽  
pp. e180-e192
Author(s):  
Claudio Díaz-Ledezma ◽  
David Díaz-Solís ◽  
Raúl Muñoz-Reyes ◽  
Jonathan Torres Castro

Abstract Introduction: Predicting the length of hospital stay after elective total hip arthroplasty (THA) is crucial in the perioperative assessment of patients and plays a decisive role from an operational and economic standpoint. Internationally, big data and artificial intelligence have been used to carry out prognostic assessments of this kind. The objective of the present study is to develop and validate, using machine learning, a tool capable of predicting the length of hospital stay of Chilean patients older than 65 years undergoing THA for osteoarthritis. Material and Methods: Using the anonymized electronic hospital discharge records of the Departamento de Estadísticas e Información de Salud (DEIS), data were obtained for 8,970 hospital discharges of patients who underwent THA for osteoarthritis between 2016 and 2018. In total, 15 variables available in the DEIS, in addition to the poverty rate of the patient's commune of origin, were included to predict the probability that a patient would have a shortened (< 3 days) or prolonged (> 3 days) stay after surgery. Using machine learning techniques, 8 prediction algorithms were trained on 80% of the sample. The remaining 20% was used to validate the predictive capabilities of the models created from the algorithms. The optimization metric used to evaluate and rank the models was the area under the receiver operating characteristic curve (AUC-ROC), which reflects how well a model can distinguish between two groups. Results: The XGBoost algorithm achieved the best performance, with a mean AUC-ROC of 0.86 (standard deviation [SD]: 0.0087). In second place, the linear support vector machine (SVM) algorithm achieved an AUC-ROC of 0.85 (SD: 0.0086). The relative importance of the explanatory variables showed that region of residence, health service, the health facility where the patient underwent surgery, and the modality of care are the variables that most determine a patient's length of stay. Discussion: The present study developed machine learning algorithms based on freely accessible Chilean big data, and succeeded in developing and validating a tool that shows adequate discriminatory capacity to predict the probability of shortened versus prolonged hospital stay in older adults undergoing THA for osteoarthritis. Conclusion: The algorithms created using machine learning make it possible to predict the length of hospital stay in Chilean patients undergoing elective total hip arthroplasty.


2022 ◽  
Vol 12 ◽  
Author(s):  
Olivier Beauchet ◽  
Liam A. Cooper-Brown ◽  
Joshua Lubov ◽  
Gilles Allali ◽  
Marc Afilalo ◽  
...  

Purpose: The Emergency Room Evaluation and Recommendation (ER2) is an application in the electronic medical file of patients visiting the Emergency Department (ED) of the Jewish General Hospital (JGH; Montreal, Quebec, Canada). It screens for older ED visitors at high risk of undesirable events. The aim of this study is to examine the performance criteria (i.e., sensitivity, specificity, positive predictive value [PPV], negative predictive value [NPV], positive likelihood ratio [LR+], negative likelihood ratio [LR-] and area under the receiver operating characteristic curve [AUROC]) of the ER2 high-risk level and its “temporal disorientation” item alone to screen for major neurocognitive disorders in older ED visitors at the JGH. Methods: Based on a cross-sectional design, 999 older adults (age 84.9 ± 5.6, 65.1% female) visiting the ED of the JGH were selected from the ER2 database. ER2 was completed upon the patients' arrival at the ED. The outcomes were ER2's high-risk level, the answer to ER2's temporal disorientation item (present vs. absent), and the diagnosis of major neurocognitive disorders (yes vs. no), which was confirmed when it was present in a letter or other files signed by a physician. Results: The sensitivities of both ER2's high-risk level and the temporal disorientation item were high (≥0.91). Specificity, PPV, LR+, and AUROC were higher for the temporal disorientation item than for ER2's high-risk level, whereas the highest sensitivity, LR-, and NPV were obtained with the ER2 high-risk level. Both areas under the receiver operating characteristic curve were high (0.71 for ER2's high-risk level and 0.82 for the ER2 temporal disorientation item). The odds ratios (OR) of ER2's high-risk level and of the temporal disorientation item for the diagnosis of major neurocognitive disorders were positive and significant, with all OR above 18, the highest OR being reported for the temporal disorientation item in the unadjusted model [OR = 26.4 with 95% confidence interval (CI) = 17.7–39.3]. Conclusion: Our results suggest that ER2, and especially its temporal disorientation item, may be used to screen for major neurocognitive disorders in older ED users.
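The performance criteria listed in this abstract (sensitivity, specificity, PPV, NPV, LR+, LR-, and the unadjusted odds ratio) are all functions of one 2×2 table. The sketch below spells out those formulas; `screening_metrics` is a hypothetical helper, the counts are invented for illustration, and no guard is included for empty cells, which would cause a division by zero.

```python
def screening_metrics(tp, fp, fn, tn):
    """Standard screening-test measures from the cells of a 2x2 table.
    Assumes every denominator is non-zero."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),                      # positive predictive value
        "npv": tn / (tn + fn),                      # negative predictive value
        "lr_pos": sensitivity / (1 - specificity),  # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,  # negative likelihood ratio
        "odds_ratio": (tp * tn) / (fp * fn),        # unadjusted diagnostic odds ratio
    }


# Toy counts (hypothetical, not study data): 100 with and 100 without the diagnosis.
metrics = screening_metrics(tp=90, fp=30, fn=10, tn=70)
```

One identity worth noting is that the unadjusted odds ratio equals LR+ divided by LR-, so a test with a high LR+ and a low LR- necessarily shows a large odds ratio.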


2014 ◽  
Vol 120 (5) ◽  
pp. 1168-1181 ◽  
Author(s):  
Daryl J. Kor ◽  
Ravi K. Lingineni ◽  
Ognjen Gajic ◽  
Pauline K. Park ◽  
James M. Blum ◽  
...  

Abstract Background: Acute respiratory distress syndrome (ARDS) remains a serious postoperative complication. Although ARDS prevention is a priority, the inability to identify patients at risk for ARDS remains a barrier to progress. The authors tested and refined the previously reported surgical lung injury prediction (SLIP) model in a multicenter cohort of at-risk surgical patients. Methods: This is a secondary analysis of a multicenter, prospective cohort investigation evaluating high-risk patients undergoing surgery. Preoperative ARDS risk factors and risk modifiers were evaluated for inclusion in a parsimonious risk-prediction model. Multiple imputation and domain analysis were used to facilitate development of a refined model, designated SLIP-2. Area under the receiver operating characteristic curve and the Hosmer–Lemeshow goodness-of-fit test were used to assess model performance. Results: Among 1,562 at-risk patients, ARDS developed in 117 (7.5%). Nine independent predictors of ARDS were identified: sepsis, high-risk aortic vascular surgery, high-risk cardiac surgery, emergency surgery, cirrhosis, admission location other than home, increased respiratory rate (20 to 29 and ≥30 breaths/min), Fio2 greater than 35%, and Spo2 less than 95%. The original SLIP score performed poorly in this heterogeneous cohort with baseline risk factors for ARDS (area under the receiver operating characteristic curve [95% CI], 0.56 [0.50 to 0.62]). In contrast, SLIP-2 score performed well (area under the receiver operating characteristic curve [95% CI], 0.84 [0.81 to 0.88]). Internal validation indicated similar discrimination, with an area under the receiver operating characteristic curve of 0.84. Conclusions: In this multicenter cohort of patients at risk for ARDS, the SLIP-2 score outperformed the original SLIP score. If validated in an independent sample, this tool may help identify surgical patients at high risk for ARDS.


2020 ◽  
Vol 12 (1) ◽  
Author(s):  
Sophie Lemmens ◽  
Toon Van Craenendonck ◽  
Jan Van Eijgen ◽  
Lies De Groef ◽  
Rose Bruffaerts ◽  
...  

Abstract Introduction The eye offers potential for the diagnosis of Alzheimer’s disease (AD) with retinal imaging techniques being explored to quantify amyloid accumulation and aspects of neurodegeneration. To assess these changes, this proof-of-concept study combined hyperspectral imaging and optical coherence tomography to build a classification model to differentiate between AD patients and controls. Methods In a memory clinic setting, patients with a diagnosis of clinically probable AD (n = 10) or biomarker-proven AD (n = 7) and controls (n = 22) underwent non-invasive retinal imaging with an easy-to-use hyperspectral snapshot camera that collects information from 16 spectral bands (460–620 nm, 10-nm bandwidth) in one capture. The individuals were also imaged using optical coherence tomography to assess retinal nerve fiber layer (RNFL) thickness. Dedicated image preprocessing analysis was followed by machine learning to discriminate between both groups. Results Hyperspectral data and RNFL thickness data were used in a linear discriminant classification model to discriminate between AD patients and controls. Nested leave-one-out cross-validation resulted in a fair accuracy, providing an area under the receiver operating characteristic curve of 0.74 (95% confidence interval [0.60–0.89]). Inner loop results showed that the inclusion of the RNFL features resulted in an improvement of the area under the receiver operating characteristic curve: for the most informative region assessed, the average area under the receiver operating characteristic curve was 0.70 (95% confidence interval [0.55, 0.86]) without and 0.79 (95% confidence interval [0.65, 0.93]) with the RNFL features. The robust statistics used in this study reduce the risk of overfitting and partly compensate for the limited sample size.
Conclusions This study in a memory-clinic-based cohort supports the potential of hyperspectral imaging and suggests an added value of combining retinal imaging modalities. Standardization and longitudinal data on fully amyloid-phenotyped cohorts are required to elucidate the relationship between retinal structure and cognitive function and to evaluate the robustness of the classification model.
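Nested leave-one-out cross-validation, as named in this abstract, holds out one subject for the final performance estimate and then runs a second leave-one-out loop over the remaining subjects for model or feature selection. The sketch below generates only the index structure of such a scheme. The function name is hypothetical, and how the study actually used its inner loop is not specified beyond what the abstract states.

```python
def nested_leave_one_out(n_samples):
    """Yield (outer_train, outer_test, inner_splits) for each held-out sample.

    inner_splits is a list of (inner_train, inner_validation) pairs built
    only from outer_train, so the outer test sample never influences
    model selection."""
    for outer_test in range(n_samples):
        outer_train = [i for i in range(n_samples) if i != outer_test]
        inner_splits = [
            ([i for i in outer_train if i != inner_val], inner_val)
            for inner_val in outer_train
        ]
        yield outer_train, outer_test, inner_splits


# With 39 subjects (10 + 7 + 22 in the abstract) there are 39 outer folds,
# each containing 38 inner folds.
outer_folds = list(nested_leave_one_out(39))
n_outer = len(outer_folds)
n_inner = len(outer_folds[0][2])
```

Keeping selection inside the inner loop is what lets the outer-loop estimate remain an honest measure of performance on unseen subjects, which matters most when the sample is as small as it is here.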


2021 ◽  
Vol 6 (1) ◽  
pp. 1-8
Author(s):  
Sunarti Sunarti

This study discusses classification using the C4.5 algorithm to determine loan eligibility at the employee cooperative of Apartemen Senayan Jakarta. The problem of determining loan eligibility severely disrupts the cooperative's cash flow. The C4.5 algorithm is used to support decisions on loan eligibility at the cooperative, so that its levels of accuracy and precision can be established. The data were analyzed using RapidMiner, an open-source-based software application. The aim of the study is to determine loan eligibility using the C4.5 algorithm in order to provide a solution for loan eligibility decisions at the employee cooperative of Apartemen Senayan Jakarta. The attributes used are marital status, gender, work location, discipline, employment status, loan ceiling, and residence status, with the decision being eligible or not eligible. The attribute with the greatest influence on loan eligibility is discipline. In testing, the resulting model had an accuracy of 85.19%, and the ROC (Receiver Operating Characteristic) analysis on the training data gave a value of 0.915, at the "excellent classification" diagnostic level. The resulting decision tree model can improve accuracy in determining the eligibility of loan applications. Keywords: employee cooperative, loan classification, C4.5 algorithm, RapidMiner
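C4.5 chooses the attribute to split on by gain ratio, which is the information gain of the split divided by the entropy of the attribute's own value distribution. The sketch below computes that criterion for a single categorical attribute. It is an illustration of the splitting rule only, not of RapidMiner's implementation, and the attribute values and labels are invented.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(attribute_values, labels):
    """Information gain of splitting on the attribute, divided by split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(attribute_values)
    # An attribute with a single value cannot split the data.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy records (hypothetical, not the cooperative's data).
discipline = ["high", "high", "low", "low"]
gender = ["f", "m", "f", "m"]
decision = ["eligible", "eligible", "not eligible", "not eligible"]

ratio_discipline = gain_ratio(discipline, decision)
ratio_gender = gain_ratio(gender, decision)
```

In this toy data the discipline attribute separates the two decisions perfectly while gender carries no information, so a C4.5-style tree would split on discipline first.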


2020 ◽  
Author(s):  
Min-Tsun Liao ◽  
Chih-Chieh Yu ◽  
Lian-Yu Lin ◽  
Ke-Han Pan ◽  
Tsung-Hsien Tsai ◽  
...  

BACKGROUND Atrial fibrillation (AF) is associated with an increased risk of stroke, heart failure, and all-cause mortality. The electrocardiogram (ECG)–based strategy of screening for AF has some limitations. Photoplethysmography (PPG) is used in AF detection algorithms and allows passive and continuous monitoring by modern wearable devices. OBJECTIVE The objective of this study was to investigate the following: (1) whether quantitatively analyzing wrist PPG waveforms can clearly distinguish AF from sinus rhythm and (2) to determine the appropriate data length of the PPG for feature extraction to optimize the PPG analytics program for AF detection. METHODS Continuous waveforms of ECG through an electrophysiology recording system and PPG signals through a wrist–worn smartwatch were simultaneously collected from patients undergoing catheter ablation or electrical cardioversion for AF. The PPG features (temporal, spectral, or morphological) were extracted from 10, 25, 40, or 80 heartbeats of split segments. Machine learning with a support vector machine (SVM) approach was used for detecting AF. The receiver operating characteristic (ROC) curves were determined to evaluate the diagnostic accuracy. RESULTS A total of 116 patients were evaluated. The mean age was 59.6±11.4 years and 32.8% were women. We collected and annotated more than 117 hours of PPG waveforms. A total of 6478 and 3957 segments of 25-beat pulse-to-pulse interval (PPI) were annotated as AF and sinus rhythm, respectively. A total of eight features were extracted to distinguish AF, including the PPI standard deviation (SD), PPI root-mean-squared standard deviation (RMSSD), Shannon Entropy with bin = 10, 100, 1000 (SE10, SE100, SE1000), moving average of 3 PPI SD, moving SD of 3 PPI RMSSD, and moving SD of maximum FFT frequency in 3 PPI. 
The eight PPG features extracted from 25-PPI segments achieved a test AUC (area under the receiver operating characteristic curve) that was significantly better than that obtained from 10-PPI segments (AUC 0.9676 versus 0.9453, respectively; P<2.2e-16). CONCLUSIONS This study demonstrated that quantitatively analyzing PPG waveforms can clearly discriminate the signals of AF from those of sinus rhythm. The appropriate data length of the PPG to optimize the PPG analytics program was 25 heartbeats. CLINICALTRIAL N/A
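Three of the pulse-to-pulse interval (PPI) features named above can be illustrated with a short standard-library sketch. Several choices here are assumptions rather than details from the study: RMSSD is computed under its conventional definition as the root mean square of successive differences, the SD is the sample SD, and Shannon entropy is taken over an equal-width histogram spanning the segment's own minimum and maximum. The function names and interval values are hypothetical.

```python
import math


def ppi_sd(ppi):
    """Sample standard deviation of the intervals."""
    m = sum(ppi) / len(ppi)
    return math.sqrt(sum((x - m) ** 2 for x in ppi) / (len(ppi) - 1))


def ppi_rmssd(ppi):
    """Root mean square of successive differences between adjacent intervals."""
    diffs = [b - a for a, b in zip(ppi, ppi[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def shannon_entropy(ppi, bins=10):
    """Entropy in bits of an equal-width histogram of the intervals."""
    lo, hi = min(ppi), max(ppi)
    if hi == lo:
        return 0.0  # every interval falls in one bin
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in ppi:
        index = min(int((x - lo) / width), bins - 1)  # place the maximum in the last bin
        counts[index] += 1
    total = len(ppi)
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


# Toy intervals in milliseconds (hypothetical, not recorded data).
regular = [800, 810, 790, 800]
irregular = [620, 940, 710, 1050, 580, 860]

rmssd_regular = ppi_rmssd(regular)
rmssd_irregular = ppi_rmssd(irregular)
```

An irregular rhythm such as AF tends to produce larger beat-to-beat differences, which is why variability features of this kind are plausible inputs to a classifier.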

