holdout validation
Recently Published Documents



2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shuo Feng ◽  
Joel A. Dubin

Abstract: APACHE IVa provides typically useful and accurate predictions on in-hospital mortality and length of stay for patients in critical care. However, there are factors which may preclude APACHE IVa from reaching its ceiling of predictive accuracy. Our primary aim was to determine which variables available within the first 24 h of a patient’s ICU stay may be indicative of the APACHE IVa scoring system making occasional but potentially illuminating errors in predicting in-hospital mortality. We utilized the publicly available multi-institutional ICU database, eICU, available since 2018, to identify a large observational cohort for our investigation. APACHE IVa scores are provided by eICU for each patient’s ICU stay. We used Lasso logistic regression with the aim of building parsimonious final models, using cross-validation to select the penalization parameter, separately for each of our two responses of interest, i.e., the errors: APACHE falsely predicting in-hospital death (Type I error) and APACHE falsely predicting in-hospital survival (Type II error). We then assessed the performance of the models with a random holdout validation sample. While the extremeness of the APACHE prediction led to dependable predictions for preventing either type of error, distinct variables were identified as being strongly associated with the two different types of errors occurring. These included a primary set of predictors consisting of mean SpO2 and worst lactate for predicting Type I errors, and worst albumin and mean heart rate for Type II. In addition, a secondary set of predictors was identified: changes recorded in care limitations for the patient’s treatment plan, worst pH, whether cardiac arrest occurred at admission, and whether vasopressor was provided, for predicting Type I error; and age, whether the patient was ventilated in day 1, mean respiratory rate, worst lactate, worst blood urea nitrogen test, and mean aperiodic vitals, for Type II. The two models also differed in their performance metrics in their holdout validation samples, in large part due to the lower prevalence of Type II errors compared to Type I. The eICU database was a good resource for evaluating our objective, and important recommendations are provided, particularly identifying key variables that could lead to APACHE prediction errors when APACHE scores are sufficiently low to predict in-hospital survival.
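
The two error responses defined in this abstract can be illustrated with a minimal sketch. The abstract does not say how an APACHE prediction is turned into a predicted outcome, so the 0.5 cutoff, the function name `label_apache_errors`, and the toy inputs below are all assumptions for illustration, not the authors' procedure.

```python
from typing import List, Tuple


def label_apache_errors(
    predicted_death_prob: List[float],
    died_in_hospital: List[bool],
    threshold: float = 0.5,  # assumed cutoff; not taken from the abstract
) -> Tuple[List[int], List[int]]:
    """Return (type_i, type_ii) 0/1 indicator lists, one entry per ICU stay."""
    type_i, type_ii = [], []
    for prob, died in zip(predicted_death_prob, died_in_hospital):
        predicted_death = prob >= threshold
        # Type I: death was predicted, but the patient survived.
        type_i.append(int(predicted_death and not died))
        # Type II: survival was predicted, but the patient died.
        type_ii.append(int(not predicted_death and died))
    return type_i, type_ii


# Hypothetical predictions and outcomes for four ICU stays.
probs = [0.9, 0.8, 0.1, 0.2]
died = [True, False, False, True]
type_i, type_ii = label_apache_errors(probs, died)
print(type_i)   # [0, 1, 0, 0]
print(type_ii)  # [0, 0, 0, 1]
```

Each indicator list could then serve as the binary response of a separate model, which is consistent with the abstract's description of fitting one model per error type.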


BMJ Open ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. e047528
Author(s):  
Yang Guang ◽  
Wen He ◽  
Bin Ning ◽  
Hongxia Zhang ◽  
Chen Yin ◽  
...  

Objectives: The aim of this study was to evaluate the performance of deep learning-based detection and classification of carotid plaque (DL-DCCP) in carotid plaque contrast-enhanced ultrasound (CEUS). Methods and analysis: A prospective multicentre study was conducted to assess vulnerability in patients with carotid plaque. Data from 547 potentially eligible patients were prospectively enrolled from 10 hospitals, and 205 patients with CEUS video were finally enrolled for analysis. The area under the receiver operating characteristic curve (AUC) was used to evaluate the effectiveness of DL-DCCP and two experienced radiologists who manually examined the CEUS video (RA-CEUS) in diagnosing and classifying carotid plaque vulnerability. To evaluate the influence of dynamic video input on the performance of the algorithm, a state-of-the-art deep convolutional neural network (CNN) model for static images (Xception) was compared with DL-DCCP for both training and holdout validation cohorts. Results: The AUCs of DL-DCCP were significantly better than those of the experienced radiologists for both the training and holdout validation cohorts (training, DL-DCCP vs RA-CEUS, AUC: 0.85 vs 0.69, p<0.01; holdout validation, DL-DCCP vs RA-CEUS, AUC: 0.87 vs 0.66, p<0.01). They were also better than those of Xception, the best static deep CNN model we tested, for both the training and holdout validation cohorts (training, DL-DCCP vs Xception, AUC: 0.85 vs 0.82, p<0.01; holdout validation, DL-DCCP vs Xception, AUC: 0.87 vs 0.77, p<0.01). Conclusion: DL-DCCP shows better overall performance in assessing the vulnerability of carotid atherosclerotic plaques than RA-CEUS. Moreover, with a more powerful network structure and better utilisation of video information, DL-DCCP provided greater diagnostic accuracy than a state-of-the-art static CNN model. Trial registration number: ChiCTR1900021846.


2021 ◽  
Vol 20 (1) ◽  
pp. 1-12
Author(s):  
Jamaludin Jamaludin ◽  
Chaerur Rozikin ◽  
Agung Susilo Yuda Irawan

In Indonesia, mango is a plant that grows abundantly. However, sorting mango varieties is still done manually, by comparing colour, shape, and size. One technological development in the industrial field is the artificial neural network, which can learn on its own much as humans do. In this study, a system was built that can classify mango varieties. The system uses an artificial neural network for modelling, with extracted features consisting of the RGB mean and RGB standard deviation, perimeter, area, length, width, roundness, and slenderness. The classification experiments used a backpropagation neural network with two model variants, traingdx and trainlm, with logsig as the layer transfer function and purelin as the output transfer function. The testing model used in the classification process was k-fold cross-validation, based on the variations of epoch, goal, and learning rate from testing with holdout validation. In the experiments, the best accuracy with 1 hidden layer was 100% with a time of 10.45 seconds; k-fold testing then yielded a highest average accuracy of 95.31% with an average time of 0.06 seconds.


Author(s):  
Dennis H Murphree ◽  
Patrick M Wilson ◽  
Shusaku W Asai ◽  
Daniel J Quest ◽  
Yaxiong Lin ◽  
...  

Abstract Objective: Access to palliative care (PC) is important for many patients with uncontrolled symptom burden from serious or complex illness. However, many patients who could benefit from PC do not receive it early enough or at all. We sought to address this problem by building a predictive model into a comprehensive clinical framework with the aims to (i) identify in-hospital patients likely to benefit from a PC consult, and (ii) intervene on such patients by contacting their care team. Materials and Methods: Electronic health record data for 68 349 inpatient encounters in 2017 at a large hospital were used to train a model to predict the need for PC consult. This model was published as a web service, connected to institutional data pipelines, and consumed by a downstream display application monitored by the PC team. For those patients that the PC team deems appropriate, a team member then contacts the patient’s corresponding care team. Results: Training performance AUC based on a 20% holdout validation set was 0.90. The most influential variables were previous palliative care, hospital unit, Albumin, Troponin, and metastatic cancer. The model has been successfully integrated into the clinical workflow making real-time predictions on hundreds of patients per day. The model had an “in-production” AUC of 0.91. A clinical trial is currently underway to assess the effect on clinical outcomes. Conclusions: A machine learning model can effectively predict the need for an inpatient PC consult and has been successfully integrated into practice to refer new patients to PC.
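
An AUC on a holdout set, as reported in this abstract, can be computed directly from its pairwise definition. The sketch below is a generic illustration and not the authors' code; the function name `auc` and the four toy scores are assumptions, and the quadratic loop is meant for clarity rather than for datasets of this study's size.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical model scores and true labels for four holdout encounters.
holdout_scores = [0.1, 0.4, 0.35, 0.8]
holdout_labels = [0, 0, 1, 1]
print(auc(holdout_scores, holdout_labels))  # 0.75
```

Because the holdout cases play no part in fitting, this value estimates how well the ranking generalizes, which is why it can be compared against a later "in-production" figure.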


2020 ◽  
Author(s):  
Shuo Feng ◽  
Joel A Dubin

Abstract Background: APACHE IV provides typically useful and accurate predictions on in-hospital mortality and length of stay for patients in critical care. However, there are factors which may preclude APACHE IV from reaching its ceiling of predictive accuracy. Our primary aim was to determine which variables available within the first 24 hours of a patient’s ICU stay may be indicative of the APACHE IV scoring system making occasional but potentially illuminating errors in predicting in-hospital mortality. Methods: We utilized the publicly available multi-institutional ICU database, eICU, available since 2018, to identify a large observational cohort for our investigation. APACHE IV scores are provided by eICU for each patient’s ICU stay. We used Lasso logistic regression with the aim of building parsimonious final models, using cross-validation to select the penalization parameter, separately for each of our two responses of interest, i.e., the errors: APACHE falsely predicting in-hospital death (Type I error) and APACHE falsely predicting in-hospital survival (Type II error). We then assessed the performance of the models with a random holdout validation sample. Results: While the extremeness of the APACHE prediction led to dependable predictions for preventing either type of error, there was a small set of distinct variables identified as being strongly associated with the two different types of errors occurring. These included worst lactate and mean SpO2 for Type I, and mean non-invasive blood pressure and mean respiratory rate for Type II. The two models also differed in their performance metrics in identical holdout validation samples, in large part due to the lower prevalence of Type II errors compared to Type I.
Conclusions: The eICU database was a good resource for evaluating our objective, and important recommendations to intensivists are provided, particularly identifying key variables that could lead to APACHE prediction errors when APACHE scores are sufficiently low to predict in-hospital survival.


2020 ◽  
Author(s):  
Stephen S.F. Yip ◽  
Zan Klanecek ◽  
Shotaro Naganawa ◽  
John Kim ◽  
Andrej Studen ◽  
...  

Objectives: This study investigated the performance and robustness of radiomics in predicting COVID-19 severity in a large public cohort. Methods: A public dataset of 1110 COVID-19 patients (1 CT/patient) was used. Using CTs and clinical data, each patient was classified into mild, moderate, and severe by two observers: (1) dataset provider and (2) a board-certified radiologist. For each CT, 107 radiomic features were extracted. The dataset was randomly divided into a training (60%) and holdout validation (40%) set. During training, features were selected and combined into a logistic regression model for predicting severe cases from mild and moderate cases. The models were trained and validated on the classifications by both observers. AUC quantified the predictive power of models. To determine model robustness, the trained models were cross-validated on the inter-observer classifications. Results: A single feature alone was sufficient to predict mild from severe COVID-19 with $\mathrm{AUC}_{\mathrm{valid}}^{\mathrm{provider}} = 0.85$ and $\mathrm{AUC}_{\mathrm{valid}}^{\mathrm{radiologist}} = 0.74$ ($p \ll 0.01$). The most predictive features were the distribution of small size-zones (GLSZM-SmallAreaEmphasis) for provider classification and linear dependency of neighboring voxels (GLCM-Correlation) for radiologist classification. Cross-validation showed that both $\mathrm{AUC}_{\mathrm{valid}} \approx 0.80$ ($p \ll 0.01$). In predicting moderate from severe COVID-19, first-order-Median alone had sufficient predictive power of $\mathrm{AUC}_{\mathrm{valid}}^{\mathrm{provider}} = 0.65$ ($p = 0.01$). For radiologist classification, the predictive power of the model increased to $\mathrm{AUC}_{\mathrm{valid}}^{\mathrm{radiologist}} = 0.66$ ($p \ll 0.01$) as the number of features grew from 1 to 5. Cross-validation yielded $\mathrm{AUC}_{\mathrm{valid}}^{\mathrm{radiologist}} = 0.63$ ($p = 0.002$) and $\mathrm{AUC}_{\mathrm{valid}}^{\mathrm{provider}} = 0.60$ ($p = 0.09$). Conclusions: Radiomics significantly predicted different levels of COVID-19 severity. The prediction was moderately sensitive to inter-observer classifications, and thus needs to be used with caution.
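
The random 60%/40% division described in this abstract can be sketched as below. The helper name `holdout_split`, the use of index lists, and the seed are assumptions made for illustration; the abstract does not state how the randomization was implemented.

```python
import random
from typing import List, Tuple


def holdout_split(
    n_items: int, holdout_fraction: float, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly partition item indices into (training, holdout) lists."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_holdout = round(n_items * holdout_fraction)
    return sorted(indices[n_holdout:]), sorted(indices[:n_holdout])


# 1110 patients, 40% held out; the seed value is arbitrary.
train_idx, holdout_idx = holdout_split(1110, 0.40, seed=42)
print(len(train_idx), len(holdout_idx))  # 666 444
```

Feature selection and model fitting would then use only `train_idx`, so that the holdout AUC is not influenced by the selection step.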


2020 ◽  
Author(s):  
Zekuan Yu ◽  
Xiaohu Li ◽  
Haitao Sun ◽  
Jian Wang ◽  
Tongtong Zhao ◽  
...  

Abstract Background: To enable real-time diagnosis of severity in patients infected with novel coronavirus 2019 (COVID-19) and to guide follow-up therapeutic treatment, we collected chest CT scans of 202 patients diagnosed with COVID-19 from three hospitals in Anhui Province, China. Methods: A total of 729 2D axial-plane slices, with 246 severe cases and 483 non-severe cases, were employed in this study. Four pre-trained deep models (Inception-V3, ResNet-50, ResNet-101, DenseNet-201) with multiple classifiers (linear discriminant, linear SVM, cubic SVM, KNN and Adaboost decision tree) were applied to identify the severe and non-severe COVID-19 cases. Three validation strategies (holdout validation, 10-fold cross-validation and leave-one-out) were employed to validate the feasibility of the proposed pipelines. Results and conclusion: The experimental results demonstrate that classifying features from pre-trained deep models shows promise for COVID-19 screening, with DenseNet-201 combined with a cubic SVM achieving the best performance. Specifically, it achieved the highest severity classification accuracy of 95.20% and 95.34% for 10-fold cross-validation and leave-one-out, respectively. The established pipeline was able to achieve a rapid and accurate identification of the severity of COVID-19. This may assist physicians in making more efficient and reliable decisions.
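
Two of the validation strategies named in this abstract, 10-fold cross-validation and leave-one-out, differ only in the number of folds. The sketch below generates the index partitions; `kfold_indices` is a hypothetical helper, it splits by item without shuffling, and the abstract does not say whether the authors split by slice or by patient.

```python
from typing import Iterator, List, Tuple


def kfold_indices(n_items: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; every item is tested exactly once."""
    if not 2 <= k <= n_items:
        raise ValueError("k must satisfy 2 <= k <= n_items")
    for fold in range(k):
        test = list(range(fold, n_items, k))  # every k-th item, offset by the fold number
        test_set = set(test)
        train = [i for i in range(n_items) if i not in test_set]
        yield train, test


# 729 slices, as in the abstract; leave-one-out is k-fold with k equal to the sample count.
ten_fold = list(kfold_indices(729, 10))
leave_one_out = list(kfold_indices(729, 729))
print(len(ten_fold), len(leave_one_out))  # 10 729
```

Holdout validation corresponds to using a single such (train, test) pair, which is cheaper but gives a noisier estimate than averaging over all folds.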


Author(s):  
Saja Taha Ahmed ◽  
Rafah Al-Hamdani ◽  
Muayad Sadik Croock

Recently, decision trees have become some of the most widely used classification models. They owe their popularity to their efficiency in predictive analytics, their ease of interpretation, and the fact that they implicitly perform feature selection. This last property is of essential significance in Educational Data Mining (EDM), in which selecting the most relevant features has a major impact on improving classification accuracy. The main contribution is to build a new multi-objective decision tree, which can be used for feature selection and classification. The proposed Decisive Decision Tree (DDT) is introduced and constructed based on a decisive feature value as a feature weight related to the target class label. The traditional Iterative Dichotomizer 3 (ID3) algorithm and the proposed DDT are compared using three datasets in terms of some ID3 issues, including logarithmic calculation complexity and the selection of multi-valued features. The results indicated that the proposed DDT outperforms ID3 in development time. Classification accuracy improved under 10-fold cross-validation for all datasets, with the highest accuracy achieved by the proposed method being 92% for the student.por dataset, and under holdout validation for two datasets, i.e. Iraqi and Student-Math. The experiment also shows that the proposed DDT tends to select attributes that are important rather than multi-valued.


2020 ◽  
Vol 33 (2) ◽  
pp. 219-232
Author(s):  
Giarno Giarno ◽  
Muhammad Pramono Hadi ◽  
Slamet Suprayogi ◽  
Sigit Heru Murti

Spatial rainfall interpolation requires a number of suitable validation samples to maintain accuracy. Generally, the larger the areas which can be predicted, the better the interpolation. In addition, the data used for validation should be separated from the modelling data. Moreover, the number of samples determines the optimal proportion of independent sites. The objective of this study is to determine the optimal sample ratio for holdout validation in interpolation methods; the Makassar Strait was chosen as the study location because of its daily rainfall variation. The accuracy of the sample selection is tested using correlation, root mean square error (RMSE), mean absolute error (MAE) and the indicators of contingency tables. The results show that accuracy depends on the ratio of the modelling data. Therefore, the more extensive the data used for interpolation, the better the accuracy. Otherwise, if the rain gauge data is separated according to province, there will be a variation in accuracy in the portion of independent samples. For rainfall interpolation, it is recommended to use a minimum of 75% of data sites to maintain accuracy. Comparison between kriging and inverse distance weighting (IDW) methods indicates that IDW is better. Moreover, rainfall characteristics affect the accuracy and portion of the independent sample.
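
Holdout validation of an IDW interpolator, scored with RMSE and MAE as in this abstract, can be sketched as follows. The gauge coordinates, rainfall values, power parameter of 2, and function names are hypothetical and chosen only for illustration; they are not data or settings from the study.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def idw(sites: Sequence[Point], values: Sequence[float], query: Point,
        power: float = 2.0) -> float:
    """Inverse distance weighted estimate at `query` from the modelling sites."""
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in zip(sites, values):
        distance = math.hypot(query[0] - x, query[1] - y)
        if distance == 0.0:
            return value  # query coincides with a gauge: use its value directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def mae(predicted: Sequence[float], observed: Sequence[float]) -> float:
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical gauges: 3 of 4 sites (75%) are used for modelling, 1 is held out.
model_sites = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
model_rain = [10.0, 20.0, 30.0]
holdout_sites = [(1.0, 0.0)]
holdout_rain = [16.0]

predictions = [idw(model_sites, model_rain, q) for q in holdout_sites]
print(rmse(predictions, holdout_rain), mae(predictions, holdout_rain))
```

Repeating this with different proportions of held-out sites is one way to examine how accuracy changes with the modelling ratio, which is the question the abstract addresses.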

