Confidence intervals for the receiver operating characteristic area in studies with small samples

1998 ◽  
Vol 5 (8) ◽  
pp. 561-571 ◽  
Author(s):  
Nancy A. Obuchowski ◽  
Michael L. Lieber
Author(s):  
Jeffrey S Hyams ◽  
Michael Brimacombe ◽  
Yael Haberman ◽  
Thomas Walters ◽  
Greg Gibson ◽  
...  

Abstract Background Develop a clinical and biological predictive model for colectomy risk in children newly diagnosed with ulcerative colitis (UC). Methods This was a multicenter inception cohort study of children (ages 4-17 years) newly diagnosed with UC treated with standardized initial regimens of mesalamine or corticosteroids (CS) depending upon initial disease severity. Therapy escalation to immunomodulators or infliximab was based on predetermined criteria. Patients were phenotyped by clinical activity per the Pediatric Ulcerative Colitis Activity Index (PUCAI), disease extent, endoscopic/histologic severity, and laboratory markers. In addition, RNA sequencing defined pretreatment rectal gene expression and high density DNA genotyping by the Affymetrix UK Biobank Axiom Array. Coprimary outcomes were colectomy over 3 years and time to colectomy. Generalized linear models, Cox proportional hazards multivariate regression modeling, and Kaplan-Meier plots were used. Results Four hundred twenty-eight patients (mean age 13 years) started initial therapy with mesalamine (n = 136), oral CS (n = 144), or intravenous CS (n = 148). Twenty-five (6%) underwent colectomy at ≤1 year, 33 (9%) at ≤2 years, and 35 (13%) at ≤3 years. Further, 32/35 patients who had colectomy failed infliximab. An initial PUCAI ≥ 65 was highly associated with colectomy (P = 0.0001). A logistic regression model predicting colectomy using the PUCAI, hemoglobin, and erythrocyte sedimentation rate had a receiver operating characteristic area under the curve of 0.78 (95% confidence interval [0.73, 0.84]). Addition of a pretreatment rectal gene expression panel reflecting activation of the innate immune system and response to external stimuli and bacteria to the clinical model improved the receiver operating characteristic area under the curve to 0.87 (95% confidence interval [0.82, 0.91]).
Conclusions A small group of children newly diagnosed with severe UC still require colectomy despite current therapies. Our gene signature observations suggest additional targets for management of those patients not responding to current medical therapies.


Rheumatology ◽  
2020 ◽  
Author(s):  
Kelvin Y C Yu ◽  
Susan Yung ◽  
Mel K M Chau ◽  
Colin S O Tang ◽  
Desmond Y H Yap ◽  
...  

Abstract Objectives We investigated circulating syndecan-1, HA and thrombomodulin levels in patients with biopsy-proven Class III/IV ± V LN and their clinico-pathological associations. Patients with non-renal SLE or non-lupus chronic kidney disease, and healthy subjects served as controls. Methods Serum syndecan-1, HA and thrombomodulin levels were determined by ELISAs. Results Syndecan-1, HA and thrombomodulin levels were significantly higher during active LN compared with remission (P < 0.01, for all), and correlated with the level of proteinuria, estimated glomerular filtration rate, anti-dsDNA antibodies, complement 3 and serum creatinine. Longitudinal studies showed that syndecan-1 and thrombomodulin levels increased prior to clinical renal flare by 3.6 months, while HA level increased at the time of nephritic flare, and the levels decreased in parallel with treatment response. Receiver operating characteristic curve analysis showed that syndecan-1 and thrombomodulin levels distinguished patients with active LN from healthy subjects, LN patients in remission, patients with active non-renal lupus and patients with non-lupus chronic kidney disease (receiver operating characteristic area under curve of 0.98, 0.91, 0.82 and 0.95, respectively, for syndecan-1; and area under curve of 1.00, 0.84, 0.97 and 0.79, respectively, for thrombomodulin). HA level distinguished active LN from healthy subjects, LN patients in remission and non-lupus chronic kidney disease (receiver operating characteristic area under curve of 0.82, 0.71 and 0.90, respectively) but did not distinguish between renal vs non-renal lupus. Syndecan-1 and thrombomodulin levels correlated with the severity of interstitial inflammation, while HA level correlated with chronicity grading in kidney biopsies of active LN. Conclusion Our findings suggest potential utility of serum syndecan-1, thrombomodulin and HA levels in clinical management, and their potential contribution to LN pathogenesis.


Author(s):  
Mario A. Cleves

The area under the receiver operating characteristic (ROC) curve is often used to summarize and compare the discriminatory accuracy of a diagnostic test or modality, and to evaluate the predictive power of statistical models for binary outcomes. Parametric maximum likelihood methods for fitting of the ROC curve provide direct estimates of the area under the ROC curve and its variance. Nonparametric methods, on the other hand, provide estimates of the area under the ROC curve, but do not directly estimate its variance. Three algorithms for computing the variance for the area under the nonparametric ROC curve are commonly used, although ambiguity exists about their behavior under diverse study conditions. Using simulated data, we found similar asymptotic performance between these algorithms when the diagnostic test produces results on a continuous scale, but found notable differences in small samples, and when the diagnostic test yields results on a discrete diagnostic scale.
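As an illustration of the kind of calculation this abstract discusses, here is a minimal Python sketch of a nonparametric area estimate together with one possible variance approximation. The abstract does not name its three algorithms, so the choice of the Hanley and McNeil (1982) approximation is an assumption made for illustration only, and the function names `auc_mann_whitney` and `hanley_mcneil_se` are hypothetical.

```python
from math import sqrt


def auc_mann_whitney(diseased, healthy):
    """Nonparametric AUC: share of (diseased, healthy) pairs in which the
    diseased subject scores higher; tied pairs count as one half."""
    wins = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def hanley_mcneil_se(auc, n_diseased, n_healthy):
    """Standard error of the AUC under the Hanley-McNeil approximation.
    Assumption: this estimator is used here only as an example; it is not
    necessarily one of the algorithms compared in the abstract."""
    q1 = auc / (2.0 - auc)               # two diseased both outrank one healthy
    q2 = 2.0 * auc ** 2 / (1.0 + auc)    # one diseased outranks two healthy
    var = (
        auc * (1.0 - auc)
        + (n_diseased - 1) * (q1 - auc ** 2)
        + (n_healthy - 1) * (q2 - auc ** 2)
    ) / (n_diseased * n_healthy)
    return sqrt(max(var, 0.0))           # guard against tiny negative rounding


# Made-up scores on a discrete scale, purely for demonstration.
diseased_scores = [3, 4, 5, 5]
healthy_scores = [1, 2, 3, 4]
auc = auc_mann_whitney(diseased_scores, healthy_scores)
se = hanley_mcneil_se(auc, len(diseased_scores), len(healthy_scores))
print(auc, se)
```

With only four subjects per group the standard error is large relative to the estimate, which is the small-sample setting in which the abstract reports that the algorithms differ.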


2020 ◽  
Author(s):  
Brian J. Park ◽  
Vlasios S. Sotirchos ◽  
Jason Adleberg ◽  
S. William Stavropoulos ◽  
Tessa S. Cook ◽  
...  

Abstract Purpose This study assesses the feasibility of deep learning detection and classification of 3 retrievable inferior vena cava filters with similar radiographic appearances and emphasizes the importance of visualization methods to confirm proper detection and classification. Materials and Methods The fast.ai library with ResNet-34 architecture was used to train a deep learning classification model. A total of 442 fluoroscopic images (N=144 patients) from inferior vena cava filter placement or removal were collected. Following image preprocessing, the training set included 382 images (110 Celect, 149 Denali, 123 Günther Tulip), of which 80% were used for training and 20% for validation. Data augmentation was performed for regularization. A random test set of 60 images (20 images of each filter type), not included in the training or validation set, was used for evaluation. Total accuracy and receiver operating characteristic area under the curve were used to evaluate performance. Feature heatmaps were visualized using guided backpropagation and gradient-weighted class activation mapping. Results The overall accuracy was 80.2% with mean receiver operating characteristic area under the curve of 0.96 for the validation set (N=76), and 85.0% with mean receiver operating characteristic area under the curve of 0.94 for the test set (N=60). Two visualization methods were used to assess correct filter detection and classification. Conclusions A deep learning model can be used to automatically detect and accurately classify inferior vena cava filters on radiographic images. Visualization techniques should be utilized to ensure deep learning models function as intended.


2017 ◽  
Vol 27 (3) ◽  
pp. 675-688 ◽  
Author(s):  
Jingjing Yin ◽  
Christos T Nakas ◽  
Lili Tian ◽  
Benjamin Reiser

This article explores both existing and new methods for the construction of confidence intervals for differences of indices of diagnostic accuracy of competing pairs of biomarkers in three-class classification problems and fills the methodological gaps for both parametric and non-parametric approaches in the receiver operating characteristic surface framework. The most widely used such indices are the volume under the receiver operating characteristic surface and the generalized Youden index. We describe implementation of all methods and offer insight regarding the appropriateness of their use through a large simulation study with different distributional and sample size scenarios. Methods are illustrated using data from the Alzheimer's Disease Neuroimaging Initiative study, where assessment of cognitive function naturally results in a three-class classification setting.


2019 ◽  
Vol 31 (1) ◽  
pp. 199
Author(s):  
E. Mellisho ◽  
M. Briones ◽  
F. O. Castro ◽  
L. Rodriguez-Alvarez

Extracellular vesicles (EV) secreted by blastocysts might be relevant to predict competence of embryos produced in vitro. The aim of this study was to develop a model to select competent embryos that combines blastocyst morphokinetics data and morphological parameters of EV secreted during blastulation (Days 5-7.5). Embryos were cultured in groups up to Day 5; morulae were selected and individually cultured in SOFaa depleted of EV until Day 7.5 after IVF. Embryo competence was determined by in vitro post-hatching development up to Day 11. A retrospective classification of blastocyst and culture media was performed based on blastulation time [early (EB) or late (LB)] and competence at Day 11 [competent (C) or non-competent (NC)]. The EV were isolated from culture media of individual embryos, their properties determined by nanoparticle tracking analysis. The model was based on a binary logistic regression to describe the dichotomous-dependent variable of the blastocyst (C=1 and NC=0). A set of independent variables of blastocyst morphokinetics (blastulation time, blastocyst stage, blastocyst quality and blastocyst diameter at Day 7.5) and EV morphological parameters [mean size (ME), mode size (MO) and particle concentration (CO)] were analysed with multiple regression. The analysis generated the coefficients and their standard errors and significance level of an equation to calculate a probability, where values between 0.5 and 1 predict competent embryos. To verify the predictive power of the algorithm, the following indicators were used: the receiver operating characteristic with the determination of area under the curve, percentage correct predictions, and Omnibus tests. Statistical significance was determined at the P<0.05 level. A rough guide for classifying the accuracy of a predictive model is as follows: 0.9 to 1=excellent, 0.8 to 0.9=good, 0.7 to 0.8=fair, 0.6 to 0.7=poor, 0.5 to 0.6=fail. 
A total of 254 embryos were used in this study; of these, 73 were classified as C-EB, 68 as NC-EB, 61 as C-LB and 52 as NC-LB. Initially, all independent variables were analysed in model 1; the most significant predictors associated with embryo competence were blastocyst stage, blastocyst quality, blastocyst diameter, ME and CO (P<0.05). In model 2, the non-significant variables (blastulation time and MO) were excluded. The statistical test of predictive power indicates that models 1 and 2 achieved a receiver operating characteristic-area under the curve of 0.853 (95% confidence interval, 0.806-0.9; P<0.001) and correct predictions of 77.2 and 77.6%, respectively. When EV characteristics were excluded and the model considered only variables from the embryo, the receiver operating characteristic-area under the curve value was 0.714 (95% confidence interval, 0.651-0.777; P<0.001) and correct predictions were reduced to 65.4%. Model 2 was considered the most appropriate from the practical point of view because it avoids disturbing embryo culture during blastulation. The results indicate that incorporating EV properties increases accuracy of embryo selection, supporting the possibility to improve conventional methods by combining blastocyst morphology and characteristics of EV obtained by nanoparticle tracking analysis. This work was supported by Fondecyt 1170310.
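The rough accuracy guide quoted in this abstract can be sketched as a small lookup in Python. The function name `classify_auc` is hypothetical. The guide's bands share their endpoints (for example, 0.9 appears in both "good" and "excellent"), so treating each lower bound as inclusive is an assumption, as is the label returned for values below 0.5, which the guide does not cover.

```python
def classify_auc(auc):
    """Map an area under the ROC curve to the rough guide's label.
    Assumptions: lower bounds are inclusive, and values below 0.5
    (outside the quoted guide) are reported as 'below guide range'."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    bands = [
        (0.9, "excellent"),
        (0.8, "good"),
        (0.7, "fair"),
        (0.6, "poor"),
        (0.5, "fail"),
    ]
    for lower_bound, label in bands:
        if auc >= lower_bound:
            return label
    return "below guide range"


# The two areas reported in the abstract.
print(classify_auc(0.853))  # models including EV parameters
print(classify_auc(0.714))  # embryo-only variables
```

Under these assumptions, the models that include EV parameters fall in the "good" band and the embryo-only model in the "fair" band.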

