Two-stage receiver operating-characteristic curve estimator for cohort studies

2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Susana Díaz-Coto ◽  
Norberto Octavio Corral-Blanco ◽  
Pablo Martínez-Camblor

The receiver operating-characteristic (ROC) curve is a graphical statistical tool routinely used for studying classification accuracy in both diagnostic and prognostic problems. Given the different nature of these situations, ROC curve estimation has been considered separately for binary (diagnostic) and time-to-event (prognostic) outcomes, even for data coming from the same study design. In this work, the authors propose a two-stage ROC curve estimator that links both contexts through a general prediction model (first stage) and the empirical cumulative estimator of the distribution function (second stage) of the considered test (marker) on the total population. The behavior of the so-called two-stage Mixed-Subject (sMS) approach is studied for both large samples (theoretically) and finite samples (via Monte Carlo simulations). In addition, a useful asymptotic distribution for the associated area under the curve is derived. Results show the ability of the proposed estimator to fit non-standard situations by considering flexible predictive models. Two real-world examples, one with binary and one with time-dependent outcomes, illustrate the proposed methodology in common practical circumstances. The R code used for the practical implementation of the proposed methodology, together with its documentation, is provided as supplementary material.
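The two-stage idea can be illustrated with a rough sketch. It is not the authors' R code, and the exact form of their estimator is not given in the abstract. The sketch assumes a first-stage model has already produced, for each subject, a predicted probability of being positive (the hypothetical `prob_pos`), and that the second stage weights the empirical distribution of the marker by those probabilities. The function names are also hypothetical.

```python
def weighted_roc_point(marker, prob_pos, cutoff):
    """Return (false-positive rate, sensitivity) at one cutoff.

    Each subject counts as a positive with weight prob_pos[i] and as a
    negative with weight 1 - prob_pos[i] (assumed first-stage output).
    """
    w_pos = sum(prob_pos)
    w_neg = sum(1.0 - p for p in prob_pos)
    sens = sum(p for x, p in zip(marker, prob_pos) if x > cutoff) / w_pos
    fpr = sum(1.0 - p for x, p in zip(marker, prob_pos) if x > cutoff) / w_neg
    return fpr, sens


def weighted_auc(marker, prob_pos):
    """Area under the trapezoidal curve through the weighted ROC points.

    Computed as a weighted Mann-Whitney statistic over all ordered pairs;
    ties (including a subject paired with itself) score 0.5.
    """
    numerator = 0.0
    for xi, pi in zip(marker, prob_pos):
        for xj, pj in zip(marker, prob_pos):
            if xi > xj:
                score = 1.0
            elif xi == xj:
                score = 0.5
            else:
                score = 0.0
            numerator += pi * (1.0 - pj) * score
    w_pos = sum(prob_pos)
    w_neg = sum(1.0 - p for p in prob_pos)
    return numerator / (w_pos * w_neg)
```

When every predicted probability is exactly 0 or 1, the weights reduce to group membership and the sketch coincides with the usual empirical ROC curve and AUC.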

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Pablo Martínez-Camblor ◽  
Sonia Pérez-Fernández ◽  
Susana Díaz-Coto

The receiver operating-characteristic (ROC) curve is a well-known graphical tool routinely used for evaluating the ability of a continuous marker to discriminate a binary characteristic. The area under the curve (AUC) has been proposed as a summary index of accuracy. Higher values of the marker are usually associated with higher probabilities of having the characteristic under study. However, there are other situations where both higher and lower marker scores are associated with a positive result. The generalized ROC (gROC) curve has been proposed as a proper extension of the ROC curve to fit these situations. The corresponding area under the gROC curve, the gAUC, has also been introduced as a global measure of classification capacity. In this paper, we study the properties of the gAUC in depth. The weak convergence of its empirical estimator is established, and an explicit and useful expression for the asymptotic variance is derived. We also obtain the expression for the asymptotic covariance of related gAUCs and propose a non-parametric procedure to compare them. The finite-sample behavior is studied through Monte Carlo simulations under different scenarios, and a real-world problem is presented to illustrate the practical application. The R functions implementing the procedures are provided as Supplementary Material.


2019 ◽  
Vol 34 (3) ◽  
pp. 302-308 ◽  
Author(s):  
Xiqi Peng ◽  
Xiang Pan ◽  
Kaihao Liu ◽  
Chunduo Zhang ◽  
Liwen Zhao ◽  
...  

Background: miR-142-3p has been shown to be involved in tumorigenesis and the development of renal cell carcinoma. The present study aimed to explore the prognostic value of miR-142-3p. Methods: Total RNA was extracted from renal cell carcinoma specimens and the expression level of miR-142-3p was measured. The Pearson chi-square test, Kaplan–Meier analysis, and univariate and multivariate regression analyses were performed to determine the correlation between miR-142-3p and the prognosis of renal cell carcinoma patients. Receiver operating characteristic curves were constructed to evaluate the predictive efficiency of miR-142-3p for the prognosis of renal cell carcinoma patients. Data from The Cancer Genome Atlas (TCGA) were used to validate our findings. Results: Upregulation of miR-142-3p was correlated with shorter overall survival (P=0.002) and was also an independent prognostic factor for renal cell carcinoma patients (P=0.002). The receiver operating characteristic curve combining miR-142-3p expression with tumor stage showed an area under the curve of 0.633 (95% confidence interval 0.563, 0.702). The results from the TCGA data were consistent with our findings. Conclusions: Our results suggest that miR-142-3p expression is correlated with poor prognosis in renal cell carcinoma patients and may serve as a prognostic biomarker in the future.


Author(s):  
Mario A. Cleves

The area under the receiver operating characteristic (ROC) curve is often used to summarize and compare the discriminatory accuracy of a diagnostic test or modality, and to evaluate the predictive power of statistical models for binary outcomes. Parametric maximum likelihood methods for fitting of the ROC curve provide direct estimates of the area under the ROC curve and its variance. Nonparametric methods, on the other hand, provide estimates of the area under the ROC curve, but do not directly estimate its variance. Three algorithms for computing the variance for the area under the nonparametric ROC curve are commonly used, although ambiguity exists about their behavior under diverse study conditions. Using simulated data, we found similar asymptotic performance between these algorithms when the diagnostic test produces results on a continuous scale, but found notable differences in small samples, and when the diagnostic test yields results on a discrete diagnostic scale.
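As an illustration of a nonparametric AUC together with one variance estimate, the sketch below computes the Mann-Whitney form of the empirical AUC and the structural-components variance of DeLong, DeLong and Clarke-Pearson. The abstract does not name its three algorithms, so choosing this estimator is an assumption made for illustration, and the function name is hypothetical.

```python
def auc_with_delong_variance(neg, pos):
    """Empirical AUC and its DeLong-type variance estimate.

    neg, pos: marker values for non-diseased and diseased subjects
    (each group needs at least two observations).
    """
    m, n = len(pos), len(neg)

    def psi(x, y):
        # Pair kernel: 1 if the diseased value is larger, 0.5 on a tie.
        if x > y:
            return 1.0
        if x == y:
            return 0.5
        return 0.0

    # Structural components: one per diseased subject, one per non-diseased.
    v10 = [sum(psi(x, y) for y in neg) / n for x in pos]
    v01 = [sum(psi(x, y) for x in pos) / m for y in neg]

    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return auc, s10 / m + s01 / n
```

The tie handling in `psi` is what makes the estimate usable when the test yields results on a discrete scale, the setting in which the abstract reports the algorithms diverging.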


2016 ◽  
Vol 27 (8) ◽  
pp. 2264-2278 ◽  
Author(s):  
Liang Li ◽  
Tom Greene ◽  
Bo Hu

The time-dependent receiver operating characteristic curve is often used to study the diagnostic accuracy of a single continuous biomarker, measured at baseline, on the onset of a disease condition when the disease onset may occur at different times during the follow-up and hence may be right censored. Due to right censoring, the true disease onset status prior to the pre-specified time horizon may be unknown for some patients, which causes difficulty in calculating the time-dependent sensitivity and specificity. We propose to estimate the time-dependent sensitivity and specificity by weighting the censored data by the conditional probability of disease onset prior to the time horizon given the biomarker, the observed time to event, and the censoring indicator, with the weights calculated nonparametrically through a kernel regression on time to event. With this nonparametric weighting adjustment, we derive a novel, closed-form formula to calculate the area under the time-dependent receiver operating characteristic curve. We demonstrate through numerical study and theoretical arguments that the proposed method is insensitive to misspecification of the kernel bandwidth, produces unbiased and efficient estimators of time-dependent sensitivity and specificity, the area under the curve, and other estimands from the receiver operating characteristic curve, and outperforms several other published methods currently implemented in R packages.


2015 ◽  
Author(s):  
Ειρήνη Τερζή

The contribution of alpha1-microglobulin (α1M), a member of the lipocalin family and a marker of proximal renal tubular dysfunction, to the early diagnosis of sepsis-associated acute kidney injury (AKI) was studied. The study focused on critically ill patients in a multidisciplinary Intensive Care Unit (ICU). From the prospective follow-up of 290 patients admitted over a one-year period, 45 septic patients were studied, of whom 16 (35.6%) developed renal failure. α1M was measured in urine samples from 24-hour urine collections at the septic episode and at specified time intervals thereafter. The diagnostic ability of the biomarker was assessed by non-parametric calculation of the area under the curve (AUC) of the receiver operating characteristic (ROC) curve (AUCROC). α1M levels were significantly higher in all septic patients (mean level across all samples at the septic episode 46.02 ± 7.17 mg/l) and showed an increasing trend in the patients who eventually developed septic renal failure. The AUCROC for predicting septic AKI from α1M levels 24 hours before the onset of renal injury was 0.739 (sensitivity 87.5%, specificity 62.07%, cut-off value 47.9 mg/l). α1M levels 24 hours before the septic renal injury, serum creatinine, and the APACHE II severity-of-illness score at the sepsis episode emerged as the most important independent predictors of AKI. Combining these three parameters improved the AUCROC for the prediction of AKI to 0.944. The results support the idea that urinary α1M levels could contribute to the early identification of septic patients who progress to AKI and may prove to be a useful biomarker.
In parallel, they highlight the pathogenetic involvement of α1M in sepsis and septic AKI as a subject for further research.


2011 ◽  
Vol 42 (5) ◽  
pp. 895-898 ◽  
Author(s):  
G. Szmukler ◽  
B. Everitt ◽  
M. Leese

Risk assessment is now regarded as a necessary competence in psychiatry. The area under the curve (AUC) statistic of the receiver operating characteristic curve is increasingly offered as the main evidence for the accuracy of risk assessment instruments. However, even a highly statistically significant AUC is of limited value in clinical practice.


Entropy ◽  
2020 ◽  
Vol 22 (6) ◽  
pp. 593 ◽  
Author(s):  
Gareth Hughes

The predictive receiver operating characteristic (PROC) curve is a diagrammatic format with application in the statistical evaluation of probabilistic disease forecasts. The PROC curve differs from the more well-known receiver operating characteristic (ROC) curve in that it provides a basis for evaluation using metrics defined conditionally on the outcome of the forecast rather than metrics defined conditionally on the actual disease status. Starting from the binormal ROC curve formulation, an overview of some previously published binormal PROC curves is presented in order to place the PROC curve in the context of other methods used in statistical evaluation of probabilistic disease forecasts based on the analysis of predictive values; in particular, the index of separation (PSEP) and the leaf plot. An information theoretic perspective on evaluation is also outlined. Five straightforward recommendations are made with a view to aiding understanding and interpretation of the sometimes-complex patterns generated by PROC curve analysis. The PROC curve and related analyses augment the perspective provided by traditional ROC curve analysis. Here, the binormal ROC model provides the exemplar for investigation of the PROC curve, but potential application extends to analysis based on other distributional models as well as to empirical analysis.
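For readers unfamiliar with the binormal formulation, the sketch below evaluates the standard binormal ROC curve, ROC(t) = Φ(a + b·Φ⁻¹(t)), its area Φ(a / √(1 + b²)), and the predictive values obtained from a point on the curve by Bayes' rule. The parameter names `a`, `b` and `prevalence` and the function names are illustrative assumptions; the particular PROC parameterizations reviewed in the paper are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def binormal_roc(fpr, a, b):
    """True-positive rate on the binormal ROC curve at a given FPR (0 < fpr < 1)."""
    return _STD_NORMAL.cdf(a + b * _STD_NORMAL.inv_cdf(fpr))


def binormal_auc(a, b):
    """Closed-form area under the binormal ROC curve."""
    return _STD_NORMAL.cdf(a / sqrt(1.0 + b * b))


def predictive_values(fpr, a, b, prevalence):
    """Positive and negative predictive values at one operating point."""
    tpr = binormal_roc(fpr, a, b)
    ppv = prevalence * tpr / (prevalence * tpr + (1.0 - prevalence) * fpr)
    npv = ((1.0 - prevalence) * (1.0 - fpr)
           / ((1.0 - prevalence) * (1.0 - fpr) + prevalence * (1.0 - tpr)))
    return ppv, npv
```

With `a = 0` and `b = 1` the curve is the chance diagonal, so both predictive values fall back to the prior: the PPV equals the prevalence and the NPV equals one minus the prevalence.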


2018 ◽  
Vol 6 (1) ◽  
pp. 440-447
Author(s):  
Kathare Alfred ◽  
Otieno Argwings ◽  
Kimeli Victor

The use of gold-standard procedures in screening may be costly, risky or even unethical. Such procedures are therefore not suitable for large-scale application. In this case, a more acceptable diagnostic predictor is applied to a sample of subjects alongside the gold-standard procedure. The performance of the predictor is then evaluated using the receiver operating characteristic curve, and the area under the curve provides a summary measure of that performance. The receiver operating characteristic curve represents a trade-off between sensitivity and specificity, which in most cases differ in clinical significance. The area under the curve has also been criticized for lacking a coherent interpretation. In this study, we propose the use of entropy as a summary index of uncertainty for comparing diagnostic predictors. Noting that a diseased subject who is correctly identified with the disease at a lower cut-off will also be identified at a higher cut-off, we substituted cut-offs of a binary predictor for the time variable of survival analysis. We then derived the entropy of the functions of diagnostic predictors. Application of the procedure to real data showed that entropy was a strong measure for quantifying the amount of uncertainty contained in a set of cut-offs of a binary diagnostic predictor.


2020 ◽  
Vol 13 (3) ◽  
pp. 1391
Author(s):  
Jakeline Jesus Silva ◽  
Lucas Prado Osco ◽  
Ana Paula Marques Ramos ◽  
Wesley Barbosa Dourado

Semiautomatic extraction of arboreal vegetation in urban areas using aerial imagery of high spatial resolution. Mapping of tree vegetation in urban areas can be performed by semi-automatic or automatic classification of orbital or aerial images. However, this type of task has a computational cost that depends on the spatial resolution of the image. This study proposes an approach for semi-automatic extraction of tree vegetation from high-spatial-resolution images at a low computational cost. We work with orthophotos of 1 m resolution, made available by public management agencies. The proposed approach applies a mean filter to image clippings of 500x500 pixels each. In all, we use 90 clippings. We tested the algorithm in the following configurations: separately on each band (blue, green and red), on the color image (RGB) and on the grayscale image. We validated its performance using the confusion matrix and the Receiver Operating Characteristic (ROC) curve, considering 3,695 points evenly distributed across all clippings. We also compared the performance of the algorithm with a per-pixel supervised classification (maximum likelihood). We obtained an overall accuracy of 90.18%, a kappa index of 0.80 and a processing time of approximately 1 minute and 30 seconds for the proposed algorithm on a conventional computer. The ROC curve yielded an Area Under the Curve (AUC) of 0.91 for the algorithm, considering the result of all bands, and a value of 0.79 for the per-pixel supervised classification. We conclude that our approach is computationally efficient for separating areas covered by vegetation from areas not covered in an urban environment. Keywords: digital image processing; image classification; urban environmental planning.
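Overall accuracy and the kappa index are both read off the confusion matrix. The sketch below shows the standard computation of Cohen's kappa; the function name and the example counts in the usage note are hypothetical and are not the study's data.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    matrix[i][j] is the number of validation points of reference class i
    that were assigned to class j.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of points on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

For a hypothetical balanced two-class matrix `[[45, 5], [5, 45]]`, the observed agreement is 0.90 and the chance agreement is 0.50, so kappa is 0.80.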


2015 ◽  
Vol 2015 ◽  
pp. 1-6 ◽  
Author(s):  
Akiyoshi Matsugi ◽  
Keisuke Tani ◽  
Yoshiki Tamaru ◽  
Nami Yoshioka ◽  
Akira Yamashita ◽  
...  

Purpose. The aim of this study was to assess whether the home care score (HCS), which was developed by the Ministry of Health and Welfare in Japan in 1992, is useful for predicting the advisability of home care. Methods. Subjects living at home and in assisted-living facilities were analyzed. Binomial logistic regression analyses, using age, sex, the functional independence measure score, and the HCS, along with receiver operating characteristic curve analyses, were conducted. Findings/Conclusions. Only the HCS was selected for the regression equation. Receiver operating characteristic curve analysis revealed that the area under the curve (0.9), sensitivity (0.82), specificity (0.83), and positive predictive value (0.84) for the HCS were higher than those for the functional independence measure, indicating that the HCS is a powerful predictor of the advisability of home care. Clinical Relevance. Comprehensive measurements of the condition of provided care and of the subjects' activities of daily living, which are included in the HCS, are required for predicting the advisability of home care.
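Sensitivity, specificity and positive predictive value all come from the same 2x2 table at a chosen cut-off. The sketch below shows how they are computed; the function name and any counts passed to it are hypothetical, since the abstract reports only the resulting rates.

```python
def diagnostic_rates(tp, fp, fn, tn):
    """Sensitivity, specificity and positive predictive value from 2x2 counts.

    tp/fn: condition-positive subjects classified positive/negative.
    tn/fp: condition-negative subjects classified negative/positive.
    """
    sensitivity = tp / (tp + fn)   # proportion of true positives detected
    specificity = tn / (tn + fp)   # proportion of true negatives detected
    ppv = tp / (tp + fp)           # proportion of positive calls that are correct
    return sensitivity, specificity, ppv
```

Unlike sensitivity and specificity, the positive predictive value depends on how many subjects in the sample actually have the condition, so it does not transfer directly to a population with a different prevalence.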

