Considerations on the region of interest in the ROC space

2021 ◽  
pp. 096228022110605
Author(s):  
Luigi Lavazza ◽  
Sandro Morasca

Receiver Operating Characteristic curves have been widely used to represent the performance of diagnostic tests. The corresponding area under the curve, widely used to evaluate their performance quantitatively, has been criticized in several respects. Several proposals have been introduced to improve area under the curve by taking into account only specific regions of the Receiver Operating Characteristic space, that is, the plane to which Receiver Operating Characteristic curves belong. For instance, a region of interest can be delimited by setting specific thresholds for the true positive rate or the false positive rate. Different ways of setting the borders of the region of interest may result in completely different, even opposing, evaluations. In this paper, we present a method to define a region of interest in a rigorous and objective way, and compute a partial area under the curve that can be used to evaluate the performance of diagnostic tests. The method was originally conceived in the Software Engineering domain to evaluate the performance of methods that estimate the defectiveness of software modules. We compare this method with previous proposals. Our method allows the definition of regions of interest by setting acceptability thresholds on any kind of performance metric, and not just false positive rate and true positive rate: for instance, the region of interest can be determined by imposing that φ (also known as the Matthews Correlation Coefficient) is above a given threshold. We also show how to delimit the region of interest corresponding to acceptable costs, whenever the individual cost of false positives and false negatives is known. Finally, we demonstrate the effectiveness of the method by applying it to the Wisconsin Breast Cancer Data. We provide Python and R packages supporting the presented method.
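The two ideas in this abstract, restricting the area under the curve to part of the ROC space and bounding a region by a metric such as the Matthews Correlation Coefficient, can be illustrated with a minimal Python sketch. This is not the authors' package or their exact method: the function names `mcc_from_rates` and `partial_auc`, the example ROC points, the prevalence of 0.3, and the 0.5 threshold are all assumptions made for illustration.

```python
import math

def mcc_from_rates(tpr, fpr, prevalence):
    """Matthews Correlation Coefficient of a ROC point (fpr, tpr).

    The confusion matrix is expressed as proportions of the whole sample,
    so the result depends on the (assumed) prevalence of positives.
    """
    p, n = prevalence, 1.0 - prevalence
    tp, fn = tpr * p, (1.0 - tpr) * p
    fp, tn = fpr * n, (1.0 - fpr) * n
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def partial_auc(points, max_fpr):
    """Trapezoidal area under a ROC curve restricted to FPR <= max_fpr.

    points: (fpr, tpr) pairs sorted by increasing fpr.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Cut the segment at the FPR border by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Hypothetical ROC points, used only to exercise the functions.
roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.9), (1.0, 1.0)]
area_low_fpr = partial_auc(roc, max_fpr=0.2)

# ROC points whose MCC clears an (assumed) acceptability threshold of 0.5.
acceptable = [(f, t) for f, t in roc if mcc_from_rates(t, f, 0.3) >= 0.5]
```

A border defined on FPR alone is a vertical line in the ROC space, whereas a border defined on MCC depends on prevalence, which is why `mcc_from_rates` takes it as an explicit argument.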

PEDIATRICS ◽  
1991 ◽  
Vol 87 (5) ◽  
pp. 670-674 ◽  
Author(s):  
David M. Jaffe ◽  
Gary R. Fleisher

This study was designed to quantify more precisely the accuracy of magnitude of rectal temperature and total white blood cell (WBC) count as indicators of bacteremia in children with an obvious focal bacterial infection. A total of 955 children, aged 3 to 36 months, who had rectal temperature ≥39.0°C and were seeking care at either of two urban pediatric emergency departments had blood drawn for culture; 885 had blood drawn for WBC count. Twenty-seven had bacteremia. Various combinations of temperature and WBC count were selected to construct receiver-operating-characteristic curves by plotting sensitivity vs false-positive rate (1 - specificity). The receiver-operating-characteristic curve of WBC count provided significantly better diagnostic information than the curve for temperature increments above 39.0°C. Each increment of 0.5°C led to large decrements in sensitivity and false-positive rates. At a WBC count cutoff of 10 000/mm³, the sensitivity was 92% while the false-positive rate was 57%. Using this cutoff point, the clinician could have avoided performing 368 of 885 blood cultures and missed only 2 of 26 children with bacteremia. Receiver-operating-characteristic curves combining WBC count and temperature increments above 39.0°C provided no better diagnostic information than that of WBC count at a temperature cutoff of 39.0°C. It is concluded that increments in temperature above 39.0°C provided additional diagnostic specificity for bacteremia only at the expense of unacceptable decreases in sensitivity. Total WBC count provided better information. A WBC count cutoff of 10 000/mm³ increased specificity with minimal decrease in sensitivity. Receiver-operating-characteristic curve analysis allows selection of cutoff criteria by individual practitioners based on the prevalence of bacteremia in their communities and on the perceived risks of bacteremia.
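The reported sensitivity and false-positive rate at the WBC cutoff can be cross-checked against the counts in the abstract. The short Python sketch below does this under two assumptions that the abstract does not state explicitly: that the cutoff is applied to the 885 children who had a WBC count (26 of whom had bacteremia), and that a culture is "avoided" exactly when the WBC count falls below the cutoff, so that avoided cultures are the true negatives plus the false negatives.

```python
# Counts taken from the abstract; the interpretation of "avoided" is an assumption.
children_with_wbc = 885   # children who had a WBC count
bacteremic = 26           # bacteremic children among them
missed = 2                # bacteremic children below the cutoff (false negatives)
cultures_avoided = 368    # children below the cutoff (assumed: TN + FN)

true_positives = bacteremic - missed                # 24
true_negatives = cultures_avoided - missed          # 366
non_bacteremic = children_with_wbc - bacteremic     # 859
false_positives = non_bacteremic - true_negatives   # 493

sensitivity = true_positives / bacteremic
false_positive_rate = false_positives / non_bacteremic

print(f"sensitivity = {sensitivity:.1%}")
print(f"false-positive rate = {false_positive_rate:.1%}")
```

Under these assumptions both rates round to the 92% and 57% quoted in the abstract.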


1991 ◽  
Vol 124 (3) ◽  
pp. 295-306 ◽  
Author(s):  
A. D. Genazzani ◽  
D. Rodbard

Abstract. We utilize the "Receiver Operating Characteristic" to describe the relationship between sensitivity and specificity as the threshold for peak detection is varied systematically, to provide objective comparison of the performance of methods for detection of episodic hormonal secretion. A computer program was used to generate synthetic data with peaks with variable durations, with constant or variable height, shape and/or interpulse interval. This approach was used to compare the CLUSTER and DETECT programs. For both programs, the observed false positive rates estimated using signal-free data were in good agreement with the nominal rates, but in the presence of signal the observed false positive rates were systematically lower. Sensitivity increases with increasing signal/noise ratio, as expected. Program DETECT, using its standard options, provided excellent sensitivity (90-100%) with very low false positive rate under all conditions tested. Its performance could be further improved by the use of a more stringent definition of a peak requiring the presence of "UP" followed by a "DOWN". The CLUSTER program was found to have very poor sensitivity when using the "local variance" option. Use of the true fixed standard deviation or percent coefficient of variation resulted in a modest improvement. Optimal performance of program CLUSTER was obtained by the use of the best of 3 variance models, testing 12 different cluster sizes (from 1×1 to 4×4) and selecting the best among these: under these conditions it can achieve high sensitivity (90-100%) for very low observed false positive rate, such that its performance was comparable to that of DETECT.
The methods developed and illustrated here should permit the definitive characterization and validation of the performance of any one method, the objective comparison of the relative performance of two or more methods for analysis of pulsatile hormone levels for episodic hormone secretion, and lead to the improvement of algorithms for peak detection.


2011 ◽  
Vol 42 (5) ◽  
pp. 895-898 ◽  
Author(s):  
G. Szmukler ◽  
B. Everitt ◽  
M. Leese

Risk assessment is now regarded as a necessary competence in psychiatry. The area under the curve (AUC) statistic of the receiver operating characteristic curve is increasingly offered as the main evidence for accuracy of risk assessment instruments. But even a highly statistically significant AUC is of limited value in clinical practice.


1981 ◽  
Vol 27 (9) ◽  
pp. 1569-1574 ◽  
Author(s):  
E A Robertson ◽  
M H Zweig

Abstract. The usefulness of an analytical system in patient care is ultimately judged not by its analytical performance but by its clinical performance, i.e., its ability to separate apparently similar patients into two subgroups, one of which has a particular clinically important condition and another subgroup which does not. This clinical performance can be studied with the tools of signal detectability theory, originally developed to analyze the performance of radar and data-transmission systems. Each classification made by an analytical system may be categorized as a true-positive, true-negative, false-positive, or false-negative decision. For laboratory tests the proportion of decisions in each category depends on the biological overlap between the two subgroups, the analytical performance of the system, and the decision level chosen. The clinical performance of the analytical system for all possible decision levels is represented by the receiver operating characteristic curve, which plots the true-positive rate against the false-positive rate. The use of these curves permits comparison of alternative analytical techniques at equal true-positive rates and at all possible decision levels. These comparisons show the effect of analytical improvements on clinical performance.
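The construction described in this abstract, one (false-positive rate, true-positive rate) point per decision level, can be sketched in a few lines of Python. The function name `roc_points`, the convention that a result is called positive when it is at or above the decision level, and the toy data are assumptions for illustration; the sketch also assumes that both subgroups are present in the data.

```python
def roc_points(values, has_condition):
    """Return (fpr, tpr) pairs, one per distinct decision level.

    A result is classified as positive when value >= level. Levels are
    swept from the highest observed value to the lowest, so the points
    run from (0, 0) towards (1, 1).
    """
    positives = sum(has_condition)
    negatives = len(has_condition) - positives
    points = [(0.0, 0.0)]  # level above every observed value
    for level in sorted(set(values), reverse=True):
        tp = sum(1 for v, c in zip(values, has_condition) if v >= level and c)
        fp = sum(1 for v, c in zip(values, has_condition) if v >= level and not c)
        points.append((fp / negatives, tp / positives))
    return points

# Toy data: four test results, the two highest from patients with the condition.
results = [1.0, 2.0, 3.0, 4.0]
condition = [False, False, True, True]
curve = roc_points(results, condition)
```

With no overlap between the two subgroups, as in this toy data, the curve passes through the top-left corner (0, 1); greater biological overlap or analytical imprecision pulls the curve towards the diagonal.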


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1894
Author(s):  
Chun Guo ◽  
Zihua Song ◽  
Yuan Ping ◽  
Guowei Shen ◽  
Yuhei Cui ◽  
...  

Remote Access Trojans (RATs) are among the most serious security threats that organizations face today. At present, the two major RAT detection approaches are host-based and network-based. To combine their strengths, this article proposes a phased RAT detection method that combines double-side features (PRATD). In PRATD, both host-side and network-side features are combined to build detection models, which helps distinguish RATs from benign programs because RATs not only generate traffic on the network but also leave traces on the host at run time. In addition, PRATD trains two different detection models for the two runtime states of RATs to improve the True Positive Rate (TPR). Experiments on the network and host records collected from five kinds of benign programs and 20 well-known RATs show that PRATD can effectively detect RATs: it achieves a TPR as high as 93.609% with a False Positive Rate (FPR) as low as 0.407% for known RATs, and a TPR of 81.928% with an FPR of 0.185% for unknown RATs, which suggests it is a competitive candidate for RAT detection.


2021 ◽  
pp. 103985622110286
Author(s):  
Tracey Wade ◽  
Jamie-Lee Pennesi ◽  
Yuan Zhou

Objective: Currently, eligibility for expanded Medicare items for eating disorders (excluding anorexia nervosa) requires a score ⩾ 3 on the 22-item Eating Disorder Examination-Questionnaire (EDE-Q). We compared these EDE-Q “cases” with continuous scores on a validated 7-item version of the EDE-Q (EDE-Q7) to identify an EDE-Q7 cut-off commensurate with 3 on the EDE-Q. Methods: We utilised EDE-Q scores of female university students (N = 337) at risk of developing an eating disorder. We used a receiver operating characteristic (ROC) curve to assess the relationship between the true-positive rate (sensitivity) and the false-positive rate (1-specificity) of cases ⩾ 3. Results: The area under the curve showed outstanding discrimination of 0.94 (95% CI: .92–.97). We examined two specific cut-off points on the EDE-Q7, which included 100% and 87% of true cases, respectively. Conclusion: Given the EDE-Q cut-off for Medicare is used in conjunction with other criteria, we suggest using the more permissive EDE-Q7 cut-off (⩾2.5) to replace use of the EDE-Q cut-off (⩾3) in eligibility assessments.


2021 ◽  
pp. 096228022199595
Author(s):  
Yalda Zarnegarnia ◽  
Shari Messinger

Receiver operating characteristic curves are widely used in medical research to illustrate biomarker performance in binary classification, particularly with respect to disease or health status. Study designs that include related subjects, such as siblings, usually have common environmental or genetic factors giving rise to correlated biomarker data. The design could be used to improve detection of biomarkers informative of increased risk, allowing initiation of treatment to stop or slow disease progression. Available methods for receiver operating characteristic construction do not take advantage of correlation inherent in this design to improve biomarker performance. This paper will briefly review some developed methods for receiver operating characteristic curve estimation in settings with correlated data from case–control designs and will discuss the limitations of current methods for analyzing correlated familial paired data. An alternative approach using conditional receiver operating characteristic curves will be demonstrated. The proposed approach will use information about correlation among biomarker values, producing conditional receiver operating characteristic curves that evaluate the ability of a biomarker to discriminate between affected and unaffected subjects in a familial paired design.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Bianca M. Leca ◽  
Maria Mytilinaiou ◽  
Marina Tsoli ◽  
Andreea Epure ◽  
Simon J. B. Aylwin ◽  
...  

Abstract. Prolactinomas represent the most common type of secretory pituitary neoplasms, with a therapeutic management that varies considerably based on tumour size and degree of hyperprolactinemia. The aim of the current study was to evaluate the relationship between serum prolactin (PRL) concentrations and prolactinoma size, and to determine a cut-off PRL value that could differentiate micro- from macro-prolactinomas. A retrospective cohort study of 114 patients diagnosed with prolactinomas between 2007 and 2017 was conducted. All patients underwent gadolinium-enhanced pituitary MRI, and receiver operating characteristic (ROC) analyses were performed. 51.8% of patients in this study were men, with a mean age at the time of diagnosis of 42.32 ± 15.04 years. 48.2% of the total cohort were found to have microadenomas. Baseline serum PRL concentrations were strongly correlated to tumour dimension (r = 0.750, p = 0.001). When performing the ROC curve analysis, the area under the curve was 0.976, indicating an excellent accuracy of the diagnostic method. For a value of 204 μg/L (4338 mU/L), sensitivity and specificity were calculated at 0.932 and 0.891, respectively. When a cut-off value of 204 μg/L (4338 mU/L) was used, specificity was 93.2%, and sensitivity 89.1%, acceptable to reliably differentiate between micro- and macro-adenomas.

