An Alternative Approach for Evaluating the Binormal ROC Curve

2013 ◽  
Vol 4 (1) ◽  
pp. 1-17 ◽  
Author(s):  
R. Amala ◽  
R. Vishnu Vardhan

In recent years, ROC curve analysis has gained attention in a wide range of fields. Depending on the data pattern and its distribution, various forms of ROC models have been derived. In this paper, the authors assume that the data of the two populations (healthy and diseased) follow a normal distribution, one of the most commonly used forms under the parametric approach. The paper provides an alternative approach to the trade-off plot of the ROC curve and the computation of the AUC using a sigmoid-shaped special function, the error function. It is assumed that the test scores of a particular biomarker are normally distributed. The work develops a new construction of the Binormal ROC curve in terms of the error function, referred to as the ErROC curve. The summary measure AUC of the resulting ErROC curve is estimated and defined as ErAUC. The authors also derive an expression for the optimal cut-off point. Unlike the conventional Binormal ROC model, the new ErROC curve model provides the true positive rate at every value of the false positive rate.
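The Binormal ROC curve and its AUC can be written entirely in terms of the error function, since the standard normal CDF satisfies Φ(x) = ½(1 + erf(x/√2)). The sketch below illustrates this reformulation with hypothetical distribution parameters; it is an illustration of the idea, not the authors' exact ErROC/ErAUC derivation.

```python
# A minimal sketch: the standard binormal ROC expressed via erf/erfinv,
# with made-up healthy/diseased score distributions for illustration.
import numpy as np
from scipy.special import erf, erfinv

def phi(x):
    """Standard normal CDF expressed through the error function."""
    return 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

def phi_inv(p):
    """Standard normal quantile expressed through the inverse error function."""
    return np.sqrt(2.0) * erfinv(2.0 * p - 1.0)

def er_roc(fpr, mu0, sigma0, mu1, sigma1):
    """TPR of the binormal ROC curve at a given FPR."""
    a = (mu1 - mu0) / sigma1
    b = sigma0 / sigma1
    return phi(a + b * phi_inv(fpr))

def er_auc(mu0, sigma0, mu1, sigma1):
    """Closed-form binormal AUC, again through the error function."""
    a = (mu1 - mu0) / sigma1
    b = sigma0 / sigma1
    return phi(a / np.sqrt(1.0 + b ** 2))

# Hypothetical parameters: healthy ~ N(0, 1), diseased ~ N(1.5, 1.2^2).
fpr_grid = np.linspace(1e-6, 1 - 1e-6, 201)
tpr_grid = er_roc(fpr_grid, mu0=0.0, sigma0=1.0, mu1=1.5, sigma1=1.2)
print(round(er_auc(0.0, 1.0, 1.5, 1.2), 3))
```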

2017 ◽  
Vol 28 (1) ◽  
pp. 184-195 ◽  
Author(s):  
Hanfang Yang ◽  
Kun Lu ◽  
Xiang Lyu ◽  
Feifang Hu

Simultaneous control of the true positive rate and the false positive rate is of significant importance in the performance evaluation of diagnostic tests. Most of the established literature uses the partial area under the receiver operating characteristic (ROC) curve with restrictions only on the false positive rate (FPR), called FPR pAUC, as a performance measure. However, its merely indirect control of the true positive rate (TPR) is conceptually and practically misleading. In this paper, a novel and intuitive performance measure, named two-way pAUC, is proposed, which directly quantifies the partial area under the ROC curve with explicit restrictions on both TPR and FPR. To estimate two-way pAUC, we devise a nonparametric estimator. Based on this estimator, a bootstrap-assisted testing method for two-way pAUC comparison is established. Moreover, to evaluate possible covariate effects on two-way pAUC, a regression analysis framework is constructed. Asymptotic normality results for the proposed methods are provided. Advantages of the proposed methods are illustrated with simulations and the Wisconsin Breast Cancer Data. The methods are implemented in the publicly available R package tpAUC.
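As one plausible reading of the measure, the sketch below estimates a "two-way" partial AUC nonparametrically as the area of the region under the empirical ROC curve that satisfies both an FPR upper bound and a TPR lower bound; the authors' R package tpAUC implements their exact estimator and tests.

```python
# Illustrative nonparametric sketch of a two-way partial AUC: the area under the
# empirical ROC curve restricted to FPR <= fpr_max and TPR >= tpr_min.
import numpy as np
from sklearn.metrics import roc_curve

def two_way_pauc(y_true, y_score, fpr_max=0.2, tpr_min=0.8):
    fpr, tpr, _ = roc_curve(y_true, y_score)
    # Dense FPR grid on [0, fpr_max]; the empirical ROC is a step function.
    grid = np.linspace(0.0, fpr_max, 2001)
    roc_on_grid = np.interp(grid, fpr, tpr)
    # Keep only the part of the curve lying above the TPR restriction.
    excess = np.clip(roc_on_grid - tpr_min, 0.0, None)
    step = grid[1] - grid[0]
    return step * (excess[:-1] + excess[1:]).sum() / 2.0  # trapezoid rule

rng = np.random.default_rng(0)
y = np.r_[np.zeros(500, dtype=int), np.ones(500, dtype=int)]
scores = np.r_[rng.normal(0, 1, 500), rng.normal(1.5, 1, 500)]
print(round(two_way_pauc(y, scores), 4))
```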


2020 ◽  
Vol 9 (6) ◽  
pp. 357-364
Author(s):  
Johannes Reschke ◽  
Cornelius Neumann ◽  
Stephan Berlitz

In everyday traffic, pedestrians rely on informal communication with other road users. In the case of automated vehicles, this communication can be replaced by light signals, which need to be learned beforehand. Prior to an extensive introduction of automated vehicles, a learning phase for these light signals can be set up during manual driving with the help of driver intention prediction. To this end, a three-stage algorithm consisting of a neural network, a random forest, and a conditional stage is implemented. Using this algorithm, a true-positive rate (TPR) of 94.0% at a 5.0% false-positive rate (FPR) can be achieved. To improve this further, a personalization procedure based on driver-specific behaviours is implemented, resulting in TPRs ranging from 91.5% to 96.6% at an FPR of 5.0%. Transfer learning of the neural networks improves the prediction accuracy for almost all drivers. Before the implemented algorithm can be introduced into today's traffic, the FPR in particular has to be improved considerably.
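The staged structure can be rendered schematically: a neural network produces a first score, a random forest consumes the features together with that score, and a final conditional rule thresholds the output to hit a target FPR. Everything below (features, models, the 5% FPR rule) is a hypothetical stand-in for the paper's actual stages, and it evaluates on its training data purely for brevity.

```python
# Schematic three-stage pipeline sketch: neural network -> random forest ->
# conditional thresholding at a target false-positive rate.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 8))  # placeholder driving-signal features
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=2000) > 0).astype(int)

# Stage 1: neural network score.
nn = MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0).fit(X, y)
# Stage 2: random forest consumes the raw features plus the stage-1 score.
X_stage2 = np.c_[X, nn.predict_proba(X)[:, 1]]
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_stage2, y)
rf_score = rf.predict_proba(X_stage2)[:, 1]

# Stage 3: conditional rule, here a threshold chosen to cap the FPR at 5%.
threshold = np.quantile(rf_score[y == 0], 0.95)
prediction = (rf_score >= threshold).astype(int)
print(f"TPR at ~5% FPR: {prediction[y == 1].mean():.3f}")
```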


2017 ◽  
Vol 2017 ◽  
pp. 1-14 ◽  
Author(s):  
Ilker Unal

ROC curve analysis is often applied to measure the diagnostic accuracy of a biomarker. The analysis yields two results: the diagnostic accuracy of the biomarker and the optimal cut-point value. Many methods have been proposed in the literature for obtaining the optimal cut-point value. In this study, a new approach, alternative to these methods, is proposed. The proposed approach is based on the value of the area under the ROC curve: it defines the optimal cut-point as the value whose sensitivity and specificity are closest to the area under the ROC curve and for which the absolute difference between sensitivity and specificity is minimal. This approach is very practical. The results of the proposed method are compared with those of the standard approaches using simulated data under different distribution and homogeneity conditions as well as real data. According to the simulation results, the proposed method is recommended for finding the true cut-point.
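The rule described above can be sketched directly from an empirical ROC curve: compute the AUC, then choose the threshold whose sensitivity and specificity are jointly closest to the AUC, breaking ties by the smallest sensitivity-specificity gap. The data below are simulated for illustration and the implementation is a sketch of the idea rather than the author's published code.

```python
# AUC-based cut-point sketch: minimise |Se - AUC| + |Sp - AUC|, then |Se - Sp|.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

def auc_based_cutpoint(y_true, y_score):
    auc = roc_auc_score(y_true, y_score)
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    sens, spec = tpr, 1.0 - fpr
    closeness = np.abs(sens - auc) + np.abs(spec - auc)
    gap = np.abs(sens - spec)
    # Lexicographic choice: closeness to the AUC first, then the Se/Sp gap.
    order = np.lexsort((gap, closeness))
    return thresholds[order[0]], auc

rng = np.random.default_rng(2)
y = np.r_[np.zeros(300, dtype=int), np.ones(300, dtype=int)]
scores = np.r_[rng.normal(0, 1, 300), rng.normal(1.2, 1, 300)]
cut, auc = auc_based_cutpoint(y, scores)
print(f"AUC = {auc:.3f}, suggested cut-point = {cut:.3f}")
```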


Electronics ◽  
2020 ◽  
Vol 9 (11) ◽  
pp. 1894
Author(s):  
Chun Guo ◽  
Zihua Song ◽  
Yuan Ping ◽  
Guowei Shen ◽  
Yuhei Cui ◽  
...  

Remote Access Trojans (RATs) are among the most serious security threats that organizations face today. At present, the two major RAT detection approaches are host-based and network-based detection. To complement each other's strengths, this article proposes a phased RAT detection method that combines double-side features (PRATD). In PRATD, host-side and network-side features are combined to build detection models, which helps distinguish RATs from benign programs because RATs not only generate traffic on the network but also leave traces on the host at run time. In addition, PRATD trains two different detection models for the two runtime states of RATs in order to improve the True Positive Rate (TPR). Experiments on network and host records collected from five kinds of benign programs and 20 well-known RATs show that PRATD can effectively detect RATs: it achieves a TPR as high as 93.609% with a False Positive Rate (FPR) as low as 0.407% for known RATs, and a TPR of 81.928% with an FPR of 0.185% for unknown RATs, which suggests it is a competitive candidate for RAT detection.
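A schematic rendering of the double-side idea is sketched below: host-side and network-side feature vectors are concatenated, one model is trained per assumed runtime state, and a sample is flagged if either model fires. The features, classifier, and combination rule are illustrative assumptions rather than the paper's actual design.

```python
# Double-side feature sketch: concatenate host and network features, train one
# model per assumed runtime state, and OR the decisions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(3)

def make_samples(n, shift):
    host = rng.normal(loc=shift, size=(n, 6))      # e.g. process/registry traces
    network = rng.normal(loc=shift, size=(n, 10))  # e.g. flow statistics
    return np.hstack([host, network])

X_benign, X_rat = make_samples(1000, 0.0), make_samples(200, 0.8)
X = np.vstack([X_benign, X_rat])
y = np.r_[np.zeros(1000, dtype=int), np.ones(200, dtype=int)]

# One model per assumed runtime state; the same toy data stands in for both.
models = {state: RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
          for state in ("state_1", "state_2")}

# Flag a sample as a RAT if either state-specific model fires.
sample = X_rat[:5]
alerts = np.any([m.predict(sample) for m in models.values()], axis=0)
print(alerts)
```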


2021 ◽  
pp. 103985622110286
Author(s):  
Tracey Wade ◽  
Jamie-Lee Pennesi ◽  
Yuan Zhou

Objective: Currently, eligibility for expanded Medicare items for eating disorders (excluding anorexia nervosa) requires a score ⩾ 3 on the 22-item Eating Disorder Examination-Questionnaire (EDE-Q). We compared these EDE-Q "cases" with continuous scores on a validated 7-item version of the EDE-Q (EDE-Q7) to identify an EDE-Q7 cut-off commensurate with a score of 3 on the EDE-Q. Methods: We utilised EDE-Q scores of female university students (N = 337) at risk of developing an eating disorder. We used a receiver operating characteristic (ROC) curve to assess the relationship between the true-positive rate (sensitivity) and the false-positive rate (1 − specificity) for cases scoring ⩾ 3. Results: The area under the curve showed outstanding discrimination of 0.94 (95% CI: .92–.97). We examined two specific cut-off points on the EDE-Q7, which captured 100% and 87% of true cases, respectively. Conclusion: Given that the EDE-Q cut-off for Medicare is used in conjunction with other criteria, we suggest using the more permissive EDE-Q7 cut-off (⩾2.5) in place of the EDE-Q cut-off (⩾3) in eligibility assessments.
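The ROC procedure can be sketched on simulated data: an EDE-Q global score ⩾ 3 defines cases, the EDE-Q7 score serves as the predictor, and sensitivity/specificity are inspected at candidate cut-offs. The simulated scores and their correlation below are illustrative only, not the study data.

```python
# ROC-based cut-off sketch for a short-form questionnaire against full-scale cases.
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

rng = np.random.default_rng(4)
edeq_global = np.clip(rng.normal(2.0, 1.2, 337), 0, 6)              # 22-item global score
edeq7 = np.clip(0.9 * edeq_global + rng.normal(0, 0.4, 337), 0, 6)  # 7-item score
cases = (edeq_global >= 3).astype(int)                              # case definition

print(f"AUC = {roc_auc_score(cases, edeq7):.2f}")

fpr, tpr, thresholds = roc_curve(cases, edeq7)
# Highest EDE-Q7 cut-off that still captures 100% of cases (sensitivity = 1).
print(f"cut-off with full sensitivity: {thresholds[tpr >= 1.0].max():.2f}")

for cut in (2.5, 3.0):  # sensitivity/specificity at candidate cut-offs
    sens = (edeq7[cases == 1] >= cut).mean()
    spec = (edeq7[cases == 0] < cut).mean()
    print(f"cut-off {cut}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```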


2016 ◽  
Vol 24 (2) ◽  
pp. 263-272 ◽  
Author(s):  
Kosuke Imai ◽  
Kabir Khanna

In both political behavior research and voting rights litigation, turnout and vote choice for different racial groups are often inferred using aggregate election results and racial composition. Over the past several decades, many statistical methods have been proposed to address this ecological inference problem. We propose an alternative method that reduces aggregation bias by predicting individual-level ethnicity from voter registration records. Building on the existing methodological literature, we use Bayes's rule to combine the Census Bureau's Surname List with various information from geocoded voter registration records. We evaluate the performance of the proposed methodology using approximately nine million voter registration records from Florida, where self-reported ethnicity is available. We find that it is possible to reduce the false positive rate among Black and Latino voters to 6% and 3%, respectively, while maintaining a true positive rate above 80%. Moreover, we use our predictions to estimate turnout by race and find that our estimates yield substantially less bias and lower root mean squared error than standard ecological inference estimates. We provide open-source software implementing the proposed methodology.
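The Bayes-rule combination can be sketched with made-up probability tables: surname-conditional race probabilities (as in the Census Surname List) are updated with geography-conditional probabilities, under the usual assumption that surname and location are independent given race. All numbers below are invented for illustration and are not from the Census Bureau or the Florida records.

```python
# Schematic Bayes-rule update: P(race | surname, geo) ∝ P(race | surname) * P(race | geo) / P(race)
races = ["white", "black", "hispanic", "asian", "other"]

p_race_given_surname = {            # e.g. from a surname list (made-up values)
    "garcia": [0.05, 0.005, 0.92, 0.014, 0.011],
}
p_race_given_geo = {                # e.g. from block-level Census counts (made-up values)
    "precinct_12": [0.55, 0.20, 0.18, 0.05, 0.02],
}
p_race_marginal = [0.62, 0.13, 0.17, 0.06, 0.02]   # statewide prior (made-up values)

def posterior(surname, geo):
    ps = p_race_given_surname[surname]
    pg = p_race_given_geo[geo]
    # Combine the two sources and renormalise.
    unnorm = [s * g / m for s, g, m in zip(ps, pg, p_race_marginal)]
    total = sum(unnorm)
    return {r: round(u / total, 3) for r, u in zip(races, unnorm)}

print(posterior("garcia", "precinct_12"))
```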


Author(s):  
Yosef S. Razin ◽  
Jack Gale ◽  
Jiaojiao Fan ◽  
Jaznae’ Smith ◽  
Karen M. Feigh

This paper evaluates Banks et al.'s Human-AI Shared Mental Model theory by examining how a self-driving vehicle's hazard assessment facilitates shared mental models. Participants were asked to affirm the vehicle's assessment of road objects as either hazards or mistakes in real time as behavioral and subjective measures were collected. The baseline performance of the AI was purposefully low (<50%) to examine how the human's shared mental model might lead to inappropriate compliance. Results indicated that while the participants' true positive rate was high, overall performance was reduced by a large false positive rate, indicating that participants were indeed being influenced by the AI's faulty assessments, despite full transparency as to the ground truth. Both performance and compliance were directly affected by frustration as well as mental and even physical demand. Dispositional factors such as faith in other people's cooperativeness and in technology companies were also significant. Thus, our findings strongly support the theory that shared mental models play a measurable role in performance and compliance, in a complex interplay with trust.


2014 ◽  
Author(s):  
Andreas Tuerk ◽  
Gregor Wiktorin ◽  
Serhat Güler

Quantification of RNA transcripts with RNA-Seq is inaccurate due to positional fragment bias, which is not represented appropriately by current statistical models of RNA-Seq data. This article introduces the Mix² (read "mixquare") model, which uses a mixture of probability distributions to model the transcript-specific positional fragment bias. The parameters of the Mix² model can be trained efficiently with the Expectation-Maximization (EM) algorithm, yielding simultaneous estimates of the transcript abundances and the transcript-specific positional biases. Experiments are conducted on synthetic data and on the Universal Human Reference (UHR) and Human Brain Reference (HBR) samples from the MicroArray Quality Control (MAQC) data set. Comparing the correlation between qPCR and FPKM values against the state-of-the-art methods Cufflinks and PennSeq, we obtain an increase in R² from 0.44 to 0.6 and from 0.34 to 0.54, respectively. In the detection of differential expression between UHR and HBR, the true positive rate increases from 0.44 to 0.71 at a false positive rate of 0.1. Finally, the Mix² model is used to investigate biases present in the MAQC data. This reveals 5 dominant biases which deviate from the common assumption of a uniform fragment distribution. The Mix² software is available at http://www.lexogen.com/fileadmin/uploads/bioinfo/mix2model.tgz.
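The EM machinery behind fitting a positional-bias mixture can be illustrated on a toy problem: a two-component Gaussian mixture fitted to simulated relative fragment positions on a single transcript. The actual Mix² model couples transcript abundances with transcript-specific bias components and is considerably richer; this is only a sketch of the EM idea.

```python
# Toy EM fit of a two-component Gaussian mixture over relative fragment positions.
import numpy as np

rng = np.random.default_rng(5)
# Simulated 3'-biased fragment start positions on one transcript, in [0, 1].
pos = np.clip(np.r_[rng.normal(0.75, 0.1, 800), rng.normal(0.4, 0.2, 200)], 0, 1)

K = 2
w = np.full(K, 1.0 / K)
mu = np.array([0.3, 0.7])
sigma = np.array([0.2, 0.2])

def normal_pdf(x, m, s):
    return np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))

for _ in range(100):
    # E-step: responsibilities of each component for each fragment.
    resp = np.stack([w[k] * normal_pdf(pos, mu[k], sigma[k]) for k in range(K)])
    resp /= resp.sum(axis=0, keepdims=True)
    # M-step: re-estimate weights, means, and spreads from the responsibilities.
    n_k = resp.sum(axis=1)
    w = n_k / len(pos)
    mu = (resp * pos).sum(axis=1) / n_k
    sigma = np.sqrt((resp * (pos - mu[:, None]) ** 2).sum(axis=1) / n_k)

print(np.round(w, 3), np.round(mu, 3), np.round(sigma, 3))
```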


2021 ◽  
Author(s):  
Shloak Rathod

The proliferation of online media allows for the rapid dissemination of unmoderated news, unfortunately including fake news. The extensive spread of fake news poses a potent threat to both individuals and society. This paper focuses on designing author profiles to detect authors who are primarily engaged in publishing fake news articles. We build on the hypothesis that authors who write fake news repeatedly write only fake news articles, at least over short-term periods. Fake news authors have a distinct writing style compared to real news authors, who naturally want to maintain trustworthiness. We explore the potential to detect fake news authors by designing author profiles based on writing style, sentiment, and co-authorship patterns. We evaluate our approach using a publicly available dataset with over 5000 authors and 20000 articles. For our evaluation, we build and compare different classes of supervised machine learning models. We find that the K-NN model performs best and can detect authors who are prone to writing fake news with an 83% true positive rate at only a 5% false positive rate.
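The profiling idea can be sketched as follows: each author is represented by a small vector of style/sentiment/co-authorship features, and a K-NN classifier separates fake-news-prone authors from the rest. The features and data below are toy placeholders, not the paper's feature set or dataset.

```python
# Author-profile sketch: per-author feature vectors classified with K-NN.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n_authors = 400
# Hypothetical per-author features: avg sentence length, exclamation rate,
# sentiment polarity, co-authorship degree.
X = rng.normal(size=(n_authors, 4))
y = (X[:, 1] + 0.8 * X[:, 2] + rng.normal(scale=0.7, size=n_authors) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
pred = knn.predict(X_te)

tpr = pred[y_te == 1].mean()   # true positive rate on held-out authors
fpr = pred[y_te == 0].mean()   # false positive rate on held-out authors
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```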


2021 ◽  
Vol 2021 ◽  
pp. 1-17
Author(s):  
Hongcheng Zou ◽  
Ziling Wei ◽  
Jinshu Su ◽  
Baokang Zhao ◽  
Yusheng Xia ◽  
...  

A website fingerprinting (WFP) attack enables identifying the websites a user is browsing even under the protection of privacy-enhancing technologies (PETs). Previous studies demonstrate that most machine-learning attacks need multiple types of features as input, which induces tremendous feature engineering work. We show an alternative: we present Probabilistic Fingerprinting (PF), a new website fingerprinting attack that leverages only one type of feature. These features are produced by PWFP, a mathematical model that combines a probabilistic topic model with WFP for the first time, motivated by the observation that a plain-text document and the sequence file generated from a traffic instance are essentially analogous. Experimental results show that the proposed features are more discriminative than existing features. In a closed-world setting, PF attains better accuracy (up to 99.79%) than prior attacks on various datasets gathered in the Shadowsocks, SSH, and TLS scenarios. Moreover, even when the number of training instances drops to as few as 4, PF still reaches an accuracy above 90%. In the more realistic open-world setting, PF attains a high true positive rate (TPR) and Bayes detection rate (BDR) and a low false positive rate (FPR) in all evaluations, outperforming the other attacks. These results highlight that it is meaningful and possible to explore new features to improve the accuracy of WFP attacks.
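The single-feature idea can be rendered schematically: packet-size/direction tokens from each traffic trace are treated as the "words" of a document, a topic model turns each trace into a topic-proportion vector, and a standard classifier fingerprints websites from those vectors alone. The tokenisation, the use of LDA, the classifier, and the synthetic traces below are illustrative assumptions, not the PWFP model itself.

```python
# Topic-model features for closed-world website fingerprinting (schematic sketch).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
sites, traces, labels = 5, [], []
vocab = [f"up{b}" for b in range(10)] + [f"down{b}" for b in range(10)]
base = np.linspace(1.0, 3.0, len(vocab))
for site in range(sites):
    p = np.roll(base, 3 * site)
    p = p / p.sum()                     # each site has its own token distribution
    for _ in range(40):
        tokens = rng.choice(vocab, size=300, p=p)
        traces.append(" ".join(tokens))
        labels.append(site)

counts = CountVectorizer().fit_transform(traces)
topics = LatentDirichletAllocation(n_components=8, random_state=0).fit_transform(counts)

X_tr, X_te, y_tr, y_te = train_test_split(topics, labels, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(f"closed-world accuracy: {clf.score(X_te, y_te):.2f}")
```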

