Confidence interval for micro-averaged F1 and macro-averaged F1 scores

Author(s):  
Kanae Takahashi ◽  
Kouji Yamamoto ◽  
Aya Kuchiba ◽  
Tatsuki Koyama

Abstract: A binary classification problem is common in the medical field, and sensitivity, specificity, accuracy, and negative and positive predictive values are often used as measures of a binary predictor's performance. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier’s performance, the F1 score, defined as the harmonic mean of precision and recall, is widely used in information retrieval and information extraction evaluation because it has favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to multi-class classification. There are three types of F1 scores, and the statistical properties of these F1 scores have hardly ever been discussed. We propose methods based on the large-sample multivariate central limit theorem for estimating F1 scores with confidence intervals.
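The abstract does not spell out which three F1 variants are meant, so the following is only a minimal sketch of two common point estimates: the micro-averaged F1 (counts pooled over classes) and the macro-averaged F1 taken as the unweighted mean of per-class F1 scores. The function name `f1_scores` and the example confusion matrix are assumptions made for illustration, and the confidence-interval construction proposed by the authors is not reproduced here.

```python
def f1_scores(confusion):
    """Return (micro_f1, macro_f1) for a square confusion matrix.

    confusion[i][j] is the number of items whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    tp = [confusion[i][i] for i in range(n)]
    # False positives: column total minus the diagonal entry.
    fp = [sum(confusion[r][i] for r in range(n)) - tp[i] for i in range(n)]
    # False negatives: row total minus the diagonal entry.
    fn = [sum(confusion[i]) - tp[i] for i in range(n)]

    # Micro-averaging pools the counts over classes before forming the score.
    micro_den = 2 * sum(tp) + sum(fp) + sum(fn)
    micro = 2 * sum(tp) / micro_den if micro_den else 0.0

    # Macro-averaging computes F1 per class, then takes the unweighted mean.
    per_class = []
    for i in range(n):
        den = 2 * tp[i] + fp[i] + fn[i]
        per_class.append(2 * tp[i] / den if den else 0.0)
    macro = sum(per_class) / n
    return micro, macro


# Illustrative three-class confusion matrix (made-up counts).
example = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 1, 7],
]
micro_f1, macro_f1 = f1_scores(example)
print(micro_f1, macro_f1)
```

With single-label multi-class data, the micro-averaged F1 coincides with overall accuracy, whereas the macro average weights every class equally regardless of its prevalence.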

2018 ◽  
Vol 7 (4.30) ◽  
pp. 170 ◽  
Author(s):  
Oyebayo Ridwan Olaniran ◽  
Mohd Asrul Affendi Bin Abdullah ◽  
Khuneswari A/P Gopal Pillay ◽  
Saidat Fehintola Olaniran

In this paper, we present a new method called Empirical Bayesian Random Forest (EBRF) for binary classification problems. The prior for the method was obtained using the bootstrap prior technique. EBRF explicitly addresses the low accuracy of the Random Forest (RF) classifier when the number of relevant input variables is small relative to the total number of input variables. The improvement was achieved by replacing the arbitrary subsample variable size with an empirical Bayesian estimate. The proposed and existing methods were illustrated using five high-dimensional microarray datasets from colon, breast, lymphoma and Central Nervous System (CNS) cancer tumours. Results from the data analysis revealed that EBRF provides reasonably higher accuracy, sensitivity, specificity and Area Under the Receiver Operating Characteristic Curve (AUC) than RF in most of the datasets used.


2007 ◽  
Vol 16 (01) ◽  
pp. 1-15 ◽  
Author(s):  
LI ZHANG ◽  
WEI-DA ZHOU ◽  
TIAN-TIAN SU ◽  
LI-CHENG JIAO

A new multi-class classifier, the decision tree SVM (DTSVM), which is a binary decision tree with a very simple structure, is presented in this paper. In DTSVM, a multi-class classification problem is decomposed into a series of binary classification problems. Here, the binary decision tree is generated using a kernel clustering algorithm, and each non-leaf node represents one binary classification problem. Compared with other multi-class classification methods based on binary classification SVMs, DTSVM has smaller scale and complexity, needs fewer support vectors, and has faster test speed. The final simulation results confirm the feasibility and validity of DTSVM.
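The decomposition into a binary tree can be sketched as follows. This is not the authors' construction: the abstract says the tree is generated by kernel clustering, whereas this sketch simply halves the class list at each node as a stand-in, and the helper names are hypothetical. It only illustrates the structural point that N classes give N - 1 non-leaf nodes (binary problems) and that a test sample visits at most as many classifiers as the tree is deep.

```python
def build_tree(classes):
    """Recursively split a list of class labels into a binary tree.

    A leaf is a single label (a string); an internal node is a (left, right)
    tuple. Halving the list is a placeholder for the kernel clustering step.
    """
    if len(classes) == 1:
        return classes[0]
    mid = len(classes) // 2
    return (build_tree(classes[:mid]), build_tree(classes[mid:]))


def count_internal_nodes(tree):
    """Each internal node corresponds to one binary classification problem."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + count_internal_nodes(tree[0]) + count_internal_nodes(tree[1])


def depth(tree):
    """Upper bound on the number of binary decisions needed at test time."""
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth(tree[0]), depth(tree[1]))


tree = build_tree(["a", "b", "c", "d", "e"])
print(tree, count_internal_nodes(tree), depth(tree))
```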


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
E. Ghasemian ◽  
M. K. Tavassoly

Abstract: We present a theoretical scheme for the generation of stationary entangled states. To this end, we consider an open quantum system consisting of a two-qubit system immersed in a thermal bath, which acts as the source of dissipation, and then analytically solve the corresponding quantum master equation. We generate two classes of stationary entangled states: the Werner-like and the maximally entangled mixed states. Since the solution of the system depends on its initial state, we can manipulate it and construct a robust Bell-like state. We then analytically obtain the population and coherence of the two-qubit system and show that the residual coherence can be maintained even in the equilibrium condition. Finally, we successfully encode our two-qubit system to solve a binary classification problem. We demonstrate that the introduced classifiers achieve high accuracy without requiring any iterative method. In addition, we show that the quantum-based classifiers beat the classical ones.


2020 ◽  
Vol 30 (1) ◽  
Author(s):  
Michael O. Olusola ◽  
Sydney I. Onyeagu

This paper is centred on a binary classification problem in which a new object with multivariate features is to be assigned to one of two distinct populations, based on historical sets of samples from the two populations. A linear discriminant analysis framework, called the minimised sum of deviations by proportion (MSDP), has been proposed to model the binary classification problem. In the MSDP formulation, the sum of the proportions of exterior deviations is minimised subject to the group separation constraints, the normalisation constraint, the upper bound constraints on proportions of exterior deviations and the sign unrestriction vis-à-vis the non-negativity constraints. The two-phase method in linear programming is adopted as a solution technique to generate the discriminant function. The decision rule on group-membership prediction is constructed using the apparent error rate. The performance of the MSDP has been compared with some existing linear discriminant models using a previously published dataset on road casualties. The MSDP model was more promising and well suited for the imbalanced dataset on road casualties.


2019 ◽  
Vol 9 (15) ◽  
pp. 3007
Author(s):  
Dengyong Zhang ◽  
Shanshan Wang ◽  
Jin Wang ◽  
Arun Kumar Sangaiah ◽  
Feng Li ◽  
...  

There are many image resizing techniques, including scaling, scale-and-stretch, seam carving, and so on. Each has its own advantages and suits different application scenarios. Therefore, a universal method for detecting tampering by image resizing is more practical. In preliminary experiments, we found that whichever image resizing technique is adopted, it destroys local texture and spatial correlations among adjacent pixels to some extent. Motivated by the excellent performance of local Tchebichef moments (LTM) in texture classification, we present in this paper a method that uses LTM to detect tampering by image resizing. The tampered images are obtained by removing pixels from original images using image resizing (scaling, scale-and-stretch and seam carving). Firstly, the residual is obtained by image pre-processing. Then, the histogram features of LTM are extracted from the residual. Finally, an error-correcting output code strategy is adopted by ensemble learning, which turns a multi-class classification problem into binary classification sub-problems. Experimental results show that the proposed approach achieves acceptable detection accuracy for the three content-aware image re-targeting techniques.
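The error-correcting output code idea mentioned above can be sketched in a few lines. The abstract does not give the code matrix or the class set, so the matrix, the class names and the function names below are assumptions for illustration only. Each column stands for one binary sub-problem, and a sample is assigned to the class whose codeword is nearest in Hamming distance to the vector of binary outputs.

```python
# Hypothetical code matrix: one codeword per class, one bit per binary
# sub-problem. The minimum pairwise Hamming distance here is 3, so a single
# wrong binary output can still be corrected.
CODE_MATRIX = {
    "class_a": (0, 0, 1, 1, 0),
    "class_b": (0, 1, 0, 1, 1),
    "class_c": (1, 0, 0, 0, 1),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def decode(bits, code_matrix=CODE_MATRIX):
    """Return the class whose codeword is closest to the observed bits."""
    return min(code_matrix, key=lambda label: hamming(code_matrix[label], bits))


# All five binary classifiers agree with the codeword of class_a.
print(decode((0, 0, 1, 1, 0)))
# The last binary classifier is wrong, yet the nearest codeword is unchanged.
print(decode((0, 0, 1, 1, 1)))
```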


Author(s):  
Shaohua Kevin Zhou ◽  
Jie Shao ◽  
Bogdan Georgescu ◽  
Dorin Comaniciu

Motion estimation necessitates an appropriate choice of similarity function. Because generic similarity functions derived from simple assumptions are insufficient to model complex yet structured appearance variations in motion estimation, the authors propose to learn a discriminative similarity function to match images under varying appearances by casting image matching into a binary classification problem. They use the LogitBoost algorithm to learn the classifier based on an annotated database that exemplifies the structured appearance variations: an image pair in correspondence is positive and an image pair out of correspondence is negative. To leverage the additional distance structure of negatives, they present a location-sensitive cascade training procedure that bootstraps negatives for later stages of the cascade from the regions closer to the positives, which enables viewing a large number of negatives and steering the training process to yield lower training and test errors. The authors apply the learned similarity function to estimating the motion of the endocardial wall of the left ventricle in echocardiography and to performing visual tracking. They obtain improved performance when comparing the learned similarity function with conventional ones.


AI Magazine ◽  
2012 ◽  
Vol 33 (2) ◽  
pp. 79 ◽  
Author(s):  
Philip A. Warrick ◽  
Emily F. Hamilton ◽  
Robert E. Kearney ◽  
Doina Precup

Labor monitoring is crucial in modern health care, as it can be used to detect (and help avoid) significant problems with the fetus. In this article we focus on detecting hypoxia (or oxygen deprivation), a very serious condition that can arise from different pathologies and can lead to life-long disability and death. We present a novel approach to hypoxia detection based on recordings of the uterine pressure and fetal heart rate, which are obtained using standard labor monitoring devices. The key idea is to learn models of the fetal response to signals from its environment. Then, we use the parameters of these models as attributes in a binary classification problem. A running count of pathological classifications over several time periods is taken to provide the current label for the fetus. We use a unique database of real clinical recordings, from both normal and pathological cases. Our approach correctly classifies more than half of the pathological cases 1.5 hours before delivery. These are cases that were missed by clinicians; early detection of this type would have allowed the physician to perform a Caesarean section, possibly avoiding the negative outcome.
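The running-count labelling step can be sketched roughly as below. The abstract gives neither the window length nor the decision threshold, so `window`, `threshold` and the function name are hypothetical values chosen only to make the sketch runnable; the per-period classifier itself is not modelled.

```python
from collections import deque


def running_labels(epoch_labels, window=4, threshold=3):
    """Turn per-period classifications into a current label per period.

    epoch_labels: iterable of 0/1 values (1 = period classified pathological).
    The current label is True when at least `threshold` of the last `window`
    periods were classified as pathological. Both values are assumptions.
    """
    recent = deque(maxlen=window)  # keeps only the most recent periods
    current = []
    for label in epoch_labels:
        recent.append(1 if label else 0)
        current.append(sum(recent) >= threshold)
    return current


labels = running_labels([1, 0, 1, 1, 1, 0, 0, 0])
print(labels)
```

Counting over several periods rather than reacting to a single classification trades a short delay for fewer spurious alarms.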


Author(s):  
DAYAN MANOHAR SIVALINGAM ◽  
NARENKUMAR PANDIAN ◽  
JEZEKIEL BEN-ARIE

In this work, we develop an efficient technique to transform a multiclass recognition problem into a minimal binary classification problem using the Minimal Classification Method (MCM). The MCM requires only log2 N classifications, whereas the other methods require many more. For the classification, we use Support Vector Machine (SVM) based binary classifiers since they have superior generalization performance. Unlike the prevalent one-versus-one strategy (the bottom-up one-versus-one strategy is called the tournament method), which separates only two classes at each classification, the binary classifiers in our method have to separate two groups of multiple classes. As a result, the probability of generalization error increases. This problem is alleviated by utilizing error-correcting codes, which results in only a marginal increase in the required number of classifications. In comparison to the tournament method, our method requires only 50% of the classifications while still attaining similar performance. The proposed solution is tested with the Columbia Object Image Library (COIL). We also test the performance under conditions of noise and occlusion.
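One way to read the log2 N figure is that each class index is written in binary and one classifier is trained per bit. The sketch below assumes that reading; the grouping rule, the rounding up for N that is not a power of two, and all function names are illustrative assumptions rather than the authors' procedure, and the error-correcting extension is omitted.

```python
def num_binary_classifiers(n_classes):
    """Smallest number of bits that can index n_classes distinct classes."""
    return max(1, (n_classes - 1).bit_length())


def class_groups(n_classes):
    """For each bit position, the set of classes placed in the positive group.

    Classifier `bit` separates classes whose index has that bit set from all
    the others, so every classifier splits two groups of multiple classes.
    """
    k = num_binary_classifiers(n_classes)
    return [{c for c in range(n_classes) if (c >> bit) & 1} for bit in range(k)]


def decode(bits):
    """Combine the binary outputs (least significant bit first) into a class index."""
    return sum(b << i for i, b in enumerate(bits))


print(num_binary_classifiers(8))   # bits needed for 8 classes
print(class_groups(4))             # positive groups for a 4-class problem
print(decode([1, 0, 1]))           # class index recovered from three outputs
```

Because every bit matters in this encoding, a single wrong binary output changes the decoded class, which is consistent with the abstract's remark that error-correcting codes are added at a small cost in extra classifications.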

