Notes on the H-measure of classifier performance

Author(s):  
D. J. Hand ◽  
C. Anagnostopoulos

Abstract: The H-measure is a classifier performance measure that takes the context of application into account without requiring a single fixed value of the relative misclassification costs to be set. Since its introduction in 2009 it has become widely adopted. This paper answers various queries that users have raised since its introduction, including questions about its interpretation, the choice of a weighting function, whether it is strictly proper, and its coherence, and relates the measure to other work.
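The H-measure can be sketched numerically: misclassification cost is integrated over a cost distribution (Hand's default is a symmetric Beta(2,2)) rather than fixed at one value, and the result is normalised against the loss of the best trivial classifier. The following is a minimal illustrative sketch, not the reference implementation; the function name and grid-based quadrature are choices made here for clarity:

```python
import numpy as np
from math import gamma

def beta_pdf(c, a, b):
    """Density of the Beta(a, b) cost-weighting distribution."""
    return gamma(a + b) / (gamma(a) * gamma(b)) * c ** (a - 1) * (1 - c) ** (b - 1)

def h_measure_approx(y, scores, a=2.0, b=2.0, grid=1000):
    """Grid approximation of the H-measure for binary labels y (0/1)
    and classifier scores (higher = more likely class 1)."""
    y = np.asarray(y)
    s = np.asarray(scores, dtype=float)
    pi1 = y.mean()                      # class priors
    pi0 = 1.0 - pi1
    # candidate thresholds: -inf ("predict all 1") plus each observed score
    thr = np.concatenate(([-np.inf], np.unique(s)))
    F0 = np.array([(s[y == 0] <= t).mean() for t in thr])  # class-0 score cdf
    F1 = np.array([(s[y == 1] <= t).mean() for t in thr])  # class-1 score cdf
    c = (np.arange(grid) + 0.5) / grid  # cost grid on (0, 1)
    w = beta_pdf(c, a, b) / grid        # quadrature weights
    # expected loss for every (cost, threshold) pair; minimise over thresholds
    loss = (c[:, None] * pi0 * (1 - F0[None, :])
            + (1 - c[:, None]) * pi1 * F1[None, :])
    L = (loss.min(axis=1) * w).sum()
    # reference loss of the best trivial (single-class) classifier
    L_max = (np.minimum(c * pi0, (1 - c) * pi1) * w).sum()
    return 1.0 - L / L_max
```

Under this construction H lies in [0, 1]: a classifier whose scores perfectly separate the classes scores 1, while one that never beats a trivial always-one-class rule scores 0.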

2016 ◽  
Author(s):  
Julian Zubek ◽  
Dariusz M Plewczynski

We describe a method for assessing data set complexity based on the estimation of the underlying probability distribution and the Hellinger distance. Unlike some popular measures, it is focused not on the shape of the decision boundary in a classification task but on the amount of available data with respect to the attribute structure. Complexity is expressed as a graphical plot, which we call the complexity curve. We use it to propose a new variant of the learning curve plot called the generalisation curve. The generalisation curve is a standard learning curve with the x-axis rescaled according to the data set's complexity curve. It is a classifier performance measure which shows how well the information present in the data is utilised. We perform a theoretical and experimental examination of the properties of the introduced complexity measure and show its relation to the variance component of classification error. We compare it with popular data complexity measures on 81 diverse data sets and show that it can contribute to explaining the performance of specific classifiers on these sets. We then apply our methodology to a panel of benchmarks of standard machine learning algorithms on typical data sets, demonstrating how it can be used in practice to gain insights into data characteristics and classifier behaviour. Moreover, we show that the complexity curve is an effective tool for reducing the size of the training set (data pruning), allowing the learning process to be significantly sped up without reducing classification accuracy. Associated code is available to download at: https://github.com/zubekj/complexity_curve (open source Python implementation).
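The core idea can be sketched as follows: for each subsample size, measure the Hellinger distance between the attribute distribution estimated from the subsample and the one estimated from the full data set, and plot distance against sample fraction. This sketch uses per-attribute histograms and treats attributes independently, which are simplifying assumptions made here; the authors' estimator in the linked repository may differ in detail:

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete probability vectors."""
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

def complexity_curve(X, fractions=None, bins=10, reps=20, seed=0):
    """Average Hellinger distance between subsample and full-data
    per-attribute histograms, as a function of the subsample fraction."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if fractions is None:
        fractions = np.linspace(0.1, 1.0, 10)
    # fixed bin edges per attribute, estimated once from the full data
    edges = [np.histogram_bin_edges(X[:, j], bins=bins) for j in range(d)]
    full = [np.histogram(X[:, j], bins=edges[j])[0] / n for j in range(d)]
    curve = []
    for f in fractions:
        k = max(1, int(round(f * n)))
        dists = []
        for _ in range(reps):
            idx = rng.choice(n, size=k, replace=False)
            sub = [np.histogram(X[idx, j], bins=edges[j])[0] / k
                   for j in range(d)]
            dists.append(np.mean([hellinger(full[j], sub[j])
                                  for j in range(d)]))
        curve.append(np.mean(dists))
    return np.asarray(fractions), np.asarray(curve)
```

The curve falls toward zero as the fraction approaches 1 (the subsample's distribution converges to the full-data one); a curve that drops quickly suggests a small subsample already captures the attribute structure, which is the intuition behind using it for data pruning.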


2007 ◽  
Vol 04 (04) ◽  
pp. 339-346
Author(s):  
TSANG-LONG PAO ◽  
YUN-MAW CHENG ◽  
YU-TE CHEN ◽  
JUN-HENG YEH

Since emotion is important in influencing cognition and the perception of daily activities such as learning, communication, and even rational decision-making, it must be considered in human-computer interaction. In this paper, we compare four different weighting functions in weighted KNN-based classifiers for recognizing five emotions, namely anger, happiness, sadness, neutrality and boredom, from Mandarin emotional speech. The classifiers studied include weighted KNN, weighted CAP, and weighted D-KNN. We use the result of the traditional KNN classifier as the baseline performance measure. The experimental results show that the Fibonacci weighting function outperforms the others in all of the weighted classifiers. The highest accuracy, 81.4%, is achieved with the weighted D-KNN classifier.
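A Fibonacci-weighted KNN vote can be sketched as below. The convention assumed here (not stated in the abstract) is that the nearest of the k neighbours receives the largest Fibonacci number as its weight, so nearer neighbours dominate the vote; function names and the Euclidean metric are illustrative choices:

```python
import numpy as np

def fibonacci_weights(k):
    """Weights for the k ranked neighbours: the nearest neighbour gets
    the largest Fibonacci number (an assumed convention)."""
    fib = [1, 1]
    while len(fib) < k:
        fib.append(fib[-1] + fib[-2])
    return np.array(fib[:k][::-1], dtype=float)

def weighted_knn_predict(X_train, y_train, x, k=5, weights=None):
    """Predict a class label by a weighted vote over the k nearest
    training points under Euclidean distance."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train)
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    order = np.argsort(dists)[:k]          # indices of the k nearest points
    if weights is None:
        weights = fibonacci_weights(k)
    votes = {}
    for w, label in zip(weights, y_train[order]):
        votes[label] = votes.get(label, 0.0) + w
    return max(votes, key=votes.get)       # class with the largest weighted vote
```

With uniform weights this reduces to the traditional KNN baseline; swapping in a different `weights` vector reproduces the comparison of weighting functions described above.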




2011 ◽  
Author(s):  
Yih-teen Lee ◽  
Alfred Stettler ◽  
John Antonakis

2019 ◽  
Author(s):  
Erick Pusck Wilke ◽  
Benny Kramer Costa ◽  
Otávio Bandeira De Lamônica Freire ◽  
Manuel Portugal Ferreira

2004 ◽  
Vol 80 (3) ◽  
pp. 408
Author(s):  
Roberto Marangoni ◽  
Fabio Marroni ◽  
Domenico Gioffré ◽  
Francesco Ghetti ◽  
Giuliano Colombetti

CFA Digest ◽  
2003 ◽  
Vol 33 (1) ◽  
pp. 51-52
Author(s):  
Frank T. Magiera
