A Unified Definition of Mutual Information with Applications in Machine Learning

2015 ◽  
Vol 2015 ◽  
pp. 1-12 ◽  
Author(s):  
Guoping Zeng

There are various definitions of mutual information. Essentially, these definitions can be divided into two classes: (1) definitions with random variables and (2) definitions with ensembles. However, there are some mathematical flaws in these definitions. For instance, Class 1 definitions either neglect the probability spaces or assume the two random variables have the same probability space. Class 2 definitions redefine marginal probabilities from the joint probabilities. In fact, the marginal probabilities are given from the ensembles and should not be redefined from the joint probabilities. Both Class 1 and Class 2 definitions assume a joint distribution exists. Yet, they all ignore the important fact that the joint distribution, or joint probability measure, is not unique. In this paper, we first present a new unified definition of mutual information to cover all the various definitions and to fix their mathematical flaws. Our idea is to define the joint distribution of two random variables by taking the marginal probabilities into consideration. Next, we establish some properties of the newly defined mutual information. We then propose a method to calculate mutual information in machine learning. Finally, we apply our newly defined mutual information to credit scoring.
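
As a rough illustration of why the choice of joint distribution matters, the sketch below computes the standard textbook mutual information from a discrete joint probability table. It is not the unified definition proposed in the paper, and it derives the marginals by summing the joint table, which is the very convention the abstract criticizes. The function name `mutual_information` and the two example tables are assumptions made for illustration only.

```python
import math


def mutual_information(joint):
    """Textbook mutual information I(X;Y), in nats, of a discrete joint table.

    Assumption: joint[i][j] = P(X = x_i, Y = y_j) and the entries sum to 1.
    The marginals are obtained here by summing rows and columns.
    """
    px = [sum(row) for row in joint]            # marginal of X (row sums)
    py = [sum(col) for col in zip(*joint)]      # marginal of Y (column sums)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0.0:                       # 0 * log(0) is taken as 0
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


# Two different joint tables sharing the same marginals (0.5, 0.5):
independent = [[0.25, 0.25], [0.25, 0.25]]      # X and Y independent
dependent = [[0.5, 0.0], [0.0, 0.5]]            # X determines Y

print(mutual_information(independent))
print(mutual_information(dependent))
```

Both tables have identical marginals, yet the first gives zero mutual information and the second gives log 2 nats, which shows concretely that the marginals alone do not determine the joint distribution.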

2021 ◽  
Vol 40 (5) ◽  
pp. 9471-9484
Author(s):  
Yilun Jin ◽  
Yanan Liu ◽  
Wenyu Zhang ◽  
Shuai Zhang ◽  
Yu Lou

With the advancement of machine learning, credit scoring can be performed better. As one of the widely recognized machine learning methods, ensemble learning has demonstrated significant improvements in the predictive accuracy over individual machine learning models for credit scoring. This study proposes a novel multi-stage ensemble model with multiple K-means-based selective undersampling for credit scoring. First, a new multiple K-means-based undersampling method is proposed to deal with the imbalanced data. Then, a new selective sampling mechanism is proposed to select the better-performing base classifiers adaptively. Finally, a new feature-enhanced stacking method is proposed to construct an effective ensemble model by composing the shortlisted base classifiers. In the experiments, four datasets with four evaluation indicators are used to evaluate the performance of the proposed model, and the experimental results prove the superiority of the proposed model over other benchmark models.


2021 ◽  
Author(s):  
Lun Ai ◽  
Stephen H. Muggleton ◽  
Céline Hocquette ◽  
Mark Gromowski ◽  
Ute Schmid

Abstract Given the recent successes of Deep Learning in AI there has been increased interest in the role and need for explanations in machine learned theories. A distinct notion in this context is that of Michie’s definition of ultra-strong machine learning (USML). USML is demonstrated by a measurable increase in human performance of a task following provision to the human of a symbolic machine learned theory for task performance. A recent paper demonstrates the beneficial effect of a machine learned logic theory for a classification task, yet no existing work to our knowledge has examined the potential harmfulness of machine’s involvement for human comprehension during learning. This paper investigates the explanatory effects of a machine learned theory in the context of simple two person games and proposes a framework for identifying the harmfulness of machine explanations based on the Cognitive Science literature. The approach involves a cognitive window consisting of two quantifiable bounds and it is supported by empirical evidence collected from human trials. Our quantitative and qualitative results indicate that human learning aided by a symbolic machine learned theory which satisfies a cognitive window has achieved significantly higher performance than human self learning. Results also demonstrate that human learning aided by a symbolic machine learned theory that fails to satisfy this window leads to significantly worse performance than unaided human learning.


2021 ◽  
Vol 1955 (1) ◽  
pp. 012039
Author(s):  
Ji Qi ◽  
Ruicheng Yang ◽  
Pucong Wang

Entropy ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. 526
Author(s):  
Gautam Aishwarya ◽  
Mokshay Madiman

The analogues of Arimoto’s definition of conditional Rényi entropy and Rényi mutual information are explored for abstract alphabets. These quantities, although dependent on the reference measure, have some useful properties similar to those known in the discrete setting. In addition to laying out some such basic properties and the relations to Rényi divergences, the relationships between the families of mutual informations defined by Sibson, Augustin-Csiszár, and Lapidoth-Pfister, as well as the corresponding capacities, are explored.


2018 ◽  
Vol 29 (08) ◽  
pp. 1850075
Author(s):  
Tingyuan Nie ◽  
Xinling Guo ◽  
Mengda Lin ◽  
Kun Zhao

Quantifying the invulnerability of complex networks is a fundamental problem, in which identifying influential nodes is of theoretical and practical significance. In this paper, we propose a novel definition of centrality named total information (TC), which derives from a local sub-graph constructed from a node and its neighbors. The centrality is then defined as the sum of the self-information of the node and the mutual information of its neighbor nodes. We use the proposed centrality to identify the importance of nodes by evaluating the invulnerability of scale-free networks. The results show that the proposed centrality improves on traditional centralities in both efficiency and effectiveness.


2021 ◽  
Vol 37 (3) ◽  
pp. 585-617
Author(s):  
Teresa Bono ◽  
Karen Croxson ◽  
Adam Giles

Abstract The use of machine learning as an input into decision-making is on the rise, owing to its ability to uncover hidden patterns in large data and improve prediction accuracy. Questions have been raised, however, about the potential distributional impacts of these technologies, with one concern being that they may perpetuate or even amplify human biases from the past. Exploiting detailed credit file data for 800,000 UK borrowers, we simulate a switch from a traditional (logit) credit scoring model to ensemble machine-learning methods. We confirm that machine-learning models are more accurate overall. We also find that they do as well as the simpler traditional model on relevant fairness criteria, where these criteria pertain to overall accuracy and error rates for population subgroups defined along protected or sensitive lines (gender, race, health status, and deprivation). We do observe some differences in the way credit-scoring models perform for different subgroups, but these manifest under a traditional modelling approach and switching to machine learning neither exacerbates nor eliminates these issues. The paper discusses some of the mechanical and data factors that may contribute to statistical fairness issues in the context of credit scoring.


1958 ◽  
Vol 10 ◽  
pp. 222-229 ◽  
Author(s):  
J. R. Blum ◽  
H. Chernoff ◽  
M. Rosenblatt ◽  
H. Teicher

Let $\{X_n\}$ ($n = 1, 2, \dots$) be a stochastic process. The random variables comprising it or the process itself will be said to be interchangeable if, for any choice of distinct positive integers $i_1, i_2, i_3, \dots, i_k$, the joint distribution of $X_{i_1}, X_{i_2}, \dots, X_{i_k}$ depends merely on $k$ and is independent of the integers $i_1, i_2, \dots, i_k$. It was shown by De Finetti (3) that the probability measure for any interchangeable process is a mixture of probability measures of processes each consisting of independent and identically distributed random variables.
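
For readers who prefer symbols, the two statements above can be written roughly as follows. The notation is an assumption introduced here and does not come from the abstract: $F$ ranges over probability distributions on the state space, $\mu$ is a mixing probability measure over such distributions, and $A_1, \dots, A_k$ are measurable sets.

```latex
% Interchangeability: the joint law depends only on k, not on the chosen indices
(X_{i_1}, X_{i_2}, \dots, X_{i_k}) \overset{d}{=} (X_1, X_2, \dots, X_k)
  \quad \text{for all distinct } i_1, \dots, i_k .

% De Finetti's representation: a mixture of i.i.d. product measures
P(X_1 \in A_1, \dots, X_k \in A_k)
  = \int \prod_{j=1}^{k} F(A_j) \, \mu(dF) .
```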

