Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns

2020 ◽  
Vol 34 (06) ◽  
pp. 10077-10084
Author(s):  
YooJung Choi ◽  
Golnoosh Farnadi ◽  
Behrouz Babaki ◽  
Guy Van den Broeck

As machine learning is increasingly used to make real-world decisions, recent research efforts aim to define and ensure fairness in algorithmic decision making. Existing methods often assume a fixed set of observable features to define individuals, but lack a discussion of certain features not being observed at test time. In this paper, we study fairness of naive Bayes classifiers, which allow partial observations. In particular, we introduce the notion of a discrimination pattern, which refers to an individual receiving different classifications depending on whether some sensitive attributes were observed. Then a model is considered fair if it has no such pattern. We propose an algorithm to discover and mine for discrimination patterns in a naive Bayes classifier, and show how to learn maximum-likelihood parameters subject to these fairness constraints. Our approach iteratively discovers and eliminates discrimination patterns until a fair model is learned. An empirical evaluation on three real-world datasets demonstrates that we can remove exponentially many discrimination patterns by only adding a small fraction of them as constraints.
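To make the idea of a discrimination pattern concrete, here is a minimal sketch for a binary-class naive Bayes model with discrete features. It compares the predicted probability when a sensitive attribute is observed against the probability when it is left unobserved. The function names, the toy parameters, and the threshold `delta` are assumptions made for illustration; the abstract does not give the paper's exact definition or its search algorithm.

```python
from typing import Dict

# Hypothetical toy parameters: likelihoods[feature][class][value] = P(value | class).
PRIOR: Dict[int, float] = {0: 0.5, 1: 0.5}
LIKELIHOODS = {
    "s": {0: {0: 0.5, 1: 0.5}, 1: {0: 0.2, 1: 0.8}},  # sensitive attribute
    "y": {0: {0: 0.6, 1: 0.4}, 1: {0: 0.3, 1: 0.7}},  # non-sensitive attribute
}


def posterior(prior, likelihoods, observed):
    """Return P(class = 1 | observed) for a naive Bayes model.

    Features missing from `observed` are marginalized out, which in naive
    Bayes simply means leaving their likelihood factor out of the product.
    """
    joint = {}
    for c in (0, 1):
        p = prior[c]
        for feature, value in observed.items():
            p *= likelihoods[feature][c][value]
        joint[c] = p
    return joint[1] / (joint[0] + joint[1])


def discrimination_degree(prior, likelihoods, sensitive, other):
    """Change in P(class = 1) caused by additionally observing `sensitive`."""
    with_sensitive = posterior(prior, likelihoods, {**sensitive, **other})
    without_sensitive = posterior(prior, likelihoods, other)
    return with_sensitive - without_sensitive


def is_discrimination_pattern(prior, likelihoods, sensitive, other, delta):
    """Flag the pair (sensitive, other) when the change exceeds `delta`."""
    degree = discrimination_degree(prior, likelihoods, sensitive, other)
    return abs(degree) > delta


if __name__ == "__main__":
    degree = discrimination_degree(PRIOR, LIKELIHOODS, {"s": 1}, {"y": 1})
    print(f"degree of discrimination: {degree:.4f}")
```

Because the number of candidate (sensitive, other) assignments grows exponentially with the number of features, checking them one by one as above only works for very small models.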

2013 ◽  
Vol 23 (4) ◽  
pp. 787-795 ◽  
Author(s):  
Sona Taheri ◽  
Musa Mammadov

Abstract Naive Bayes is among the simplest probabilistic classifiers. It often performs surprisingly well in many real world applications, despite the strong assumption that all features are conditionally independent given the class. In the learning process of this classifier with the known structure, class probabilities and conditional probabilities are calculated using training data, and then values of these probabilities are used to classify new observations. In this paper, we introduce three novel optimization models for the naive Bayes classifier where both class probabilities and conditional probabilities are considered as variables. The values of these variables are found by solving the corresponding optimization problems. Numerical experiments are conducted on several real world binary classification data sets, where continuous features are discretized by applying three different methods. The performances of these models are compared with the naive Bayes classifier, tree augmented naive Bayes, the SVM, C4.5 and the nearest neighbor classifier. The obtained results demonstrate that the proposed models can significantly improve the performance of the naive Bayes classifier, yet at the same time maintain its simple structure.
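The conventional learning process described above, in which class probabilities and conditional probabilities are computed from training data and then used to classify new observations, can be sketched as follows. This shows only the standard counting baseline, not the three optimization models proposed in the paper. The smoothing parameter `alpha`, the function names, and the toy data are assumptions for illustration.

```python
from collections import Counter, defaultdict


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Estimate class and conditional probabilities from discrete data.

    `alpha` is an additive smoothing constant (an assumption of this sketch)
    that keeps unseen feature values from producing zero probabilities.
    """
    n = len(labels)
    class_counts = Counter(labels)
    class_probs = {c: k / n for c, k in class_counts.items()}

    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    feature_values = defaultdict(set)    # feature index -> values seen in training
    for row, c in zip(rows, labels):
        for j, v in enumerate(row):
            value_counts[(j, c)][v] += 1
            feature_values[j].add(v)

    def cond_prob(j, v, c):
        """Smoothed estimate of P(feature j = v | class c)."""
        numerator = value_counts[(j, c)][v] + alpha
        denominator = class_counts[c] + alpha * len(feature_values[j])
        return numerator / denominator

    return class_probs, cond_prob


def predict(class_probs, cond_prob, row):
    """Return the class with the largest unnormalized posterior score."""
    scores = {}
    for c, p in class_probs.items():
        for j, v in enumerate(row):
            p *= cond_prob(j, v, c)
        scores[c] = p
    return max(scores, key=scores.get)


if __name__ == "__main__":
    rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
    labels = ["no", "no", "yes", "yes"]
    class_probs, cond_prob = fit_naive_bayes(rows, labels)
    print(predict(class_probs, cond_prob, ("rainy", "cool")))
```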


2021 ◽  
Vol 2021 (4) ◽  
pp. 406-419
Author(s):  
Farzad Zafarani ◽  
Chris Clifton

Abstract There is increasing awareness of the need to protect individual privacy in the training data used to develop machine learning models. Differential Privacy is a strong concept of protecting individuals. Naïve Bayes is a popular machine learning algorithm, used as a baseline for many tasks. In this work, we have provided a differentially private Naïve Bayes classifier that adds noise proportional to the smooth sensitivity of its parameters. We compare our results to Vaidya, Shafiq, Basu, and Hong [1] which scales noise to the global sensitivity of the parameters. Our experimental results on real-world datasets show that smooth sensitivity significantly improves accuracy while still guaranteeing ɛ-differential privacy.
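As a rough illustration of what adding noise to a parameter looks like, the sketch below applies the generic Laplace mechanism to a single count, with noise scaled to a fixed sensitivity. It does not implement the paper's smooth-sensitivity method, and it does not show how a privacy budget would be split across all the counts of a classifier. The function names and the default sensitivity of 1 are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_count(count, epsilon, rng, sensitivity=1.0):
    """Return `count` perturbed by Laplace noise of scale sensitivity / epsilon.

    A sensitivity of 1 is assumed here: adding or removing one record
    changes a count by at most 1.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Smaller epsilon means stronger privacy and noisier counts.
    for epsilon in (0.1, 1.0, 10.0):
        print(epsilon, noisy_count(100, epsilon, rng))
```

In a naive Bayes setting, noisy counts like these would be clipped to be non-negative and renormalized before being used as probabilities.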


2013 ◽  
Vol 3 (2) ◽  
pp. 7-15 ◽  
Author(s):  
S. Praveena ◽  
S.P. Singh ◽  
I.V. Muralikrishna ◽  
...  

2021 ◽  
Author(s):  
Deniz Ertuncay ◽  
Giovanni Costa

Abstract Near-fault ground motions may contain impulsive behavior in velocity records. To calculate the probability of occurrence of impulsive signals, a large dataset is collected from various national data providers and strong motion databases. The dataset has a large number of parameters that carry information on the earthquake physics, ruptured faults, ground motion parameters, and the distance between the station and several parts of the ruptured fault. The relation between these parameters and impulsive signals is calculated. It is found that fault type, moment magnitude, and the distance and azimuth between a site of interest and the surface projection of the ruptured fault are correlated with the impulsiveness of the signals. Separate models are created for strike-slip faults and non-strike-slip faults using a multivariate naïve Bayes classifier. The naïve Bayes classifier provides the probability of observing impulsive signals. The models have comparable accuracy rates, and they are more consistent across different fault types compared with previous studies.

