Bagging k-dependence Bayesian network classifiers

2021 ◽  
Vol 25 (3) ◽  
pp. 641-667
Author(s):  
Limin Wang ◽  
Sikai Qi ◽  
Yang Liu ◽  
Hua Lou ◽  
Xin Zuo

Bagging has attracted much attention due to its simple implementation and the popularity of bootstrapping. By learning diverse classifiers from resampled datasets and averaging their outcomes, bagging seeks to improve substantially on the classification performance of the base classifier. Diversity is recognized as a key characteristic of bagging. This paper presents an efficient and effective bagging approach that learns a set of independent Bayesian network classifiers (BNCs) from disjoint data subspaces. The number of bits needed to describe the data is measured in terms of log likelihood, and redundant edges are identified to optimize the topologies of the learned BNCs. Our extensive experimental evaluation on 54 publicly available datasets from the UCI machine learning repository reveals that the proposed algorithm achieves classification performance competitive with state-of-the-art BNCs, with or without bagging, such as tree-augmented naive Bayes (TAN), the k-dependence Bayesian classifier (KDB), bagging NB, and bagging TAN.
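
A minimal sketch of the core resampling idea, assuming scikit-learn's GaussianNB as a stand-in for a KDB base learner (the paper's log-likelihood-based edge pruning is not reproduced); the class name DisjointBaggingBNC is illustrative:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

class DisjointBaggingBNC:
    """Bagging-style ensemble trained on disjoint data subspaces."""

    def __init__(self, n_estimators=5, random_state=0):
        self.n_estimators = n_estimators
        self.random_state = random_state

    def fit(self, X, y):
        rng = np.random.default_rng(self.random_state)
        idx = rng.permutation(len(X))
        # Disjoint subspaces: each base learner sees a distinct slice of the
        # data (assumes every slice still contains all classes).
        self.models_ = [GaussianNB().fit(X[part], y[part])
                        for part in np.array_split(idx, self.n_estimators)]
        return self

    def predict(self, X):
        # Average the class-posterior estimates of the base classifiers.
        probs = np.mean([m.predict_proba(X) for m in self.models_], axis=0)
        return self.models_[0].classes_[np.argmax(probs, axis=1)]
```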

Entropy ◽  
2018 ◽  
Vol 20 (12) ◽  
pp. 897 ◽  
Author(s):  
Yang Liu ◽  
Limin Wang ◽  
Minghui Sun

The rapid growth in data makes the quest for highly scalable learners a popular one. To achieve a trade-off between structure complexity and classification accuracy, the k-dependence Bayesian classifier (KDB) can represent different numbers of interdependencies for different data sizes. In this paper, we propose two methods to improve the classification performance of KDB. First, we use minimal-redundancy-maximal-relevance (mRMR) analysis, which sorts the predictive features to identify redundant ones. Then, we propose an improved discriminative model selection that selects an optimal sub-model by removing redundant features and arcs from the Bayesian network. Experimental results on 40 UCI datasets demonstrate that these two techniques are complementary and that the proposed algorithm achieves competitive classification performance and lower classification time than other state-of-the-art Bayesian network classifiers, such as tree-augmented naive Bayes and averaged one-dependence estimators.
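
The mRMR sorting step can be sketched as a greedy ranking that trades relevance to the class against redundancy with already-chosen features; this uses scikit-learn's mutual-information estimators on discrete data and is not the authors' exact implementation:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.metrics import mutual_info_score

def mrmr_order(X, y):
    """Greedy mRMR ordering: maximize relevance minus mean redundancy."""
    relevance = mutual_info_classif(X, y, discrete_features=True)
    order = [int(np.argmax(relevance))]
    remaining = [j for j in range(X.shape[1]) if j != order[0]]
    while remaining:
        def score(j):
            redundancy = np.mean([mutual_info_score(X[:, j], X[:, s])
                                  for s in order])
            return relevance[j] - redundancy
        best = max(remaining, key=score)
        order.append(best)
        remaining.remove(best)
    return order   # most relevant, least redundant attributes first
```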


Author(s):  
NACER FARAJZADEH ◽  
GANG PAN ◽  
ZHAOHUI WU ◽  
MIN YAO

This paper proposes a new approach to improving multiclass classification performance by employing a Stacked Generalization structure and a One-Against-One decomposition strategy. The proposed approach encodes the outputs of all pairwise classifiers, implicitly embedding two-class discriminative information in a probabilistic manner. The encoded outputs, called Meta Probability Codes (MPCs), are interpreted as projections of the original features. We observe that MPCs are better suited to clustering than the original features. Based on MPCs, we introduce a cluster-based multiclass classification algorithm called MPC-Clustering. The MPC-Clustering algorithm projects the original feature space into MPCs and then clusters them; subsequently, it trains an individual multiclass classifier on each resulting cluster to complete the induction procedure. The performance of the proposed algorithm is extensively evaluated on 20 datasets from the UCI machine learning repository. The results show that MPC-Clustering is efficient, improving the overall classification rate by 2.4% compared with state-of-the-art multiclass classifiers.
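
A rough sketch of the MPC encoding, assuming logistic regression as the pairwise learner (the paper does not prescribe one here): each instance is re-represented by the probabilistic outputs of all one-against-one classifiers, and the codes are then clustered:

```python
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

def meta_probability_codes(X_train, y_train, X_other):
    """Encode instances by the outputs of all one-against-one classifiers."""
    codes_train, codes_other = [], []
    for a, b in combinations(np.unique(y_train), 2):
        mask = np.isin(y_train, [a, b])
        clf = LogisticRegression(max_iter=1000).fit(X_train[mask],
                                                    y_train[mask])
        codes_train.append(clf.predict_proba(X_train)[:, 1])
        codes_other.append(clf.predict_proba(X_other)[:, 1])
    return np.column_stack(codes_train), np.column_stack(codes_other)

# Cluster the codes; a multiclass classifier is then trained per cluster.
# mpc_train, mpc_test = meta_probability_codes(X_train, y_train, X_test)
# cluster_ids = KMeans(n_clusters=3, n_init=10).fit_predict(mpc_train)
```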


2021 ◽  
Vol 25 (1) ◽  
pp. 35-55
Author(s):  
Limin Wang ◽  
Peng Chen ◽  
Shenglei Chen ◽  
Minghui Sun

Bayesian network classifiers (BNCs) have proved their effectiveness and efficiency in the supervised learning framework. Numerous variations of the conditional independence assumption have been proposed to address the NP-hard structure learning of BNCs. However, researchers have focused on identifying conditional dependence rather than conditional independence, and information-theoretic criteria cannot capture how conditional (in)dependencies vary across instances. In this paper, a maximum correlation criterion and a minimum dependence criterion are introduced to sort attributes and identify conditional independencies, respectively. A heuristic search strategy is applied to find a possible global solution that balances significant dependency relationships against the independence assumption. Our extensive experimental evaluation on widely used benchmark datasets reveals that the proposed algorithm achieves classification performance competitive with state-of-the-art single-model learners (e.g., TAN, KDB, KNN, and SVM) and ensemble learners (e.g., ATAN and AODE).
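
For discrete data, the two criteria reduce to standard information-theoretic quantities; a minimal sketch of computing I(X;C) for attribute sorting and I(Xi;Xj|C) for flagging conditional independencies (the paper's heuristic search itself is not shown):

```python
import numpy as np
from collections import Counter

def mutual_information(x, c):
    """I(X; C) for discrete numpy arrays x and c."""
    n = len(x)
    pxc, px, pc = Counter(zip(x, c)), Counter(x), Counter(c)
    return sum((v / n) * np.log(v * n / (px[a] * pc[b]))
               for (a, b), v in pxc.items())

def conditional_mutual_information(x1, x2, c):
    """I(X1; X2 | C) as a class-weighted average of per-class MI."""
    n = len(c)
    return sum((np.sum(c == v) / n) *
               mutual_information(x1[c == v], x2[c == v])
               for v in np.unique(c))

# Sorting: rank attributes by mutual_information(X[:, i], y), descending.
# A near-zero conditional_mutual_information flags a conditional
# independency (the paper's actual decision rule is not reproduced here).
```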


2019 ◽  
Vol 2019 ◽  
pp. 1-12 ◽  
Author(s):  
Ruihan Hu ◽  
Songbin Zhou ◽  
Yisen Liu ◽  
Zhiri Tang

An ensemble pruning system is a machine learning framework that combines several learners, as experts, to classify a test set. Generally, ensemble pruning systems define a region of competence based on a validation set and use it to select the most competent ensembles from the ensemble pool with respect to the test set. However, the size of the ensemble pool is usually fixed, and the performance of the pool depends heavily on how the region of competence is defined. In this paper, a dynamic pruning framework called margin-based Pareto ensemble pruning is proposed. The framework explores the optimal ensemble pool size during the overproduction stage and fine-tunes the experts during the pruning stage. A Pareto optimization algorithm is used to search for the overproduction pool size that yields better performance. Considering the information entropy of the learners in the indecision region, a margin criterion is computed for each learner in the pool and used to prune the experts with respect to the test set. The effectiveness of the proposed method is assessed on classification benchmark datasets. The results show that margin-based Pareto ensemble pruning achieves smaller ensemble sizes and better classification performance on most datasets than state-of-the-art models.
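
A simplified sketch of the margin-criterion idea: score each expert by its mean validation margin (probability of the true class minus the strongest rival) and keep the top scorers; the Pareto search over pool size is omitted, and the keep parameter is illustrative:

```python
import numpy as np

def margin_prune(pool, X_val, y_val, keep=5):
    """Keep the `keep` experts with the highest mean validation margin."""
    scores = []
    for clf in pool:
        proba = clf.predict_proba(X_val)
        true_col = np.searchsorted(clf.classes_, y_val)
        p_true = proba[np.arange(len(y_val)), true_col]
        # Strongest rival: best probability among the non-true classes.
        masked = proba.copy()
        masked[np.arange(len(y_val)), true_col] = -np.inf
        p_rival = masked.max(axis=1)
        scores.append(np.mean(p_true - p_rival))
    best = np.argsort(scores)[::-1][:keep]
    return [pool[i] for i in best]
```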


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-14 ◽  
Author(s):  
Gonzalo A. Ruz ◽  
Pamela Araya-Díaz

Bayesian networks are useful machine learning techniques that combine quantitative modeling, through probability theory, with qualitative modeling, through graph theory, for visualization. We apply Bayesian network classifiers to the facial biotype classification problem, an important stage in orthodontic treatment planning. To this end, we present adaptations of classical Bayesian network classifiers to handle continuous attributes, and we propose an incremental tree-construction procedure for tree-like Bayesian network classifiers. We evaluate the performance of the proposed adaptations and compare them with other continuous Bayesian network classifier approaches as well as support vector machines. The results, measured by accuracy and kappa, show the effectiveness of the continuous Bayesian network classifiers, especially when a reduced number of attributes is used. Additionally, the resulting networks allow visualization of the probabilistic relations among the attributes, a useful decision-making tool for orthodontists.
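
One common way to adapt a discrete BNC to continuous attributes, which the paper's adaptations relate to, is to replace conditional probability tables with class-conditional Gaussians; a minimal naive-Bayes-style sketch (the tree-construction step is omitted):

```python
import numpy as np
from scipy.stats import norm

def fit_gaussian_nb(X, y):
    """Per class: prior, attribute means, attribute standard deviations."""
    return {c: (np.mean(y == c),
                X[y == c].mean(axis=0),
                X[y == c].std(axis=0) + 1e-9)   # avoid zero variance
            for c in np.unique(y)}

def predict_gaussian_nb(params, X):
    classes = list(params)
    # Log-posterior up to a constant: log prior + sum of Gaussian log-densities.
    log_post = np.column_stack(
        [np.log(prior) + norm.logpdf(X, mu, sd).sum(axis=1)
         for prior, mu, sd in (params[c] for c in classes)])
    return np.array(classes)[np.argmax(log_post, axis=1)]
```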


Entropy ◽  
2019 ◽  
Vol 21 (7) ◽  
pp. 665
Author(s):  
Yang Zhang ◽  
Limin Wang ◽  
Zhiyi Duan ◽  
Minghui Sun

Direct dependencies and conditional dependencies are the two basic kinds of dependencies in restricted Bayesian network classifiers (BNCs). Traditional approaches, such as filters and wrappers, have proved beneficial for identifying non-significant dependencies one by one, but their high computational overhead makes them inefficient, especially for BNCs with high structural complexity. Studying the distributions of information-theoretic measures offers a feasible way to identify non-significant dependencies in batches, which may increase structure reliability and avoid overfitting. In this paper, we investigate two extensions to the k-dependence Bayesian classifier: MI-based feature selection and CMI-based dependence selection. These two techniques apply a novel adaptive thresholding method to filter out redundancy and can work jointly. Experimental results on 30 datasets from the UCI machine learning repository demonstrate that adaptive thresholds help distinguish between dependencies and independencies, and that the proposed algorithm achieves classification performance competitive with several state-of-the-art BNCs in terms of 0-1 loss, root mean squared error, bias, and variance.
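
A minimal sketch of batch filtering with an adaptive threshold: compute MI scores for all attributes at once and cut below a threshold derived from their distribution; using the mean as the threshold is an illustrative choice, not necessarily the paper's rule:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def adaptive_mi_filter(X, y):
    """Keep attributes whose MI with the class clears an adaptive threshold."""
    mi = mutual_info_classif(X, y, discrete_features=True)
    threshold = mi.mean()   # derived from the MI distribution (illustrative)
    return np.flatnonzero(mi >= threshold), threshold
```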


2020 ◽  
Author(s):  
Vinícius Araújo ◽  
Leandro Marinho

It has been advocated that post-hoc explanation techniques are crucial for increasing trust in complex Machine Learning (ML) models. However, it is not yet well understood whether such explanations are useful or easy for users to understand. In this work, we explore the extent to which explanations from SHAP, a state-of-the-art post-hoc explainer, help humans make better decisions. In a malaria classification scenario, we designed an experiment with 120 volunteers to determine whether humans, starting with zero knowledge of the classification mechanism, could replicate a complex ML classifier's performance after being given access to the model's explanations. Our results show that this is indeed the case: when presented with the ML model's outcomes and explanations, humans improve their classification performance, indicating that they understood how the ML model makes its decisions.
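
For reference, per-prediction SHAP explanations of the kind shown to participants can be generated with the shap library; the model and data below are placeholders, not the study's classifier:

```python
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Placeholder data and model; the study's classifier and features differ.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)   # per-instance, per-feature attributions
```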


Entropy ◽  
2019 ◽  
Vol 21 (5) ◽  
pp. 489 ◽  
Author(s):  
Limin Wang ◽  
Yang Liu ◽  
Musa Mammadov ◽  
Minghui Sun ◽  
Sikai Qi

Over recent decades, the rapid growth in data has made ever more urgent the quest for highly scalable Bayesian networks with better classification performance and expressivity (that is, the capacity to describe dependence relationships between attributes in different situations). To reduce the search space of possible attribute orders, the k-dependence Bayesian classifier (KDB) simply applies mutual information to sort attributes. This sorting strategy is very efficient, but it neglects the conditional dependencies between attributes and is sub-optimal. In this paper, we propose a novel sorting strategy and extend KDB from a single restricted network to unrestricted ensemble networks, i.e., the unrestricted Bayesian classifier (UKDB), in terms of Markov blanket analysis and target learning. Target learning is a framework that takes each unlabeled testing instance P as a target and builds a specific Bayesian network classifier BNC_P to complement the classifier BNC_T learned from the training data T. UKDB introduces UKDB_P and UKDB_T, respectively, to flexibly describe the change in dependence relationships across different testing instances and the robust dependence relationships implicit in the training data. Both use UKDB as the base classifier, applying the same learning strategy while modeling different parts of the data space, so they are complementary in nature. Extensive experimental results on the Wisconsin breast cancer database, as a case study, and 10 other datasets, involving classifiers of different structural complexity such as naive Bayes (0-dependence), tree-augmented naive Bayes (1-dependence), and KDB (arbitrary k-dependence), demonstrate the effectiveness and robustness of the proposed approach.
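
The baseline KDB structure learning that the paper's sorting strategy revises can be sketched as follows: attributes are ordered by I(X;C), and each attribute receives the class plus its k highest-CMI predecessors as parents (discrete data assumed; target learning's per-instance model is omitted):

```python
import numpy as np
from sklearn.metrics import mutual_info_score

def cmi(x1, x2, c):
    """I(X1; X2 | C) on discrete data: class-weighted per-class MI."""
    return sum(np.mean(c == v) * mutual_info_score(x1[c == v], x2[c == v])
               for v in np.unique(c))

def kdb_structure(X, y, k=2):
    """Order attributes by I(X;C); parents = class + top-k CMI predecessors."""
    d = X.shape[1]
    order = np.argsort([mutual_info_score(X[:, i], y) for i in range(d)])[::-1]
    parents = {}
    for pos, i in enumerate(order):
        preds = order[:pos]
        cmis = [cmi(X[:, i], X[:, j], y) for j in preds]
        parents[int(i)] = [int(preds[t]) for t in np.argsort(cmis)[::-1][:k]]
    return order, parents   # every attribute also has the class as a parent
```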


Mathematics ◽  
2022 ◽  
Vol 10 (2) ◽  
pp. 166
Author(s):  
Gonzalo A. Ruz ◽  
Pablo A. Henríquez ◽  
Aldo Mascareño

Constitutional processes are a cornerstone of modern democracies. Whether revolutionary or institutionally organized, they establish the core values of social order and determine the institutional architecture that governs social life. Constitutional processes are themselves evolutionary practices of mutual learning in which actors, regardless of their initial political positions, continuously interact with each other, marking out differences and making alliances on different topics. In this article, we develop Tree Augmented Naive Bayes (TAN) classifiers to model the behavior of constituent agents. To reflect the nature of the constituent dynamics, the model learns its weights from the data using an evolution strategy so as to obtain good classification performance. For our analysis, we used the constituent agents’ communications on Twitter during the installation period of the Constitutional Convention (July–October 2021). To differentiate political positions (left, center, right), we applied the developed algorithm to obtain the scores of 882 ballots cast in the first stage of the convention (4 July to 29 September 2021). Then, we used k-means to identify three clusters containing right-wing, center, and left-wing positions. Experimental results on the three constructed datasets showed that using alternative weight values in the TAN construction procedure, inferred by an evolution strategy, improved the classification accuracy measured on the test sets compared with a TAN constructed with conditional mutual information, as well as with other Bayesian network classifier construction approaches. Beyond this, our results may help to better understand political behavior in constitutional processes and to improve the accuracy of TAN classifiers applied to social, real-world data.
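
The standard TAN construction that the evolution strategy modifies weights each attribute pair by conditional mutual information given the class and extracts a maximum spanning tree; a sketch on discrete data (the ES search itself is not reproduced):

```python
import numpy as np
from itertools import combinations
from sklearn.metrics import mutual_info_score

def cmi(x1, x2, c):
    """I(Xi; Xj | C) on discrete data: class-weighted per-class MI."""
    return sum(np.mean(c == v) * mutual_info_score(x1[c == v], x2[c == v])
               for v in np.unique(c))

def tan_tree(X, y):
    """Chow-Liu-style TAN tree: maximum spanning tree over CMI edge weights."""
    d = X.shape[1]
    W = np.zeros((d, d))
    for i, j in combinations(range(d), 2):
        W[i, j] = W[j, i] = cmi(X[:, i], X[:, j], y)
    # Prim's algorithm on the complete graph, maximizing total edge weight.
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        i, j = max(((a, b) for a in in_tree for b in range(d)
                    if b not in in_tree), key=lambda e: W[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges   # attribute j gets attribute i as parent, plus the class
```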

