Locally Consistent Bayesian Network Scores for Multi-Relational Data

An important task for relational learning is Bayesian network (BN) structure learning. A fundamental component of structure learning is a model selection score that measures how well a model fits a dataset. We describe a new method that upgrades a log-linear BN score designed for single-table i.i.d. data so that it applies to multi-relational databases. Chickering and Meek showed that for i.i.d. data, standard BN scores are locally consistent, meaning that their maxima converge to an optimal model that represents the data-generating distribution and contains no redundant edges. Our main theorem establishes that if a model selection score is locally consistent for i.i.d. data, then our upgraded gain function is locally consistent for relational data as well. To our knowledge, this is the first consistency result for relational structure learning. A novel aspect of our approach is the use of a gain function that compares two models: a current BN structure and an alternative one. In contrast, previous approaches employed a score that is a function of a single model only. Empirical evaluation on six benchmark relational databases shows that our gain function is also practically useful: on realistically sized data sets, it selects informative BN structures with a better data fit than those selected by baseline single-model scores.


Evaluation of Bayesian Network Structure Learning Using Elephant Swarm Water Search Algorithm

Handbook of Research on Advancements of Swarm Intelligence Algorithms for Solving Real-World Problems - Advances in Computational Intelligence and Robotics ◽ 10.4018/978-1-7998-3222-5.ch008 ◽ 2020 ◽ pp. 139-159

Author(s): Shahab Wahhab Kareem ◽ Mehmet Cudi Okur

Keyword(s): Simulated Annealing ◽ Bayesian Network ◽ Network Structure ◽ Structure Learning ◽ Search Algorithm ◽ Confusion Matrix ◽ Data Sets ◽ Greedy Search ◽ Bayesian Network Structure ◽ Bayesian Network Structure Learning

Bayesian networks are useful analytical models for structuring knowledge in machine learning; they represent probabilistic dependency relationships among variables. The authors present the Elephant Swarm Water Search Algorithm (ESWSA) for Bayesian network structure learning. The algorithm uses Deleting, Reversing, Inserting, and Moving operations to steer the ESWSA toward the optimal structure. The ESWSA is based mainly on the water search strategy of elephants during drought periods. The proposed method is compared with Pigeon Inspired Optimization, Simulated Annealing, Greedy Search, Hybrid Bee with Simulated Annealing, and Hybrid Bee with Greedy Search, using the BDeu score function as the metric for all algorithms. The authors also investigate the confusion matrix performance of these techniques on various benchmark data sets. According to the evaluation results, the proposed algorithm performs better than the other algorithms and produces better scores as well as better values.
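
The four structure operations named in the abstract can be illustrated with a small sketch. This is not the authors' ESWSA implementation: the swarm search itself and the BDeu scoring are omitted, the function names `is_acyclic` and `neighbours` are hypothetical, and "Moving" is interpreted here as redirecting an existing edge to a different child, which is an assumption the abstract does not confirm.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Kahn's algorithm: return True if the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while frontier:
        node = frontier.pop()
        visited += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return visited == len(nodes)

def neighbours(nodes, edges):
    """Yield (operation, new_edge_set) for each single change that keeps a DAG."""
    edges = frozenset(edges)
    # Inserting: add an edge between two currently unconnected nodes.
    for u, v in permutations(nodes, 2):
        if (u, v) not in edges and (v, u) not in edges:
            candidate = edges | {(u, v)}
            if is_acyclic(nodes, candidate):
                yield "insert", candidate
    for u, v in edges:
        rest = edges - {(u, v)}
        # Deleting: removing an edge can never create a cycle.
        yield "delete", rest
        # Reversing: flip the edge direction if the result stays acyclic.
        flipped = rest | {(v, u)}
        if is_acyclic(nodes, flipped):
            yield "reverse", flipped
        # Moving (assumed meaning): redirect u -> v to u -> w.
        for w in nodes:
            if w not in (u, v) and (u, w) not in edges and (w, u) not in edges:
                moved = rest | {(u, w)}
                if is_acyclic(nodes, moved):
                    yield "move", moved
```

A score-and-search method would evaluate each candidate returned by `neighbours` with a score such as BDeu and keep the ones its search strategy prefers.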


When Relational-Based Applications Go to NoSQL Databases: A Survey

Information ◽ 10.3390/info10070241 ◽ 2019 ◽ Vol 10 (7) ◽ pp. 241

Author(s): Geomar A. Schreiner ◽ Denio Duarte ◽ Ronaldo dos S. Melo

Keyword(s): Big Data ◽ Comparative Analysis ◽ Relational Databases ◽ Research Area ◽ Relational Data ◽ Data Sets ◽ System Architectures ◽ Nosql Databases ◽ State Of Art ◽ Open Issues

Several data-centric applications today produce and manipulate a large volume of data, the so-called Big Data. Traditional databases, in particular relational databases, are not suitable for Big Data management. As a consequence, some approaches have been proposed that allow large relational data sets stored in NoSQL databases to be defined and manipulated through an SQL interface, focusing on scalability and availability. This paper presents a comparative analysis of these approaches based on an architectural classification that organizes them according to their system architectures. Our motivation is that wrapping is a relevant strategy for relational-based applications that intend to move relational data to NoSQL databases (usually maintained in the cloud). We also claim that this research area has some open issues, given that most approaches deal with only a subset of SQL operations or support only specific target NoSQL databases. Our intention with this survey is, therefore, to contribute to the state of the art in this research area and to provide a basis for choosing or even designing a relational-to-NoSQL data wrapping solution.


A novel approach to fully representing the diversity in conditional dependencies for learning Bayesian network classifier

Intelligent Data Analysis ◽ 10.3233/ida-194959 ◽ 2021 ◽ Vol 25 (1) ◽ pp. 35-55

Author(s): Limin Wang ◽ Peng Chen ◽ Shenglei Chen ◽ Minghui Sun

Keyword(s): Bayesian Network ◽ Conditional Independence ◽ Structure Learning ◽ Classification Performance ◽ Data Sets ◽ Independence Assumption ◽ Bayesian Network Classifiers ◽ Novel Approach ◽ Conditional Independence Assumption ◽ Dependence Criterion

Bayesian network classifiers (BNCs) have proved their effectiveness and efficiency in the supervised learning framework. Numerous variations of the conditional independence assumption have been proposed to address the NP-hard problem of learning BNC structure. However, researchers focus on identifying conditional dependence rather than conditional independence, and information-theoretic criteria cannot identify the diversity in conditional (in)dependencies for different instances. In this paper, the maximum correlation criterion and the minimum dependence criterion are introduced to sort attributes and to identify conditional independencies, respectively. A heuristic search strategy is applied to find a possible global solution that achieves a trade-off between significant dependency relationships and the independence assumption. Our extensive experimental evaluation on widely used benchmark data sets reveals that the proposed algorithm achieves competitive classification performance compared to state-of-the-art single-model learners (e.g., TAN, KDB, KNN and SVM) and ensemble learners (e.g., ATAN and AODE).


A parallel algorithm for Bayesian network structure learning from large data sets

Knowledge-Based Systems ◽ 10.1016/j.knosys.2016.07.031 ◽ 2017 ◽ Vol 117 ◽ pp. 46-55 ◽ Cited By ~ 31

Author(s): Anders L. Madsen ◽ Frank Jensen ◽ Antonio Salmerón ◽ Helge Langseth ◽ Thomas D. Nielsen

Keyword(s): Bayesian Network ◽ Parallel Algorithm ◽ Network Structure ◽ Structure Learning ◽ Large Data ◽ Large Data Sets ◽ Data Sets ◽ Bayesian Network Structure ◽ Bayesian Network Structure Learning


A Method of Handling Missing Data in the Context of Learning Bayesian Network Structure

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.479-480.906 ◽ 2013 ◽ Vol 479-480 ◽ pp. 906-910

Author(s): Chong Chen ◽ Hua Yu ◽ Ju Yun Wang

Keyword(s): Missing Data ◽ Bayesian Network ◽ Network Structure ◽ Structure Learning ◽ Learning Algorithm ◽ Original Data ◽ Data Sets ◽ Bayesian Network Structure ◽ Bayesian Network Structure Learning ◽ Context Of Learning

In the context of learning Bayesian network structure, we propose a new method, based on the KNN algorithm and dynamic Gibbs sampling, for filling in missing data. It mainly addresses the problem of how to better learn a Bayesian network structure from data sets with missing values. Experiments based on the Asia network show that this method restores the original data very well, which makes it possible to apply Bayesian network structure learning algorithms that require complete data. The method expands the scope and improves the effectiveness of Bayesian network applications.
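
As a rough illustration of the KNN part of such an approach, the sketch below fills missing discrete values by majority vote among the nearest complete rows. It is not the authors' method: the dynamic Gibbs sampling step is left out entirely, and the function names, the Hamming distance, and the default `k=3` are all assumptions made for this example.

```python
from collections import Counter

def hamming(row_a, row_b):
    """Count mismatches over attributes that are observed (not None) in both rows."""
    return sum(1 for a, b in zip(row_a, row_b)
               if a is not None and b is not None and a != b)

def knn_impute(rows, k=3):
    """Fill None entries using the k nearest complete rows (majority vote)."""
    complete = [r for r in rows if None not in r]
    if not complete:
        raise ValueError("at least one complete row is required")
    filled = []
    for row in rows:
        if None not in row:
            filled.append(list(row))
            continue
        # Rank complete rows by distance on the attributes this row has observed.
        nearest = sorted(complete, key=lambda c: hamming(row, c))[:k]
        new_row = list(row)
        for i, value in enumerate(row):
            if value is None:
                votes = Counter(n[i] for n in nearest)
                new_row[i] = votes.most_common(1)[0][0]
        filled.append(new_row)
    return filled
```

The completed table could then be passed to any structure learning algorithm that assumes complete data, which is the use case the abstract describes.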


Falcon Optimization Algorithm for Bayesian Networks Structure Learning

Computer Science ◽ 10.7494/csci.2021.22.4.3773 ◽ 2021 ◽ Vol 22 (4)

Author(s): Shahab Wahhab Kareem ◽ Mehmet Cudi Okur

Keyword(s): Bayesian Networks ◽ Bayesian Network ◽ Optimization Algorithm ◽ Structure Learning ◽ Confusion Matrix ◽ Optimal Solution ◽ Score Function ◽ Scientific Models ◽ Data Sets ◽ Networks Structure

In machine learning, one of the useful scientific models for structuring knowledge is the Bayesian network, which can represent probabilistic dependency relationships between variables. Score-and-search is a method used for learning the structure of a Bayesian network. The authors apply the Falcon Optimization Algorithm (FOA) as a new approach to learning the structure of Bayesian networks. This paper uses the Reversing, Deleting, Moving, and Inserting operations to adapt the FOA for approaching the optimal Bayesian network structure. Essentially, the FOA is based on the prey search strategy of falcons. The results of the proposed technique are compared with Pigeon Inspired Optimization, Greedy Search, and Simulated Annealing using the BDeu score function. The authors also examine the confusion matrix performance of these techniques on several benchmark data sets. As shown by the evaluations, the proposed method performs more reliably than the other algorithms, producing better scores and accuracy values.
