Locally Consistent Bayesian Network Scores for Multi-Relational Data

Author(s):  
Oliver Schulte ◽  
Sajjad Gholami

An important task for relational learning is Bayesian network (BN) structure learning. A fundamental component of structure learning is a model selection score that measures how well a model fits a dataset. We describe a new method that upgrades a log-linear BN score designed for single-table i.i.d. data so that it applies to multi-relational databases. Chickering and Meek showed that for i.i.d. data, standard BN scores are locally consistent, meaning that their maxima converge to an optimal model that represents the data-generating distribution and contains no redundant edges. Our main theorem establishes that if a model selection score is locally consistent for i.i.d. data, then our upgraded gain function is locally consistent for relational data as well. To our knowledge, this is the first consistency result for relational structure learning. A novel aspect of our approach is employing a gain function that compares two models: a current vs. an alternative BN structure. In contrast, previous approaches employed a score that is a function of a single model only. Empirical evaluation on six benchmark relational databases shows that our gain function is also practically useful: on realistic-size data sets, it selects informative BN structures with a better data fit than those selected by baseline single-model scores.
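
The idea of a gain function that compares a current structure with an alternative can be illustrated in the single-table i.i.d. setting the abstract starts from. The sketch below is not the relational gain function of the paper. It assumes BIC as the i.i.d. score and defines the gain simply as a score difference; the names `bic_local`, `bic_score`, and `gain` and the data layout (a list of dicts) are hypothetical.

```python
import math
from collections import Counter

def bic_local(data, child, parents):
    """BIC term for one node: maximized log-likelihood minus a complexity penalty."""
    n = len(data)
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
    loglik = sum(c * math.log(c / parent_counts[pa]) for (pa, _), c in joint.items())
    # Number of free parameters, using the states observed in the data (assumption).
    child_states = len({row[child] for row in data})
    parent_configs = 1
    for p in parents:
        parent_configs *= len({row[p] for row in data})
    n_params = (child_states - 1) * parent_configs
    return loglik - 0.5 * math.log(n) * n_params

def bic_score(data, structure):
    """Score of a whole structure given as {child: tuple_of_parents}."""
    return sum(bic_local(data, child, parents) for child, parents in structure.items())

def gain(data, current, alternative):
    """Positive when the alternative structure scores better than the current one."""
    return bic_score(data, alternative) - bic_score(data, current)

# Two toy data sets: X and Y independent, and Y a copy of X.
independent = [{"X": x, "Y": y} for x in (0, 1) for y in (0, 1)] * 25
dependent = [{"X": x, "Y": x} for x in (0, 1)] * 50
empty = {"X": (), "Y": ()}
with_edge = {"X": (), "Y": ("X",)}
```

On the independent data the edge X -> Y is redundant and `gain(independent, empty, with_edge)` is negative; on the dependent data the same edge is needed and the gain is positive, which is the behaviour local consistency describes for i.i.d. scores.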

Author(s):  
Shahab Wahhab Kareem ◽  
Mehmet Cudi Okur

Bayesian networks are useful analytical models for structuring knowledge in machine learning; they represent probabilistic dependency relationships among variables. The authors present the Elephant Swarm Water Search Algorithm (ESWSA) for Bayesian network structure learning. The algorithm uses the Deleting, Reversing, Inserting, and Moving operations to drive the ESWSA toward the optimal structure. The ESWSA is based mainly on the water search strategy of elephants during drought periods. The proposed method is compared with Pigeon Inspired Optimization, Simulated Annealing, Greedy Search, Hybrid Bee with Simulated Annealing, and Hybrid Bee with Greedy Search, using the BDeu score function as the metric for all algorithms. The authors also investigate the confusion-matrix performance of these techniques on various benchmark data sets. According to the evaluation results, the proposed algorithm achieves better performance than the other algorithms, producing better scores as well as better confusion-matrix values.
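
The four structure-changing operations named in the abstract can be sketched as moves on a directed acyclic graph. This is a minimal illustration, not the authors' ESWSA: the swarm update rules are omitted, and the meaning given to "move" here (keep the edge's tail and redirect its head to another node) is an assumption. The names `is_acyclic` and `neighbours` are hypothetical.

```python
from itertools import permutations

def is_acyclic(nodes, edges):
    """Kahn-style check: repeatedly strip nodes that have no incoming edge."""
    remaining = set(nodes)
    edges = set(edges)
    while remaining:
        roots = {n for n in remaining if not any(v == n for (_, v) in edges)}
        if not roots:
            return False  # every remaining node has a parent, so a cycle exists
        remaining -= roots
        edges = {(u, v) for (u, v) in edges if u not in roots}
    return True

def neighbours(nodes, edges):
    """Yield (operation, new_edge_set) for every legal one-step change."""
    edges = frozenset(edges)
    for u, v in edges:
        # Deleting an edge can never create a cycle.
        yield "delete", edges - {(u, v)}
        reversed_edges = (edges - {(u, v)}) | {(v, u)}
        if is_acyclic(nodes, reversed_edges):
            yield "reverse", reversed_edges
        # "move" (assumed meaning): keep tail u, point the edge at another node w.
        for w in nodes:
            if w not in (u, v) and (u, w) not in edges:
                moved = (edges - {(u, v)}) | {(u, w)}
                if is_acyclic(nodes, moved):
                    yield "move", moved
    for u, v in permutations(nodes, 2):
        if (u, v) not in edges and (v, u) not in edges:
            inserted = edges | {(u, v)}
            if is_acyclic(nodes, inserted):
                yield "insert", inserted
```

A search procedure would score each candidate returned by `neighbours(nodes, edges)` (for example with BDeu) and decide which one to adopt; how that decision is made is where the compared algorithms differ.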


Information ◽  
2019 ◽  
Vol 10 (7) ◽  
pp. 241
Author(s):  
Geomar A. Schreiner ◽  
Denio Duarte ◽  
Ronaldo dos S. Melo

Several data-centric applications today produce and manipulate a large volume of data, the so-called Big Data. Traditional databases, relational databases in particular, are not suitable for Big Data management. As a consequence, some approaches have been proposed that allow the definition and manipulation of large relational data sets stored in NoSQL databases through an SQL interface, focusing on scalability and availability. This paper presents a comparative analysis of these approaches based on an architectural classification that organizes them according to their system architectures. Our motivation is that wrapping is a relevant strategy for relational-based applications that intend to move relational data to NoSQL databases (usually maintained in the cloud). We also claim that this research area has some open issues, given that most approaches deal with only a subset of SQL operations or support only specific target NoSQL databases. Our intention with this survey is, therefore, to contribute to the state of the art in this research area and also provide a basis for choosing or even designing a relational-to-NoSQL data wrapping solution.


2021 ◽  
Vol 25 (1) ◽  
pp. 35-55
Author(s):  
Limin Wang ◽  
Peng Chen ◽  
Shenglei Chen ◽  
Minghui Sun

Bayesian network classifiers (BNCs) have proved their effectiveness and efficiency in the supervised learning framework. Numerous variations of the conditional independence assumption have been proposed to address the NP-hard problem of learning the structure of BNCs. However, researchers focus on identifying conditional dependence rather than conditional independence, and information-theoretic criteria cannot identify the diversity in conditional (in)dependencies for different instances. In this paper, the maximum correlation criterion and the minimum dependence criterion are introduced to sort attributes and identify conditional independencies, respectively. A heuristic search strategy is applied to find a possible global solution that achieves the trade-off between significant dependency relationships and the independence assumption. Our extensive experimental evaluation on widely used benchmark data sets reveals that the proposed algorithm achieves competitive classification performance compared to state-of-the-art single-model learners (e.g., TAN, KDB, KNN and SVM) and ensemble learners (e.g., ATAN and AODE).


2017 ◽  
Vol 117 ◽  
pp. 46-55 ◽  
Author(s):  
Anders L. Madsen ◽  
Frank Jensen ◽  
Antonio Salmerón ◽  
Helge Langseth ◽  
Thomas D. Nielsen

2013 ◽  
Vol 479-480 ◽  
pp. 906-910
Author(s):  
Chong Chen ◽  
Hua Yu ◽  
Ju Yun Wang

In the context of learning Bayesian network structure, we propose a new method based on the KNN algorithm and dynamic Gibbs sampling to fill in missing data, aimed at the problem of learning the Bayesian network structure well from data sets with missing values. Experiments on the Asia network show that this method restores the original data very well, which makes it possible to apply Bayesian network structure learning algorithms that work only on complete data. The method thereby expands the scope and improves the effectiveness of Bayesian network applications.
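
The KNN part of such a fill-in step can be sketched as follows. This is only an illustration under stated assumptions: it covers discrete data, uses Hamming distance over the observed columns, fills each gap by majority vote among the nearest complete rows, and leaves out the dynamic Gibbs sampling stage entirely. The marker `MISSING` and the functions `hamming` and `knn_impute` are hypothetical names.

```python
from collections import Counter

MISSING = None  # assumed marker for a missing cell

def hamming(row, other, columns):
    """Number of the given columns on which two rows disagree."""
    return sum(row[c] != other[c] for c in columns)

def knn_impute(rows, k=3):
    """Fill missing cells by majority vote among the k nearest complete rows."""
    complete = [r for r in rows if MISSING not in r]
    if not complete:
        raise ValueError("at least one complete row is needed as a neighbour")
    filled = []
    for row in rows:
        if MISSING not in row:
            filled.append(list(row))
            continue
        # Distance is measured only on the columns this row actually observed.
        observed = [i for i, v in enumerate(row) if v is not MISSING]
        nearest = sorted(complete, key=lambda r: hamming(row, r, observed))[:k]
        new_row = list(row)
        for i, v in enumerate(row):
            if v is MISSING:
                new_row[i] = Counter(r[i] for r in nearest).most_common(1)[0][0]
        filled.append(new_row)
    return filled
```

The completed table returned by `knn_impute` can then be passed to any structure learning algorithm that expects complete data.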


2021 ◽  
Vol 22 (4) ◽  
Author(s):  
Shahab Wahhab Kareem ◽  
Mehmet Cudi Okur

In machine learning, the Bayesian network is one of the useful scientific models for producing the structure of knowledge; it can represent probabilistic dependency relationships between variables. Score-and-search is a method used for learning the structure of a Bayesian network. The authors apply the Falcon Optimization Algorithm (FOA) as a new approach to learning the structure of Bayesian networks. The paper uses the Reversing, Deleting, Moving, and Inserting operations to adapt the FOA for approaching the optimal Bayesian network structure. Essentially, the FOA is based on the prey search strategy of falcons. The results of the proposed technique are compared with Pigeon Inspired Optimization, Greedy Search, and Simulated Annealing using the BDeu score function. The authors also examine the confusion-matrix performance of these techniques on several benchmark data sets. As the evaluations show, the proposed method performs more reliably than the other algorithms, producing better scores and accuracy values.
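
The BDeu score used as the comparison metric decomposes into one term per node, which is what makes score-and-search practical. The sketch below computes the standard log BDeu term from counts; the function names, the data layout (a list of dicts), the `arity` mapping, and the default equivalent sample size `ess=1.0` are assumptions for illustration and are not taken from the paper.

```python
import math
from collections import Counter

def bdeu_local(data, child, parents, arity, ess=1.0):
    """Log BDeu term for one node given its parent set."""
    r = arity[child]          # number of states of the child
    q = 1                     # number of parent configurations
    for p in parents:
        q *= arity[p]
    n_ijk = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    n_ij = Counter(tuple(row[p] for p in parents) for row in data)
    a_ij = ess / q
    a_ijk = ess / (q * r)
    score = 0.0
    # Configurations and states with zero count contribute exactly zero,
    # so iterating over the observed ones is sufficient.
    for n in n_ij.values():
        score += math.lgamma(a_ij) - math.lgamma(a_ij + n)
    for n in n_ijk.values():
        score += math.lgamma(a_ijk + n) - math.lgamma(a_ijk)
    return score

def bdeu_score(data, structure, arity, ess=1.0):
    """Log BDeu score of a structure given as {child: tuple_of_parents}."""
    return sum(bdeu_local(data, c, ps, arity, ess) for c, ps in structure.items())
```

Because the score is a sum of local terms, a search step that changes one node's parent set only needs to recompute that node's term.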


IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 116661-116675 ◽  
Author(s):  
Yuguang Long ◽  
Limin Wang ◽  
Zhiyi Duan ◽  
Minghui Sun
