On the Parameterized Complexity of Polytree Learning

Author(s):  
Niels Grüttemeier ◽  
Christian Komusiewicz ◽  
Nils Morawietz

A Bayesian network is a directed acyclic graph that represents statistical dependencies between the variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data. Polytree Learning is the problem of learning an optimal Bayesian network that fulfills the additional property that its underlying undirected graph is a forest. In this work, we revisit the complexity of Polytree Learning. We show that Polytree Learning can be solved in single-exponential FPT time when parameterized by the number of variables. Moreover, we consider the influence of d, the number of variables that may receive a nonempty parent set in the final DAG, on the complexity of Polytree Learning. We show that Polytree Learning is presumably not fixed-parameter tractable with respect to d, unlike Bayesian network learning, which is fixed-parameter tractable for d. Finally, we show that if d and the maximum parent set size are bounded, then efficient algorithms can be obtained.
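The polytree condition the abstract describes, that the network's underlying undirected graph is a forest, is straightforward to test. The sketch below is a hypothetical illustration (the function name and example graphs are not from the paper): a simple undirected graph is a forest exactly when no edge closes a cycle, which union-find detects.

```python
# Hypothetical helper (not from the paper): test whether the underlying
# undirected skeleton of a digraph is a forest, i.e. the polytree condition.
# A simple undirected graph is a forest iff no edge closes a cycle.

def is_polytree_skeleton(vertices, directed_edges):
    undirected = {frozenset(e) for e in directed_edges}  # drop edge orientations
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for edge in undirected:
        u, w = tuple(edge)
        ru, rw = find(u), find(w)
        if ru == rw:
            return False  # this edge closes an undirected cycle
        parent[ru] = rw  # union the two components
    return True

# A v-structure a -> c <- b is a polytree (its skeleton is a tree):
print(is_polytree_skeleton({"a", "b", "c"}, [("a", "c"), ("b", "c")]))  # True
# Adding a -> b creates an undirected triangle, so the skeleton is no forest:
print(is_polytree_skeleton({"a", "b", "c"}, [("a", "c"), ("b", "c"), ("a", "b")]))  # False
```

Note that the test deliberately ignores edge directions: a DAG can be acyclic as a digraph while its skeleton still contains an undirected cycle, and only the latter disqualifies a polytree.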

2011 ◽  
Vol 474-476 ◽  
pp. 924-927 ◽  
Author(s):  
Xiao Xin

Given an undirected graph G=(V, E) with nonnegative real edge weights and a "+" or "–" label on each edge, the correlation clustering problem asks for a partition of the vertices of G into clusters that minimizes the total weight of "+" edges that are cut and "–" edges that are not cut. This problem is APX-hard and has been studied intensively, mainly from the viewpoint of polynomial-time approximation algorithms. By way of contrast, a fixed-parameter tractable algorithm is presented that takes treewidth as the parameter, with a running time that is linear in the number of vertices of G.
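The objective is compact enough to state in code. The following sketch (hypothetical names and toy instance, not the paper's algorithm) scores a given partition by the correlation clustering objective:

```python
# Correlation clustering cost of a partition: pay w(e) for every "+" edge
# whose endpoints land in different clusters and every "-" edge whose
# endpoints land in the same cluster. (Illustrative sketch only.)

def cc_cost(edges, cluster_of):
    # edges: iterable of (u, v, weight, label), label is "+" or "-"
    cost = 0.0
    for u, v, weight, label in edges:
        same_cluster = cluster_of[u] == cluster_of[v]
        if (label == "+" and not same_cluster) or (label == "-" and same_cluster):
            cost += weight
    return cost

edges = [("a", "b", 1.0, "+"), ("b", "c", 2.0, "-"), ("a", "c", 1.0, "+")]
print(cc_cost(edges, {"a": 0, "b": 0, "c": 0}))  # 2.0  (the "-" edge stays uncut)
print(cc_cost(edges, {"a": 0, "b": 1, "c": 0}))  # 1.0  (only the "+" edge a-b is cut)
```

The toy instance shows the tension the problem captures: no partition satisfies all three labels here, so even the optimum pays some cost.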


Author(s):  
Marco F. Ramoni ◽  
Paola Sebastiani

Born at the intersection of artificial intelligence, statistics, and probability, Bayesian networks (Pearl, 1988) are a representation formalism at the cutting edge of knowledge discovery and data mining (Heckerman, 1997). Bayesian networks belong to a more general class of models called probabilistic graphical models (Whittaker, 1990; Lauritzen, 1996) that arise from the combination of graph theory and probability theory, and their success rests on their ability to handle complex probabilistic models by decomposing them into smaller, amenable components. A probabilistic graphical model is defined by a graph, where nodes represent stochastic variables and arcs represent dependencies among such variables. These arcs are annotated with probability distributions that shape the interaction between the linked variables. A probabilistic graphical model is called a Bayesian network when the graph connecting its variables is a directed acyclic graph (DAG). This graph represents conditional independence assumptions that are used to factorize the joint probability distribution of the network variables, thus making the process of learning from a large database amenable to computation. A Bayesian network induced from data can be used to investigate distant relationships between variables, as well as to make predictions and explanations, by computing the conditional probability distribution of one variable given the values of some others.
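The factorization described above can be made concrete on a toy two-variable network. The conditional probability tables below are made-up placeholders, purely to show how the DAG structure turns the joint distribution into a product of local conditionals and how conditional queries reduce to summing and normalizing:

```python
# Toy Bayesian network A -> B with hypothetical probability tables:
# the DAG factorizes the joint as P(A, B) = P(A) * P(B | A).

p_a = {True: 0.3, False: 0.7}                   # P(A)
p_b_given_a = {True: {True: 0.9, False: 0.1},   # P(B | A=True)
               False: {True: 0.2, False: 0.8}}  # P(B | A=False)

def joint(a, b):
    return p_a[a] * p_b_given_a[a][b]

# The factorized joint sums to 1 over all assignments:
total = sum(joint(a, b) for a in (True, False) for b in (True, False))
print(abs(total - 1.0) < 1e-9)  # True

# Conditional query by summing and normalizing: P(A=True | B=True)
p_b_true = joint(True, True) + joint(False, True)
print(round(joint(True, True) / p_b_true, 3))  # 0.659
```

With more variables the same idea applies chain-rule style, one factor per node given its parents, which is what keeps inference and learning tractable relative to storing the full joint table.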


2018 ◽  
Vol 2 (1) ◽  
pp. 62
Author(s):  
Hasniati Hasniati ◽  
Arianti Arianti ◽  
William Philip

A Bayesian network can be used to compute the probability of the presence of various disease symptoms. In this paper, the authors apply a Bayesian network model to compute the probability of respiratory distress in infants. The Bayesian network is built from data obtained through interviews with a pediatric specialist, namely the disease names, causes, and symptoms of infant respiratory distress. The structure of the Bayesian network for infant respiratory distress is constructed according to whether or not each symptom is related to the respiratory disease. Every symptom represented in the Bayesian network structure has a parameter estimate obtained from existing data or from the specialist's knowledge. These estimates are the prior probabilities, i.e., the degrees of belief in the symptoms of infant respiratory distress. Once the prior probabilities are known, the next step is to determine the conditional probabilities between each type of respiratory disease and its symptoms. In the final step, the posterior probability is computed from the joint probability distribution (JPD) obtained earlier, and this value is used to compute the probability that a given symptom appears. Taking one example case in which an infant shows shortness of breath, weakness, restlessness, and fever, it is concluded that the infant suffers from neonatal pneumonia with probability 0.1688812743.
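The prior → conditional → posterior pipeline the abstract walks through can be sketched with Bayes' rule. Every probability below is an invented placeholder, not one of the clinical estimates elicited from the specialist in the paper, and the independence assumption is the naive-Bayes simplification rather than the paper's full network:

```python
# Posterior over diseases given observed symptoms, naive-Bayes style:
# P(D | s1..sn) is proportional to P(D) * product of P(si | D).
# All numbers are hypothetical placeholders.

priors = {"neonatal pneumonia": 0.2, "other": 0.8}       # prior probabilities
likelihoods = {                                          # P(symptom | disease)
    "neonatal pneumonia": {"fever": 0.7, "weakness": 0.6},
    "other":              {"fever": 0.1, "weakness": 0.3},
}

def posterior(observed_symptoms):
    unnormalized = {}
    for disease, prior in priors.items():
        score = prior
        for symptom in observed_symptoms:
            score *= likelihoods[disease][symptom]
        unnormalized[disease] = score
    z = sum(unnormalized.values())                       # normalizing constant
    return {d: s / z for d, s in unnormalized.items()}

post = posterior(["fever", "weakness"])
print(round(post["neonatal pneumonia"], 3))  # 0.778
```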


Algorithmica ◽  
2020 ◽  
Author(s):  
Benjamin Bergougnoux ◽  
Eduard Eiben ◽  
Robert Ganian ◽  
Sebastian Ordyniak ◽  
M. S. Ramanujan

Abstract In the Directed Feedback Vertex Set (DFVS) problem, the input is a directed graph D and an integer k. The objective is to determine whether there exists a set of at most k vertices intersecting every directed cycle of D. DFVS was shown to be fixed-parameter tractable when parameterized by solution size by Chen et al. (J ACM 55(5):177–186, 2008); since then, the existence of a polynomial kernel for this problem has become one of the largest open problems in the area of parameterized algorithmics. Since this problem has remained open in spite of the best efforts of a number of prominent researchers and pioneers in the field, a natural step forward is to study the kernelization complexity of DFVS parameterized by a natural larger parameter. In this paper, we study DFVS parameterized by the feedback vertex set number of the underlying undirected graph. We provide two main contributions: a polynomial kernel for this problem on general instances, and a linear kernel for the case where the input digraph is embeddable on a surface of bounded genus.
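For intuition about the problem statement, the decision version of DFVS is easy to express directly. The brute-force checker below is a hypothetical, exponential-time sketch for tiny instances, nothing like the kernelization studied in the paper: it tries every vertex set of size at most k and tests whether its removal leaves the digraph acyclic.

```python
# Brute-force check of the DFVS decision problem on tiny instances:
# does some set of at most k vertices intersect every directed cycle of D?
from itertools import combinations

def is_acyclic(vertices, edges):
    # Kahn's algorithm: a digraph is acyclic iff all vertices can be
    # peeled off in topological order.
    indeg = {v: 0 for v in vertices}
    for u, w in edges:
        indeg[w] += 1
    queue = [v for v in vertices if indeg[v] == 0]
    seen = 0
    while queue:
        v = queue.pop()
        seen += 1
        for u, w in edges:
            if u == v:
                indeg[w] -= 1
                if indeg[w] == 0:
                    queue.append(w)
    return seen == len(vertices)

def has_dfvs(vertices, edges, k):
    for r in range(k + 1):
        for deleted in combinations(vertices, r):
            rest = set(vertices) - set(deleted)
            sub = [(u, w) for u, w in edges if u in rest and w in rest]
            if is_acyclic(rest, sub):
                return True
    return False

# A single directed 3-cycle needs exactly one deletion:
edges = [("a", "b"), ("b", "c"), ("c", "a")]
print(has_dfvs(["a", "b", "c"], edges, 0))  # False
print(has_dfvs(["a", "b", "c"], edges, 1))  # True
```

The point of kernelization, as pursued in the paper, is to shrink such an instance to one whose size depends only on the parameter before any exhaustive search is attempted.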


2021 ◽  
Vol 9 ◽  
Author(s):  
Enrique Hernández-Lemus

A random field is the representation of the joint probability distribution for a set of random variables. Markov fields, in particular, have a long-standing tradition as the theoretical foundation of many applications in statistical physics and probability. For strictly positive probability densities, a Markov random field is also a Gibbs field, i.e., a random field supplemented with a measure that implies the existence of a regular conditional distribution. Markov random fields have been used in statistical physics, dating back as far as the Ehrenfests. However, their measure-theoretical foundations were developed much later by Dobruschin, Lanford and Ruelle, as well as by Hammersley and Clifford. Aside from their enormous theoretical relevance, and due to their generality and simplicity, Markov random fields have been used in a broad range of applications in equilibrium and non-equilibrium statistical physics, in non-linear dynamics and ergodic theory, and also in computational molecular biology, ecology, structural biology, computer vision, control theory, complex networks, and data science, to name but a few. Often these applications have been inspired by the original statistical physics approaches. Here we briefly present a modern introduction to the theory of random fields, and then explore and discuss some of the recent applications of random fields in physics, biology, and data science. Our aim is to highlight the relevance of this powerful theoretical aspect of statistical physics and its relation to the broad success of its many interdisciplinary applications.
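The Markov-Gibbs correspondence mentioned above can be illustrated on the smallest nontrivial example, a three-spin Ising chain; the coupling and temperature values below are arbitrary placeholders. The Gibbs measure assigns each spin configuration a probability proportional to exp(-E/T):

```python
# Gibbs measure for a 3-spin Ising chain, a minimal Markov random field:
# P(s) = exp(-E(s)/T) / Z with E(s) = -J * sum_i s_i * s_{i+1}.
# J and T are arbitrary placeholder values.
import math
from itertools import product

J, T = 1.0, 1.0

def energy(spins):
    # nearest-neighbor coupling along the chain
    return -J * sum(a * b for a, b in zip(spins, spins[1:]))

states = list(product((-1, 1), repeat=3))
Z = sum(math.exp(-energy(s) / T) for s in states)  # partition function

def gibbs_prob(spins):
    return math.exp(-energy(spins) / T) / Z

# The measure is normalized, and aligned configurations are most likely:
print(abs(sum(gibbs_prob(s) for s in states) - 1.0) < 1e-12)  # True
print(gibbs_prob((1, 1, 1)) > gibbs_prob((1, -1, 1)))  # True
```

The Markov property shows up here in the energy: each spin interacts only with its neighbors, so its conditional distribution given the rest of the chain depends only on those neighbors.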

