Properly initialized Bayesian Network for decision making leveraging random forest

2020 ◽  
Vol 9 (1) ◽  
pp. 36
Author(s):  
Yutaka Iwakami ◽  
Hironori Takuma ◽  
Motoi Iwashita

A Bayesian network is one of the major methods for probabilistic inference among items. However, when the network contains a particular target node and other explanatory nodes for decision making (for example, selecting appealing keywords that make customers like a product), the edges around the target should be given more weight than those among the other nodes while the network is constructed. To achieve this adjustment, this study proposes configuring an initial state consisting of a few nodes and their edges connected to the target. The initial state is obtained by leveraging random forest, a proven method for decision making. Initial nodes are extracted by measuring the mean decrease in the Gini coefficient calculated from the decision trees of the random forest. Initial edges are designated by comparing the Kullback-Leibler divergences of the conditional probability distributions among nodes that correspond to the edge directions. In an actual experiment, this method is confirmed to be effective for adjusting a Bayesian network for decision making. The approach is especially useful in business settings, such as selecting preferable keywords for promoting products over SNS.
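The abstract does not state the exact rule used to compare Kullback-Leibler divergences, so the following is only a minimal sketch of the ingredients. It computes the divergence between the conditional distributions obtained in each candidate direction for two binary nodes. The joint table `joint`, the helper names, and the idea of comparing the two resulting values are all illustrative assumptions, not the authors' procedure.

```python
import math

def kl_divergence(p, q):
    """D_KL(p || q) for two discrete distributions given as equal-length lists.

    Assumes q is non-zero wherever p is non-zero.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def conditional(joint, given_axis, given_value):
    """Distribution of the other binary variable given one observed variable.

    joint[a][b] = P(A=a, B=b); given_axis 0 conditions on A, 1 conditions on B.
    """
    if given_axis == 0:
        cells = joint[given_value]
    else:
        cells = [joint[0][given_value], joint[1][given_value]]
    total = sum(cells)
    return [c / total for c in cells]

# Hypothetical joint distribution over two binary nodes A and B.
joint = [[0.40, 0.10],
         [0.05, 0.45]]

# How strongly does the distribution of B shift when A changes (candidate A -> B)?
kl_a_to_b = kl_divergence(conditional(joint, 0, 1), conditional(joint, 0, 0))
# How strongly does the distribution of A shift when B changes (candidate B -> A)?
kl_b_to_a = kl_divergence(conditional(joint, 1, 1), conditional(joint, 1, 0))

print("A -> B divergence:", kl_a_to_b)
print("B -> A divergence:", kl_b_to_a)
```

Because the divergence is asymmetric, the two values generally differ, which is what makes a direction-by-direction comparison possible at all.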

2019 ◽  
Author(s):  
Adrian S. Wong ◽  
Kangbo Hao ◽  
Zheng Fang ◽  
Henry D. I. Abarbanel

Abstract. Statistical Data Assimilation (SDA) is the transfer of information from field or laboratory observations to a user-selected model of the dynamical system producing those observations. The data are noisy and the model has errors; the information transfer addresses properties of the conditional probability distribution of the states of the model conditioned on the observations. The quantities of interest in SDA are the conditional expected values of functions of the model state, and these require the approximate evaluation of high-dimensional integrals. We introduce a conditional probability distribution and use the Laplace method with annealing to identify the maxima of the conditional probability distribution. The annealing method slowly increases the precision term of the model as it enters the Laplace method. In this paper, we extend the idea of precision annealing (PA) to Monte Carlo calculations of conditional expected values using Metropolis-Hastings methods.
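The idea of raising the model precision in stages while sampling can be illustrated with a deliberately tiny example. The sketch below is a toy under stated assumptions, not the paper's algorithm: the state is a single scalar, the action is a hypothetical quadratic combining a measurement term (precision `r_m`) and a model term (precision `r_f`), and the schedule `r_f = r_f0 * alpha**k` and all parameter values are made up for illustration.

```python
import math
import random

def action(x, y_obs, model_pred, r_m, r_f):
    """Toy negative log-density: measurement error plus model error."""
    return 0.5 * r_m * (x - y_obs) ** 2 + 0.5 * r_f * (x - model_pred) ** 2

def metropolis_hastings(x0, n_steps, step, r_f, rng,
                        y_obs=1.0, model_pred=0.0, r_m=1.0):
    """Random-walk Metropolis-Hastings sampling of exp(-action)."""
    x = x0
    a = action(x, y_obs, model_pred, r_m, r_f)
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.uniform(-step, step)
        a_prop = action(proposal, y_obs, model_pred, r_m, r_f)
        # Accept downhill moves always, uphill moves with prob exp(a - a_prop).
        if a_prop <= a or rng.random() < math.exp(a - a_prop):
            x, a = proposal, a_prop
        samples.append(x)
    return samples

def precision_annealed_mean(r_f0=0.01, alpha=2.0, n_levels=12,
                            n_steps=4000, seed=0):
    """Raise the model precision level by level, warm-starting each chain."""
    rng = random.Random(seed)
    x = 1.0          # start at the observation, where the model term is weak
    r_f = r_f0
    mean, used_r_f = x, r_f
    for _ in range(n_levels):
        samples = metropolis_hastings(x, n_steps, 0.5, r_f, rng)
        x = samples[-1]                      # warm start for the next level
        mean = sum(samples) / len(samples)   # estimate of E[x | y] at this level
        used_r_f = r_f
        r_f *= alpha
    return mean, used_r_f

estimate, final_r_f = precision_annealed_mean()
print("estimated conditional mean:", estimate, "at model precision", final_r_f)
```

For this quadratic toy the exact conditional mean is `r_m * y_obs / (r_m + r_f)`, which gives a simple check on the sampler at each level.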


Author(s):  
Yang Xiang

Graphical models such as Bayesian networks (BNs) (Pearl, 1988) and decomposable Markov networks (DMNs) (Xiang, Wong & Cercone, 1997) have been applied widely to probabilistic reasoning in intelligent systems. Figure 1 illustrates a BN and a DMN on a trivial uncertain domain: A virus can damage computer files, and so can a power glitch. A power glitch also causes a VCR to reset. The BN in (a) has four nodes, corresponding to four binary variables taking values from {true, false}. The graph structure encodes a set of dependence and independence assumptions (e.g., that f is directly dependent on v and p but is independent of r once the value of p is known). Each node is associated with a conditional probability distribution conditioned on its parent nodes (e.g., P(f | v, p)). The joint probability distribution is the product P(v, p, f, r) = P(f | v, p) P(r | p) P(v) P(p). The DMN in (b) has two groups of nodes that are maximally pair-wise connected, called cliques. Each clique is associated with a probability distribution (e.g., clique {v, p, f} is assigned P(v, p, f)). The joint probability distribution is P(v, p, f, r) = P(v, p, f) P(r, p) / P(p), where P(p) can be derived from one of the clique distributions. The networks, for instance, can be used to reason about whether there are viruses in the computer system, after observations on f and r are made.
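The two factorizations above can be checked numerically. The following sketch builds the joint distribution from the BN factors, derives the two clique distributions from it, and confirms that the DMN product reproduces the same joint. All probability values are hypothetical, chosen only for illustration; the variable names v, p, f, r follow the text.

```python
from itertools import product

STATES = [True, False]

# Hypothetical parameters (not from the source).
p_v = {True: 0.1, False: 0.9}                    # P(v)
p_p = {True: 0.2, False: 0.8}                    # P(p)
p_f_true = {(True, True): 0.95, (True, False): 0.9,
            (False, True): 0.5, (False, False): 0.01}   # P(f=true | v, p)
p_r_true = {True: 0.8, False: 0.05}              # P(r=true | p)

def bernoulli(prob_true, value):
    """Probability of a binary value given the probability of 'true'."""
    return prob_true if value else 1.0 - prob_true

def joint_bn(v, p, f, r):
    """BN factorization: P(f | v, p) P(r | p) P(v) P(p)."""
    return (bernoulli(p_f_true[(v, p)], f) * bernoulli(p_r_true[p], r)
            * p_v[v] * p_p[p])

# Clique distributions of the DMN, obtained by marginalizing the joint.
clique_vpf = {(v, p, f): sum(joint_bn(v, p, f, r) for r in STATES)
              for v, p, f in product(STATES, repeat=3)}
clique_rp = {(r, p): sum(joint_bn(v, p, f, r)
                         for v, f in product(STATES, repeat=2))
             for r, p in product(STATES, repeat=2)}
# P(p) derived from one of the clique distributions.
marginal_p = {p: sum(clique_rp[(r, p)] for r in STATES) for p in STATES}

def joint_dmn(v, p, f, r):
    """DMN factorization: P(v, p, f) P(r, p) / P(p)."""
    return clique_vpf[(v, p, f)] * clique_rp[(r, p)] / marginal_p[p]

def posterior_virus(f_obs, r_obs):
    """P(v=true | f, r), summing out the unobserved power glitch p."""
    numerator = sum(joint_bn(True, p, f_obs, r_obs) for p in STATES)
    evidence = sum(joint_bn(v, p, f_obs, r_obs)
                   for v, p in product(STATES, repeat=2))
    return numerator / evidence

print("P(virus | files damaged, VCR not reset):", posterior_virus(True, False))
print("P(virus | files damaged, VCR reset):    ", posterior_virus(True, True))
```

With these made-up numbers, observing a VCR reset lowers the probability of a virus, because the reset makes a power glitch the more likely explanation for the damaged files.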


1977 ◽  
Vol 80 (1) ◽  
pp. 99-128 ◽  
Author(s):  
Hiroji Nakagawa ◽  
Iehisa Nezu

In this paper we intend to predict the magnitude of the contribution to the Reynolds stress of bursting events: ‘ejections’, ‘sweeps’, ‘inward interactions’ and ‘outward interactions’. We shall do this by making use of the conditional probability distribution of the Reynolds stress −uv, which can be derived by applying the cumulant-discard method to the Gram-Charlier probability distribution of the two variables u and v. The Reynolds-stress fluctuations in open-channel flows over smooth and rough beds are measured by dual-sensor hot-film anemometers, whose signals are conditionally sampled and sorted into the four quadrants of the u, v plane by using a high-speed digital data processing system. We shall verify that even the third-order conditional probability distribution of the Reynolds stress shows fairly good agreement with the experimental results and that the sequence of events in the bursting process, i.e. ejections, sweeps and interactions, is directly related to the turbulent energy budget in the form of turbulent diffusion. Also, we shall show that the roughness effect is marked in the area from the wall to the middle of the equilibrium region, and that sweeps appear to be more important than ejections as the roughness increases and as the distance from the wall decreases.
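The sorting of signals into quadrants of the u, v plane can be sketched as follows. The quadrant labels use the usual convention for quadrant analysis (ejection: u < 0, v > 0; sweep: u > 0, v < 0; outward interaction: u > 0, v > 0; inward interaction: u < 0, v < 0), which is assumed here rather than quoted from the paper. The sample values and the function name are made up, and no threshold ("hole") is applied.

```python
def quadrant_contributions(u, v):
    """Split the mean Reynolds stress -uv into per-quadrant contributions.

    u, v: equal-length sequences of velocity fluctuations (means removed).
    Returns a dict whose values sum to the overall mean of -u*v.
    """
    totals = {
        "outward interaction": 0.0,   # u > 0, v > 0
        "ejection": 0.0,              # u < 0, v > 0
        "inward interaction": 0.0,    # u < 0, v < 0
        "sweep": 0.0,                 # u > 0, v < 0
    }
    for ui, vi in zip(u, v):
        if ui == 0 or vi == 0:
            continue  # on an axis: contributes nothing to -uv
        if ui > 0 and vi > 0:
            key = "outward interaction"
        elif ui < 0 and vi > 0:
            key = "ejection"
        elif ui < 0 and vi < 0:
            key = "inward interaction"
        else:
            key = "sweep"
        totals[key] += -ui * vi
    n = len(u)
    return {name: total / n for name, total in totals.items()}

# Hypothetical fluctuation samples, for illustration only.
u_samples = [-1.0, 0.8, 0.5, -0.4, -1.2, 1.0]
v_samples = [0.6, -0.5, 0.2, -0.1, 0.9, -0.7]

for name, value in quadrant_contributions(u_samples, v_samples).items():
    print(f"{name}: {value:.3f}")
```

Ejections and sweeps contribute positively to −uv, while the two interaction quadrants contribute negatively, so the four values always add up to the mean Reynolds stress.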


2010 ◽  
Vol 224 (2) ◽  
pp. 74-86 ◽  
Author(s):  
Emily R. Stirk ◽  
Grant Lythe ◽  
Hugo A. van den Berg ◽  
Gareth A.D. Hurst ◽  
Carmen Molina-París

Author(s):  
Nick Verheul ◽  
Jan Viebahn ◽  
Daan Crommelin

Abstract. In this study we investigate a covariate-based stochastic approach to parameterize unresolved turbulent processes within a standard model of the idealised, wind-driven ocean circulation. We focus on vertical instead of horizontal coarse-graining, such that we avoid the subtle difficulties of horizontal coarse-graining. The corresponding eddy forcing is uniquely defined and has a clear physical interpretation related to baroclinic instability. We propose to emulate the baroclinic eddy forcing by sampling from the conditional probability distribution functions of the eddy forcing obtained from the baroclinic reference model data. These conditional probability distribution functions are approximated here by sampling uniformly from discrete reference values. We analyze in detail the different performances of the stochastic parameterization dependent on whether the eddy forcing is conditioned on a suitable flow-dependent covariate or on a time-lagged covariate or on both. The results demonstrate that our non-Gaussian, non-linear methodology is able to accurately reproduce the first four statistical moments and spatial/temporal correlations of the stream function, energetics, and enstrophy of the reference baroclinic model.
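Approximating a conditional distribution by sampling uniformly from discrete reference values can be sketched as a binned lookup table. This is a minimal sketch under assumptions: the covariate is a single scalar, bins have a fixed width, an empty bin falls back to the nearest populated one, and the function names and data are hypothetical rather than taken from the study.

```python
import math
import random
from collections import defaultdict

def build_conditional_table(covariates, forcings, bin_width):
    """Group reference forcing values by the bin of their covariate."""
    table = defaultdict(list)
    for c, f in zip(covariates, forcings):
        table[math.floor(c / bin_width)].append(f)
    return dict(table)

def sample_forcing(table, covariate, bin_width, rng):
    """Draw one forcing value uniformly from the reference values in the bin.

    If the covariate falls in a bin with no reference data, the nearest
    populated bin is used instead (an assumption of this sketch).
    """
    key = math.floor(covariate / bin_width)
    if key not in table:
        key = min(table, key=lambda k: abs(k - key))
    return rng.choice(table[key])

# Hypothetical reference data: (covariate, eddy forcing) pairs.
ref_covariates = [0.1, 0.2, 0.7, 1.5, 1.7, 1.9]
ref_forcings = [10.0, 11.0, 9.5, 20.0, 21.0, 19.0]

table = build_conditional_table(ref_covariates, ref_forcings, bin_width=1.0)
rng = random.Random(0)
print("forcing sampled for covariate 1.2:", sample_forcing(table, 1.2, 1.0, rng))
```

Because the draws come directly from stored reference values, the emulated forcing is not restricted to a Gaussian shape; conditioning on a time-lagged covariate as well would amount to using a two-component bin key.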

