Bayesian Maximal Information Coefficient (BMIC) to Reason Novel Trends in Large Datasets

Author(s):  
Shuliang Wang ◽  
Tisinee Surapunt

Abstract A Bayesian network (BN) is a probabilistic inference model that describes explicit cause-and-effect relationships, which makes it suitable for examining a complex system such as rice pricing under data uncertainty. However, discovering the optimal structure among a super-exponential number of graphs in the search space is an NP-hard problem. In this paper, the Bayesian maximal information coefficient (BMIC) is proposed to uncover causal correlations in a large dataset from a random system by integrating a probabilistic graphical model (PGM) and the maximal information coefficient (MIC) with Bayesian linear regression (BLR). First, MIC captures strong dependence between predictor variables and a target variable, which reduces the number of variables for BN structure learning in the PGM. Second, BLR assigns edge orientation in the graph from a posterior probability distribution. This matches what a BN requires: a conditional probability distribution for each node given its parents, obtained via Bayes' theorem. Third, the Bayesian information criterion (BIC) is used as an indicator of how well a model explains its data, as a check on correctness. The proposed method obtains the highest BIC score compared with two traditional learning algorithms. Finally, BMIC is applied to a large dataset on Thai rice prices to discover causal correlations by identifying causality changes in the paddy price of Jasmine rice. The experimental results show that BMIC returns directional relationships that give clues for identifying the cause(s) and effect(s) of the paddy price, with a better heuristic search.
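The BIC comparison mentioned in the abstract can be illustrated with a minimal sketch. It assumes a linear-Gaussian node and the "higher is better" form of the score (log-likelihood minus a complexity penalty), which matches the abstract's "highest score" wording; the function name `gaussian_bic_score` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_bic_score(y, parents):
    """BIC score (higher is better) of a linear-Gaussian node given its parents.

    y       : shape (n,) values of the child variable
    parents : shape (n, p) values of candidate parents; p may be 0
    """
    n = len(y)
    # Intercept column plus one column per parent.
    design = np.column_stack([np.ones(n), parents])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / n  # maximum-likelihood noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = design.shape[1] + 1  # regression coefficients plus the variance
    return log_lik - 0.5 * k * np.log(n)


# Synthetic data (assumption): y depends on x1 only, x2 is irrelevant.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)

score_empty = gaussian_bic_score(y, np.empty((n, 0)))
score_true = gaussian_bic_score(y, x1[:, None])
score_both = gaussian_bic_score(y, np.column_stack([x1, x2]))
print(score_empty, score_true, score_both)
```

Adding the true parent raises the score sharply, while an irrelevant extra parent can gain at most a small amount of likelihood against a fixed penalty of `0.5 * log(n)`, so it typically scores lower than the true structure.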

2021 ◽  
Author(s):  
Shuliang Wang ◽  
Tisinee Surapunt

Abstract A Bayesian network (BN) is a probabilistic inference model that describes explicit cause-and-effect relationships, which makes it suitable for examining a complex system such as rice pricing under data uncertainty. However, discovering the optimal structure among a super-exponential number of graphs in the search space is an NP-hard problem. In this paper, the Bayesian maximal information coefficient (BMIC) is proposed to uncover causal correlations in a large dataset from a random system by integrating a probabilistic graphical model (PGM) and the maximal information coefficient (MIC) with Bayesian linear regression (BLR). First, MIC captures strong dependence between predictor variables and a target variable, reducing the number of variables during Bayesian network structure learning. Second, BLR assigns edge orientation in the graph, resulting in a posterior probability distribution. This matches what a BN requires: a conditional probability distribution for each node given its parents, obtained via Bayes’ theorem. Third, the Bayesian information criterion (BIC) is used as an indicator of how well a model explains its data, as a check on correctness. The proposed method obtains the highest BIC score compared with two traditional learning algorithms. Finally, BMIC is applied to a large dataset on Thai rice prices to discover causal correlations by identifying the causality change in the paddy price of Jasmine rice. The experimental results show that BMIC returns directed relationships that give a clue for identifying the cause(s) and effect(s) of the paddy price, with a better heuristic search.
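The posterior distribution that Bayesian linear regression produces can be sketched as follows. The abstract does not spell out how BMIC turns this posterior into an edge direction, so the sketch covers only the textbook conjugate case: a zero-mean Gaussian prior on the weights and a known noise variance. The function name `blr_posterior`, its parameters, and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def blr_posterior(X, y, noise_var=1.0, prior_var=10.0):
    """Posterior mean and covariance of the weights w in y = X w + noise.

    Assumes w ~ N(0, prior_var * I) and noise ~ N(0, noise_var),
    for which the posterior over w is Gaussian in closed form.
    """
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ (X.T @ y) / noise_var
    return mean, cov


# Synthetic data (assumption): only the first predictor affects y.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 2))
true_w = np.array([1.5, 0.0])
y = X @ true_w + rng.normal(scale=1.0, size=n)

mean, cov = blr_posterior(X, y)
std = np.sqrt(np.diag(cov))
print("posterior mean:", mean)
print("posterior std: ", std)
```

A weight whose posterior mass sits far from zero indicates a dependence worth keeping as an edge, whereas a posterior centered near zero suggests the predictor adds little once the others are accounted for.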


2014 ◽  
Vol 111 (33) ◽  
pp. E3362-E3363 ◽  
Author(s):  
D. N. Reshef ◽  
Y. A. Reshef ◽  
M. Mitzenmacher ◽  
P. C. Sabeti

2013 ◽  
Vol 24 (5) ◽  
pp. 845-852 ◽  
Author(s):  
Shih-Chang Lee ◽  
Ning-Ning Pang ◽  
Wen-Jer Tzeng

Author(s):  
Munir S Pathan ◽  
S M Pradhan ◽  
T Palani Selvam

Abstract In this study, a Bayesian probabilistic approach is applied to estimate the actual dose from the personnel monitoring dose records of occupational workers. To implement the Bayesian approach, the probability distribution of the uncertainty in the reported dose as a function of the actual dose is derived. Using this uncertainty distribution of the reported dose and prior knowledge of the dose levels generally observed in a monitoring period, the posterior probability distribution of the actual dose is estimated. The posterior distributions of all monitoring periods in a year are convolved to arrive at the actual annual dose distribution. The estimated actual dose distributions show a significant deviation from the reported annual doses, particularly for low annual doses.
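The two steps described in this abstract, a per-period posterior followed by a convolution over the year, can be sketched on a discrete dose grid. Everything numeric here is an assumption made for illustration: the exponential prior, the Gaussian measurement model whose spread grows with dose, the grid, and the reported values are not taken from the study. The convolution step also assumes that the monitoring periods are independent.

```python
import numpy as np

STEP = 0.05                   # grid spacing in mSv (hypothetical)
dose = STEP * np.arange(101)  # candidate actual doses, 0 to 5 mSv


def period_posterior(reported, rel_sigma=0.3, floor_sigma=0.05, prior_scale=0.5):
    """Posterior over the actual dose for one monitoring period.

    Prior: exponential, favouring low doses (assumed).
    Likelihood: reported ~ Normal(actual, sigma(actual)) (assumed).
    """
    prior = np.exp(-dose / prior_scale)
    sigma = floor_sigma + rel_sigma * dose
    likelihood = np.exp(-0.5 * ((reported - dose) / sigma) ** 2) / sigma
    post = prior * likelihood
    return post / post.sum()


# Hypothetical reported doses for four monitoring periods in one year.
reported = [0.10, 0.20, 0.05, 0.15]

# The distribution of a sum of independent doses is the convolution
# of the individual distributions.
annual = np.array([1.0])
for r in reported:
    annual = np.convolve(annual, period_posterior(r))

annual_grid = STEP * np.arange(len(annual))
annual_mean = float(annual_grid @ annual)
print("sum of reported doses:", sum(reported))
print("posterior mean annual dose:", annual_mean)
```

Because every per-period distribution lives on a grid that starts at zero with the same spacing, index `k` of the convolved array corresponds to an annual dose of `k * STEP`.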


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Maria Sole Morelli ◽  
Alberto Greco ◽  
Gaetano Valenza ◽  
Alberto Giannoni ◽  
Michele Emdin ◽  
...  

2020 ◽  
Vol 09 (04) ◽  
pp. 2050017 ◽  
Author(s):  
Benjamin D. Donovan ◽  
Randall L. McEntaffer ◽  
Casey T. DeRoo ◽  
James H. Tutt ◽  
Fabien Grisé ◽  
...  

The soft X-ray grating spectrometer on board the Off-plane Grating Rocket Experiment (OGRE) hopes to achieve the highest resolution soft X-ray spectrum of an astrophysical object when it is launched via suborbital rocket. Paramount to the success of the spectrometer is the performance of the [Formula: see text] reflection gratings populating its reflection grating assembly. To test current grating fabrication capabilities, a grating prototype for the payload was fabricated via electron-beam lithography at The Pennsylvania State University’s Materials Research Institute and was subsequently tested for performance at the Max Planck Institute for Extraterrestrial Physics’ PANTER X-ray Test Facility. Bayesian modeling of the resulting data via Markov chain Monte Carlo (MCMC) sampling indicated that the grating achieved the OGRE single-grating resolution requirement of [Formula: see text] at the 94% confidence level. The resulting [Formula: see text] posterior probability distribution suggests that this confidence level is likely a conservative estimate, however, since only a finite [Formula: see text] parameter space was sampled and the model could not constrain the upper bound of [Formula: see text] to less than infinity. Raytrace simulations of the tested system found that the observed data can be reproduced with a grating performing at [Formula: see text]. It is therefore postulated that the behavior of the obtained [Formula: see text] posterior probability distribution can be explained by a finite measurement limit of the system and not a finite limit on [Formula: see text]. Implications of these results and improvements to the test setup are discussed.

