INDEPENDENCE IN COMPLETE AND INCOMPLETE CAUSAL NETWORKS UNDER MAXIMUM ENTROPY

Author(s):  
MICHAEL J. MARKHAM

In an expert system having a consistent set of linear constraints, it is known that the Method of Tribus may be used to determine a probability distribution which exhibits maximised entropy. The method is extended here to include independence constraints (Accommodation). The paper discusses this extension and its limitations, then advances a technique for determining a small set of independencies which can be added to the linear constraints required in a particular representation of an expert system called a causal network, so that the Maximum Entropy and Causal Networks methodologies give matching distributions (Emulation). This technique may also be applied in cases where no initial independencies are given and the linear constraints are incomplete, in order to provide an optimal ME fill-in for the missing information.
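The core construction the abstract relies on, entropy maximisation subject to linear constraints, admits a well-known exponential-family solution. As a minimal illustration (not the paper's own algorithm, and `maxent_given_mean` is a hypothetical name), the sketch below finds the maximum-entropy distribution over a small outcome set subject to a single mean constraint, solving for the Lagrange multiplier by bisection:

```python
import math

def maxent_given_mean(xs, target_mean, tol=1e-10):
    """Maximum-entropy distribution on outcomes xs subject to E[X] = target_mean.
    The solution has the exponential form p_i ∝ exp(lam * x_i); since the mean
    is increasing in lam, we can find lam by bisection."""
    def mean_for(lam):
        w = [math.exp(lam * x) for x in xs]
        z = sum(w)
        return sum(x * wi for x, wi in zip(xs, w)) / z

    lo, hi = -50.0, 50.0  # assumes the target mean is reachable in this range
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_for(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    w = [math.exp(lam * x) for x in xs]
    z = sum(w)
    return [wi / z for wi in w]

# e.g. the flattest distribution on {1, 2, 3} with mean 2.5
p = maxent_given_mean([1, 2, 3], 2.5)
```

With several simultaneous linear constraints the same exponential form holds with one multiplier per constraint, but the one-dimensional bisection must be replaced by a multivariate solver.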

Author(s):  
MICHAEL J. MARKHAM ◽  
PAUL C. RHODES

The desire to use Causal Networks as Expert Systems even when the causal information is incomplete and/or when non-causal information is available has led researchers to look into the possibility of utilising Maximum Entropy. If this approach is taken, the known information is supplemented by maximising entropy to provide a unique initial probability distribution which would otherwise have been a consequence of the known information and the independence relationships implied by the network. Traditional maximising techniques can be used if the constraints are linear but the independence relationships give rise to non-linear constraints. This paper extends traditional maximising techniques to incorporate those types of non-linear constraints that arise from the independence relationships and presents an algorithm for implementing the extended method. Maximising entropy does not involve the concept of "causal" information. Consequently, the extended method will accept any mutually consistent set of conditional probabilities and expressions of independence. The paper provides a small example of how this property can be used to provide complete causal information, for use in a causal network, when the known information is incomplete and not in a causal form.
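The independence relationships the abstract mentions are non-linear in the joint probabilities (e.g. p(a, b) = p(a)p(b) is a product of unknowns). As a toy sketch of how such constraints can be handled alongside linear ones (this is not the authors' algorithm; `maxent_with_independence` is a hypothetical name), the following alternates an IPF-style rescaling for the linear marginal constraints with a projection onto the independent distributions, i.e. the product of the current marginals:

```python
def maxent_with_independence(pA1, pB1, iters=100):
    """Toy 2x2 example for binary A, B: enforce the linear constraints
    p(A=1) = pA1 and p(B=1) = pB1 together with the non-linear
    independence constraint p(a, b) = p(a) p(b)."""
    p = [[0.25, 0.25], [0.25, 0.25]]  # rows indexed by A, columns by B
    pA = [1 - pA1, pA1]
    pB = [1 - pB1, pB1]
    for _ in range(iters):
        # IPF-style step: rescale rows to the target p(A), then columns to p(B)
        for a in range(2):
            s = p[a][0] + p[a][1]
            p[a] = [p[a][b] * pA[a] / s for b in range(2)]
        for b in range(2):
            s = p[0][b] + p[1][b]
            for a in range(2):
                p[a][b] *= pB[b] / s
        # independence step: replace the joint by the product of its marginals
        mA = [p[a][0] + p[a][1] for a in range(2)]
        mB = [p[0][b] + p[1][b] for b in range(2)]
        p = [[mA[a] * mB[b] for b in range(2)] for a in range(2)]
    return p
```

For this toy problem the feasible point is unique (the outer product of the two marginals), so the iteration only illustrates the mechanics; with richer constraint sets the non-linear terms genuinely complicate the maximisation, which is the difficulty the paper addresses.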


1984 ◽  
Vol R-33 (4) ◽  
pp. 353-357 ◽  
Author(s):  
James E. Miller ◽  
Richard W. Kulp ◽  
George E. Orr

1990 ◽  
Vol 27 (2) ◽  
pp. 303-313 ◽  
Author(s):  
Claudine Robert

The maximum entropy principle is used to model uncertainty by a maximum entropy distribution, subject to some appropriate linear constraints. We give an entropy concentration theorem (whose proof is based on large deviation techniques), which provides a mathematical justification of this statistical modelling principle. We then indicate how it can be used in artificial intelligence, and how relevant prior knowledge is provided by some classical descriptive statistical methods. Furthermore, the maximum entropy principle leads to a natural connection between descriptive methods and some statistical structures.
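For concreteness, the linear-constraint setting referred to here is the standard one: maximise entropy subject to prescribed expectations, whose solution is an exponential family (a standard result, not specific to this paper):

```latex
\max_{p}\; -\sum_i p_i \log p_i
\quad \text{subject to} \quad
\sum_i p_i = 1, \qquad
\sum_i p_i \, f_j(x_i) = F_j \quad (j = 1, \dots, k),
```

with maximiser

```latex
p_i \;=\; \exp\!\Big( -\lambda_0 - \sum_{j=1}^{k} \lambda_j \, f_j(x_i) \Big),
```

where the Lagrange multipliers \(\lambda_0, \lambda_1, \dots, \lambda_k\) are chosen so that the constraints hold. The concentration theorem then says that, among distributions satisfying the constraints, those with entropy appreciably below the maximum become exponentially rare.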


2014 ◽  
Vol 953-954 ◽  
pp. 458-461 ◽  
Author(s):  
Yi Hui Zhang

Power output from wind turbines is mainly related to the wind speed. Owing to the uncertainty and intermittency of the wind, and to wake effects between units within a wind farm, wind power fluctuates. Based on field measurements, it is found that the t location-scale distribution is suitable for identifying the probability distribution of wind power variations. By analysing the fluctuation of a single unit over different time intervals, we find that the distribution of wind power fluctuation possesses a certain trend pattern. As the length of the time window increases, the fluctuations increase and some information is missed. We define an index to calculate the quantity of missing information, which can be used to evaluate whether a certain interval length is acceptable.
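The paper's index is not reproduced in this abstract. As a hedged illustration only, one plausible formalisation measures the share of fine-scale (step-1) variation that is no longer visible when fluctuations are computed over a coarser window; both function names below are hypothetical, not the authors':

```python
def fluctuations(series, window):
    """Power changes over successive non-overlapping intervals of the given
    length (e.g. differences between readings `window` steps apart)."""
    return [series[i + window] - series[i]
            for i in range(0, len(series) - window, window)]

def missing_information_index(series, window):
    """Hypothetical index: fraction of the step-1 total variation that
    disappears when fluctuations are measured at the coarser `window`.
    0 means nothing is lost; 1 means all fine-scale variation cancels out."""
    fine = sum(abs(d) for d in fluctuations(series, 1))
    coarse = sum(abs(d) for d in fluctuations(series, window))
    return 1 - coarse / fine if fine else 0.0
```

On a rapidly alternating series, a window of 2 hides all of the variation (index 1), matching the abstract's observation that longer windows miss information.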


2021 ◽  
Vol 118 (40) ◽  
pp. e2025782118
Author(s):  
Wei-Chia Chen ◽  
Juannan Zhou ◽  
Jason M. Sheltzer ◽  
Justin B. Kinney ◽  
David M. McCandlish

Density estimation in sequence space is a fundamental problem in machine learning that is also of great importance in computational biology. Due to the discrete nature and large dimensionality of sequence space, how best to estimate such probability distributions from a sample of observed sequences remains unclear. One common strategy for addressing this problem is to estimate the probability distribution using maximum entropy (i.e., calculating point estimates for some set of correlations based on the observed sequences and predicting the probability distribution that is as uniform as possible while still matching these point estimates). Building on recent advances in Bayesian field-theoretic density estimation, we present a generalization of this maximum entropy approach that provides greater expressivity in regions of sequence space where data are plentiful while still maintaining a conservative maximum entropy character in regions of sequence space where data are sparse or absent. In particular, we define a family of priors for probability distributions over sequence space with a single hyperparameter that controls the expected magnitude of higher-order correlations. This family of priors then results in a corresponding one-dimensional family of maximum a posteriori estimates that interpolate smoothly between the maximum entropy estimate and the observed sample frequencies. To demonstrate the power of this method, we use it to explore the high-dimensional geometry of the distribution of 5′ splice sites found in the human genome and to understand patterns of chromosomal abnormalities across human cancers.
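The one-dimensional family of estimates described above can be caricatured with a simple pseudocount blend: a single parameter moves the estimate from the raw sample frequencies toward a maximum-entropy reference. The actual method uses a field-theoretic prior over sequence space, not Dirichlet pseudocounts, so the sketch below (with the hypothetical name `interpolate_estimate`) only conveys the interpolation behaviour:

```python
def interpolate_estimate(counts, maxent_probs, alpha):
    """Blend observed counts with a maximum-entropy reference distribution.
    alpha = 0 returns the raw sample frequencies; as alpha grows, the
    estimate shrinks toward maxent_probs."""
    n = sum(counts)
    return [(c + alpha * q) / (n + alpha)
            for c, q in zip(counts, maxent_probs)]

# 8 and 2 observations of two outcomes, uniform maximum-entropy reference
raw = interpolate_estimate([8, 2], [0.5, 0.5], 0)        # sample frequencies
smoothed = interpolate_estimate([8, 2], [0.5, 0.5], 10)  # pulled toward uniform
```

In the paper the analogous hyperparameter controls the expected magnitude of higher-order correlations, and the resulting MAP estimates interpolate smoothly between the two endpoints in the same qualitative way.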


2013 ◽  
Vol 42 (5) ◽  
pp. 2803-2819 ◽  
Author(s):  
Chien-Hua Peng ◽  
Yi-Zhi Jiang ◽  
An-Shun Tai ◽  
Chun-Bin Liu ◽  
Shih-Chi Peng ◽  
...  

Deciphering the causal networks of gene interactions is critical for identifying disease pathways and disease-causing genes. We introduce a method to reconstruct causal networks based on exploring phenotype-specific modules in the human interactome and including the expression quantitative trait loci (eQTLs) that underlie the joint expression variation of each module. Closely associated eQTLs help anchor the orientation of the network. To overcome the inherent computational complexity of causal network reconstruction, we first deduce the local causality of individual subnetworks using the selected eQTLs and module transcripts. These subnetworks are then integrated to infer a global causal network using a random-field ranking method, which was motivated by animal sociology. We demonstrate how effectively the inferred causality restores the regulatory structure of the networks that mediate lymph node metastasis in oral cancer. Network rewiring clearly characterizes the dynamic regulatory systems of distinct disease states. This study is the first to associate an RXRB-causal network with increased risks of nodal metastasis, tumor relapse, distant metastases and poor survival for oral cancer. Thus, identifying crucial upstream drivers of a signal cascade can facilitate the discovery of potential biomarkers and effective therapeutic targets.


2020 ◽  
Author(s):  
Wei-Chia Chen ◽  
Juannan Zhou ◽  
Jason M Sheltzer ◽  
Justin B Kinney ◽  
David M McCandlish

Density estimation in sequence space is a fundamental problem in machine learning that is of great importance in computational biology. Due to the discrete nature and large dimensionality of sequence space, how best to estimate such probability distributions from a sample of observed sequences remains unclear. One common strategy for addressing this problem is to estimate the probability distribution using maximum entropy, i.e. calculating point estimates for some set of correlations based on the observed sequences and predicting the probability distribution that is as uniform as possible while still matching these point estimates. Building on recent advances in Bayesian field-theoretic density estimation, we present a generalization of this maximum entropy approach that provides greater expressivity in regions of sequence space where data are plentiful while still maintaining a conservative maximum entropy character in regions of sequence space where data are sparse or absent. In particular, we define a family of priors for probability distributions over sequence space with a single hyperparameter that controls the expected magnitude of higher-order correlations. This family of priors then results in a corresponding one-dimensional family of maximum a posteriori estimates that interpolate smoothly between the maximum entropy estimate and the observed sample frequencies. To demonstrate the power of this method, we use it to explore the high-dimensional geometry of the distribution of 5′ splice sites found in the human genome and to understand the accumulation of chromosomal abnormalities during cancer progression.

