Conditional Probability: Recently Published Documents

TOTAL DOCUMENTS: 1234 (209 in the last five years)

H-INDEX: 49 (5 in the last five years)

Author(s): Paula Hatum, Kathryn McMahon, Kerrie Mengersen, Paul Wu

Ecological models are extensively and increasingly used in support of environmental policy and decision making. Dynamic Bayesian Networks (DBNs) have been demonstrated to be a valuable conservation tool, providing a systematic and intuitive approach to integrating data and other critical information to help guide the decision-making process. However, data for a new ecosystem are often sparse. In this case, a general DBN developed for similar ecosystems could be applicable, but this may require the adaptation of key elements of the network. The research presented in this paper focused on a case study to identify and implement guidelines for model adaptation. We adapted a general DBN of a seagrass ecosystem to a new location where the nodes were similar but the conditional probability tables varied. We focused on two species of seagrass (Zostera noltei and Z. marina) located in Arcachon Bay, France. Expert knowledge was used to complement peer-reviewed literature to identify which components needed adjustment, including parameterisation and quantification of the model and desired outcomes. We adopted both linguistic labels and scenario-based elicitation to elicit from experts the conditional probabilities used to quantify the DBN. Following the proposed guidelines, the model structure of the DBN was retained, but the conditional probability tables were adapted for nodes that characterise the growth dynamics of the Zostera spp. populations in Arcachon Bay, as well as the seasonal variation in their reproduction. Particular attention was paid to the light variable, as it is a crucial driver of growth and physiology for seagrasses. Our guidelines provide a way to adapt a general DBN to specific ecosystems to maximise model reuse and minimise re-development effort. Especially important from a transferability perspective are guidelines for ecosystems with limited data, and for how simulation and prior predictive approaches can be used in these contexts.
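The adaptation strategy described above, keeping the graph fixed while replacing the conditional probability tables, can be illustrated with a minimal sketch. The node names, states, and probability values below are hypothetical placeholders, not the paper's elicited values.

```python
# Minimal sketch of "retain structure, adapt CPTs" for a discrete Bayesian
# network. Node names, states, and probabilities are hypothetical, not the
# paper's expert-elicited values.

# Structure (shared across sites): Light -> Growth
structure = {"Growth": ["Light"]}  # parents of each node
states = {"Light": ["low", "high"], "Growth": ["decline", "stable", "expand"]}

# CPT for the original site: P(Growth | Light)
cpt_original = {
    ("low",):  {"decline": 0.6, "stable": 0.3, "expand": 0.1},
    ("high",): {"decline": 0.2, "stable": 0.4, "expand": 0.4},
}

# Expert-elicited CPT for the new site (same structure, different numbers)
cpt_new_site = {
    ("low",):  {"decline": 0.5, "stable": 0.4, "expand": 0.1},
    ("high",): {"decline": 0.1, "stable": 0.4, "expand": 0.5},
}

def check_cpt(cpt):
    """Each row of a conditional probability table must sum to 1."""
    for parent_state, row in cpt.items():
        total = sum(row.values())
        assert abs(total - 1.0) < 1e-9, f"Row {parent_state} sums to {total}"

for cpt in (cpt_original, cpt_new_site):
    check_cpt(cpt)

# The adaptation step: swap the CPT while the graph stays fixed
model = {"structure": structure, "cpt": {"Growth": cpt_new_site}}
print(model["cpt"]["Growth"][("high",)])
```

In the same spirit, scenario-based elicitation would fill each CPT row by asking experts about concrete parent-state scenarios rather than abstract probabilities.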


2022, Vol. 14 (1), pp. 327-357
Author(s): Michael Geruso, Dean Spears, Ishaana Talesara

Inversions, in which the popular vote winner loses the election, have occurred in four US presidential races. We show that, rather than being statistical flukes, inversions have been ex ante likely since the early 1800s. In elections yielding a popular vote margin within 1 point (one-eighth of presidential elections), about 40 percent will be inversions in expectation. We show that this conditional probability is remarkably stable across historical periods, despite differences in which groups voted, which states existed, and which parties participated. Our findings imply that the United States has experienced so few inversions merely because there have been so few elections (and fewer close elections). (JEL D72, N41, N42)
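The paper's key conditional quantity, the probability of an inversion given a close popular vote, can be illustrated with a toy Monte Carlo model. The state populations, electoral weights, and noise scales below are invented for illustration and are unrelated to the paper's historical estimation strategy.

```python
# Toy Monte Carlo estimate of P(electoral inversion | close popular vote).
# All parameters here are made up for illustration; this is not the paper's
# historical model.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_sims = 51, 100_000

# Hypothetical state populations (arbitrary units); winner-take-all electoral
# votes proportional to population plus a flat 2-seat bonus per state.
pop = rng.uniform(1, 40, n_states)
ev = np.maximum(1, np.round(pop)).astype(int) + 2

# Persistent state partisan leans, drawn once across all simulated elections
lean = rng.normal(0, 0.05, n_states)

inversions = close = 0
for _ in range(n_sims):
    swing = rng.normal(0, 0.03)                      # national swing
    share = 0.5 + lean + swing + rng.normal(0, 0.02, n_states)
    pop_share = np.average(share, weights=pop)       # popular vote share
    if abs(pop_share - 0.5) < 0.005:                 # margin within 1 point
        close += 1
        ec_share = ev[share > 0.5].sum() / ev.sum()  # electoral vote share
        if (pop_share > 0.5) != (ec_share > 0.5):    # winners disagree
            inversions += 1

print(f"P(inversion | margin < 1pt) ~ {inversions / max(close, 1):.2f}")
```

Conditioning on a 1-point margin means the winner-minus-loser gap is below one percentage point, i.e. the two-party share is within half a point of 50 percent, as encoded above.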


2021, Vol. 15 (1), pp. 280-288
Author(s): Mahdi Rezapour, Khaled Ksaibati

Background: Kernel-based methods have gained popularity because the distribution of model residuals might not follow any classical parametric distribution. These methods have been extended to estimate conditional densities, rather than conditional distributions, when the data incorporate both discrete and continuous attributes. The approach relies on choosing optimal smoothing parameters (bandwidths) for the various attributes. Thus, if an explanatory variable is independent of the dependent variable, the nonparametric method can effectively drop that attribute by assigning it a large smoothing parameter, smoothing it toward a uniform distribution so that its contribution to the model's variance is minimal.

Objectives: The objective of this study was to identify factors contributing to the severity of pedestrian crashes using an unbiased method. In particular, the study evaluated the applicability of semi- and nonparametric kernel-based techniques to the crash dataset by means of confusion matrices.

Methods: Two kernel-based methods, one nonparametric and one semi-parametric, were implemented to model the severity of pedestrian crashes. Estimation of the semi-parametric densities is based on adaptive local smoothing and maximisation of a quasi-likelihood function, which somewhat resembles the likelihood of the binary logit model. The nonparametric method, in contrast, selects the smoothing parameters of the conditional probability density estimate so as to minimise the mean integrated squared error (MISE). The performance of the models was evaluated by their predictive power, with standard logistic regression employed as a benchmark. Although these methods have been used in other fields, this is one of the earliest studies to apply them in the context of traffic safety.

Results: The results highlight that the nonparametric kernel-based method outperforms both the semi-parametric (single-index) model and the standard logit model based on the confusion matrices. To assess how the bandwidth selection method removes irrelevant attributes in the nonparametric approach, we added noisy predictors to the models and compared the results. The methodological approach of the models is discussed extensively in the study.

Conclusion: In summary, alcohol and drug involvement, driving on a non-level grade, and poor lighting conditions are some of the factors that increase the likelihood of severe pedestrian crashes. This is one of the earliest studies to implement these methods in the context of transportation problems. The nonparametric method is especially recommended in traffic safety applications where the importance of predictors is uncertain, as the technique automatically drops unimportant predictors.
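To make the mixed-attribute kernel idea concrete, here is a minimal numpy sketch of a Li-Racine-style estimator of a conditional probability with one continuous and one discrete predictor. The variable names and the fixed bandwidth values are hypothetical; the study instead selects bandwidths in a data-driven way (e.g., minimising MISE), which is what allows irrelevant attributes to be smoothed out automatically.

```python
# Minimal sketch of a mixed-kernel (Li-Racine style) estimate of
# P(severe | x) with one continuous and one discrete predictor.
# Variable names and fixed bandwidths are hypothetical; the paper uses
# data-driven bandwidth selection instead of the constants below.
import numpy as np

def gaussian_kernel(u):
    return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

def aitchison_aitken(xd, xd0, lam, n_levels):
    # Discrete kernel: weight 1-lam on a match, lam/(levels-1) otherwise.
    # lam near its upper bound makes the attribute irrelevant (uniform).
    return np.where(xd == xd0, 1 - lam, lam / (n_levels - 1))

def cond_prob(y, xc, xd, xc0, xd0, h=0.5, lam=0.1, n_levels=2):
    """Nadaraya-Watson estimate of P(y=1 | xc=xc0, xd=xd0)."""
    w = gaussian_kernel((xc - xc0) / h) * aitchison_aitken(xd, xd0, lam, n_levels)
    return np.sum(w * y) / np.sum(w)

# Synthetic crash-like data (purely illustrative)
rng = np.random.default_rng(1)
speed = rng.normal(50, 15, 500)          # continuous: vehicle speed
lighting = rng.integers(0, 2, 500)       # discrete: 0 = lit, 1 = dark
logit = 0.05 * (speed - 50) + 1.0 * lighting - 0.5
severe = (rng.random(500) < 1 / (1 + np.exp(-logit))).astype(float)

print(cond_prob(severe, speed, lighting, xc0=70, xd0=1))
```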


2021
Author(s): Na Li, Shenglian Guo, Feng Xiong, Jun Wang, Yuzuo Xie

Abstract: The coincidence of floods in a mainstream and its tributaries may lead to severe flooding in the downstream confluence area, so flood coincidence risk analysis is very important for flood prevention and disaster reduction. In this study, a multiple regression model was used to establish the functional relationship among flood magnitudes in the mainstream and its tributaries. The mixed von Mises distribution and the Pearson Type III distribution were selected to fit the probability distributions of the annual maximum flood occurrence dates and magnitudes, respectively. Joint distributions of the annual maximum flood occurrence dates and magnitudes were then established using copula functions. The Fuhe River in the Poyang Lake region was selected as a case study. The joint probability, co-occurrence probability, and conditional probability of flood magnitudes were quantitatively estimated and compared with the predicted flood coincidence risks. The results show that the selected marginal and joint distributions fit the observed flood dataset very well. The coincidence of flood occurrence dates in the upper mainstream and its tributaries mainly occurs from May to early July. The conditional probability is found to be the most consistent with the predicted flood coincidence risks in the mainstream and its tributaries, and is the more reliable and rational measure in practice.
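The three risk measures compared in the study can all be written directly in terms of a fitted copula. The sketch below uses a Gumbel copula with made-up parameter values; the paper's fitted marginals and copula parameters are not reproduced here.

```python
# Joint, co-occurrence, and conditional exceedance probabilities from a
# bivariate Gumbel copula. theta and the marginal non-exceedance levels are
# made-up illustration values, not the fitted parameters from the study.
import math

def gumbel_copula(u, v, theta):
    """C(u, v) = P(U <= u, V <= v) for a Gumbel copula, theta >= 1."""
    return math.exp(-((-math.log(u)) ** theta
                      + (-math.log(v)) ** theta) ** (1 / theta))

theta = 2.5      # dependence strength (hypothetical)
u, v = 0.9, 0.9  # non-exceedance probabilities of the two flood magnitudes

# P(X > x or Y > y): at least one flood exceeds its threshold ("joint")
p_joint = 1 - gumbel_copula(u, v, theta)
# P(X > x and Y > y): both exceed simultaneously ("co-occurrence")
p_cooccur = 1 - u - v + gumbel_copula(u, v, theta)
# P(Y > y | X > x): tributary flood given a mainstream flood ("conditional")
p_cond = p_cooccur / (1 - u)

print(f"joint={p_joint:.4f}  co-occurrence={p_cooccur:.4f}  "
      f"conditional={p_cond:.4f}")
```

Because the conditional probability rescales the co-occurrence probability by the probability of the conditioning flood, it answers the practical question a downstream forecaster faces once a mainstream flood is already observed.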


2021, Vol. 4, pp. 56-59
Author(s): Anna Salii

Sometimes in practice it is necessary to calculate the probability of an uncertain cause, taking into account some observed evidence. For example, we would like to know the probability of a particular disease when we observe the patient's symptoms. Such problems are often complex, with many interrelated variables: there may be many symptoms and even more potential causes. In practice, it is usually possible to obtain only the inverse conditional probability, the probability of the evidence given the cause, that is, the probability of observing the symptoms if the patient has the disease.

Intelligent systems must reason about their environment. For example, a robot needs to know about the possible outcomes of its actions, and a medical expert system needs to know which causes produce which consequences. Intelligent systems have come to use probabilistic methods to deal with the uncertainty of the real world. Instead of building a special system of probabilistic reasoning for each new program, we would like a common framework that allows probabilistic reasoning in any new program without rebuilding everything from scratch. This justifies the relevance of the developed genetic algorithm. Bayesian networks, which first appeared in the work of Judea Pearl and his colleagues in the late 1980s, offer just such an independent basis for plausible reasoning.

This article presents a genetic algorithm for learning the structure of a Bayesian network that searches the space of graphs using mutation and crossover operators. The algorithm can be used as a quick way to learn the structure of a Bayesian network with as few constraints as possible.
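The abstract names the operators but not their encoding, so the following is a minimal sketch of a genetic search over DAG adjacency matrices with a placeholder scoring function; the paper's actual encoding, operators, and score may differ.

```python
# Minimal sketch of a genetic algorithm searching over Bayesian-network
# structures encoded as boolean adjacency matrices. The fitness function is a
# placeholder; a real implementation would score structures against data
# (e.g., BIC or BDeu).
import numpy as np

rng = np.random.default_rng(0)
N = 5  # number of variables (hypothetical)

def is_dag(adj):
    """Kahn's algorithm: the graph is a DAG iff a topological sort uses all nodes."""
    indeg = adj.sum(axis=0).copy()
    queue = [i for i in range(N) if indeg[i] == 0]
    seen = 0
    while queue:
        u = queue.pop()
        seen += 1
        for v in np.nonzero(adj[u])[0]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(int(v))
    return seen == N

def mutate(adj, rate=0.08):
    child = adj.copy()
    flips = (rng.random((N, N)) < rate) & ~np.eye(N, dtype=bool)
    child[flips] ^= True                      # add/remove random edges
    return child if is_dag(child) else adj    # reject cyclic mutants

def crossover(a, b):
    rows = rng.random(N) < 0.5                # each node inherits one parent set
    child = np.where(rows[:, None], a, b)
    return child if is_dag(child) else a

def score(adj):
    # Placeholder fitness: prefer graphs with about 4 edges
    # (stand-in for a data-driven structure score such as BIC)
    return -abs(int(adj.sum()) - 4)

pop = [mutate(np.zeros((N, N), dtype=bool), rate=0.3) for _ in range(20)]
for _ in range(50):
    pop.sort(key=score, reverse=True)
    elite = pop[:10]
    children = []
    while len(children) < 10:
        i, j = rng.choice(10, size=2, replace=False)
        children.append(mutate(crossover(elite[i], elite[j])))
    pop = elite + children

best = max(pop, key=score)
print(best.astype(int))
```

Rejecting cyclic offspring, as done above, is one simple way to keep the search inside the space of valid DAGs; repair operators are a common alternative.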


2021, Vol. 2021, pp. 1-7
Author(s): Ali Labriji, Abdelkrim Bennar, Mostafa Rachik

The use of conditional probabilities has gained popularity in various fields such as medicine, finance, and image processing. This has occurred especially with the availability of large datasets that allow us to exploit the full potential of the available estimation algorithms. Nevertheless, such a large volume of data is often accompanied by a significant need for computational capacity and a correspondingly long computation time. In this article, we propose a low-cost estimation method: we first demonstrate analytically the convergence of our method to the desired probability, and then we perform a simulation to support our point.
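The abstract does not spell out the proposed estimator, so purely as a point of reference, here is the standard empirical (frequency) estimator of a conditional probability, whose convergence to P(A|B) follows from the law of large numbers.

```python
# Empirical (frequency) estimator of a conditional probability, shown as a
# generic reference point -- the paper's own low-cost estimator is not
# described in the abstract. P_hat(A|B) = #(A and B) / #(B) -> P(A|B) as the
# sample grows, by the law of large numbers.
import numpy as np

rng = np.random.default_rng(42)

def estimate_cond_prob(n):
    x = rng.normal(size=n)
    b = x > 0                 # conditioning event B
    a = x > 1                 # target event A (here A implies B)
    return a.sum() / b.sum()  # #(A and B) / #(B)

# True value: P(X > 1 | X > 0) = (1 - Phi(1)) / 0.5 ~ 0.3173
for n in (10**3, 10**4, 10**5, 10**6):
    print(n, estimate_cond_prob(n))
```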


2021
Author(s): Camilo E. Valderrama, Daniel J. Niven, Henry T. Stelfox, Joon Lee

BACKGROUND: Redundancy in laboratory blood tests is common in intensive care units (ICUs), affecting patients' health and increasing healthcare expenses. Medical communities have recommended ordering laboratory tests more judiciously. Wise selection can rely on modern data-driven approaches, which have been shown to help identify redundant laboratory blood tests in ICUs. However, most of this work has targeted highly selected clinical conditions such as gastrointestinal bleeding. Moreover, features based on conditional entropy and the conditional probability distribution have not been used to inform the need for performing a new test.

OBJECTIVE: We aimed to address the limitations of previous work by adapting conditional entropy and conditional probability to extract features for predicting abnormal laboratory blood test results.

METHODS: We used an ICU dataset collected across Alberta, Canada, which included 55,689 ICU admissions from 48,672 patients with various diagnoses. We investigated conditional entropy and conditional probability-based features by comparing the performance of two machine learning approaches in predicting normal and abnormal results for 18 laboratory blood tests. Approach 1 used patients' vital signs, age, sex, admission diagnosis, and other laboratory blood test results as features. Approach 2 used the same features plus the new conditional entropy and conditional probability-based features.

RESULTS: Across the 18 laboratory blood tests, both Approach 1 and Approach 2 achieved a median F1-score, AUC, precision-recall AUC, and G-mean above 80%. The inclusion of the new features statistically significantly improved the ability to predict abnormal laboratory blood test results for between 10 and 15 of the tests, depending on the machine learning model.

CONCLUSIONS: Our novel approach yields promising prediction results and can help reduce over-testing in ICUs, along with the associated risks for patients and healthcare systems.

CLINICALTRIAL: N/A
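As a rough illustration of the kind of feature described above, the sketch below computes the conditional entropy H(Y|X) from a contingency table of test results. The binning, variable meanings, and counts are hypothetical, not the paper's feature definitions.

```python
# Conditional entropy H(Y|X) from a contingency table of counts, as a rough
# illustration of an entropy-based feature. The table is hypothetical:
# rows = previous test result (normal/abnormal), cols = next result.
import numpy as np

counts = np.array([[80, 20],    # previous normal   -> next normal/abnormal
                   [25, 75]])   # previous abnormal -> next normal/abnormal

joint = counts / counts.sum()              # P(X, Y)
p_x = joint.sum(axis=1, keepdims=True)     # P(X)
cond = joint / p_x                         # P(Y | X)

# H(Y|X) = -sum_{x,y} P(x, y) * log2 P(y|x); safe here since all counts > 0
h_y_given_x = -np.sum(joint * np.log2(cond))
print(f"H(Y|X) = {h_y_given_x:.3f} bits")  # low entropy -> next result predictable
```

Intuitively, a low conditional entropy means the next result is nearly determined by what is already known, which is exactly when repeating the test adds little information.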

