dirichlet process prior
Recently Published Documents



2021 ◽  
Vol 13 (3) ◽  
pp. 75
Author(s):  
Yuexuan Zhao ◽  
Jing Huang

The graph variational auto-encoder (GVAE) is a model that combines neural networks and Bayesian methods and can explore the influential latent features of graph reconstruction in greater depth. However, several studies based on GVAE employ a plain prior distribution for the latent variables, for instance the standard normal distribution N(0,1). Although such a simple distribution is convenient to compute, it leaves the latent variables with relatively little helpful information. The lack of adequate node representations inevitably affects graph generation, so that only external relations are discovered and some complex internal correlations are neglected. In this paper, we present a novel prior distribution for GVAE, called the Dirichlet process (DP) construction for the Student’s t (St) distribution. The DP allows the latent variables to adapt their complexity during learning and works together with the heavy-tailed St distribution to obtain sufficient node representations. Experimental results show that this method performs better than the baselines.
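To make the idea of a DP prior with heavy-tailed components concrete, the following is a minimal sketch of a truncated stick-breaking construction whose atoms are Student's t locations. It is not the authors' model: the truncation level, the normal base measure for the locations, and the names `stick_breaking_weights`, `sample_student_t`, and `sample_latent` are all assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remainder
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component takes the leftover mass
    return weights


def sample_student_t(df, loc, scale, rng):
    """Draw from a Student's t via a normal / chi-square ratio."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return loc + scale * z / (chi2 / df) ** 0.5


def sample_latent(weights, locs, df, scale, rng):
    """Pick a mixture component, then draw a heavy-tailed latent value."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return k, sample_student_t(df, locs[k], scale, rng)


rng = random.Random(0)
weights = stick_breaking_weights(alpha=2.0, truncation=20, rng=rng)
# Hypothetical base measure: component locations drawn from N(0, 3^2).
locs = [rng.gauss(0.0, 3.0) for _ in weights]
draws = [sample_latent(weights, locs, df=3.0, scale=1.0, rng=rng) for _ in range(5)]
```

A smaller `alpha` concentrates mass on the first few sticks, so fewer components are effectively used; a larger `alpha` spreads mass over more components.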


2020 ◽  
pp. 096228022096563
Author(s):  
Bret Zeldow ◽  
James Flory ◽  
Alisa Stephens-Shields ◽  
Marsha Raebel ◽  
Jason A Roy

We develop a method to estimate subject-level trajectory functions from longitudinal data. The approach can be used for patient phenotyping, feature extraction, or, as in our motivating example, outcome identification, which refers to the process of identifying disease status through patient laboratory tests rather than through diagnosis codes or prescription information. We model the joint distribution of a continuous longitudinal outcome and baseline covariates using an enriched Dirichlet process prior. This joint model decomposes into (local) semiparametric linear mixed models for the outcome given the covariates and simple (local) marginals for the covariates. The nonparametric enriched Dirichlet process prior is placed on the regression and spline coefficients, the error variance, and the parameters governing the predictor space. This leads to clustering of patients based on their outcomes and covariates. We predict the outcome at unobserved time points for subjects with data at other time points as well as for new subjects with only baseline covariates. We find improved prediction over mixed models with Dirichlet process priors when there are a large number of covariates. Our method is demonstrated with electronic health records consisting of initiators of second-generation antipsychotic medications, which are known to increase the risk of diabetes. We use our model to predict laboratory values indicative of diabetes for each individual and assess incidence of suspected diabetes from the predicted dataset.


2020 ◽  
Author(s):  
Shai He ◽  
Aaron Schein ◽  
Vishal Sarsani ◽  
Patrick Flaherty

There are distinguishing features or “hallmarks” of cancer that are found across tumors, individuals, and types of cancer, and these hallmarks can be driven by specific genetic mutations. Yet, within a single tumor there is often extensive genetic heterogeneity, as evidenced by single-cell and bulk DNA sequencing data. The goal of this work is to jointly infer the underlying genotypes of tumor subpopulations and the distribution of those subpopulations in individual tumors by integrating single-cell and bulk sequencing data. Understanding the genetic composition of the tumor at the time of treatment is important in the personalized design of targeted therapeutic combinations and in monitoring for possible recurrence after treatment. We propose a hierarchical Dirichlet process mixture model that incorporates the correlation structure induced by a structured sampling arrangement, and we show that this model improves the quality of inference. We develop a representation of the hierarchical Dirichlet process prior as a Gamma-Poisson hierarchy and use this representation to derive a fast Gibbs sampling inference algorithm using the augment-and-marginalize method. Experiments with simulated data show that our model outperforms standard numerical and statistical methods for decomposing admixed count data. Analysis of a real acute lymphoblastic leukemia sequencing dataset shows that our model improves upon state-of-the-art bioinformatic methods. An interpretation of the results of our model on this real dataset reveals co-mutated loci across samples.
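The Gamma-Poisson representation mentioned above rests on a standard identity: independent Gamma variables, once normalized, follow a Dirichlet distribution, and Poisson counts with those Gamma rates behave like multinomial counts once their total is fixed. The sketch below illustrates only that generic identity, not the authors' sampler; the function names, the concentration values, and the `depth` scaling factor are assumptions.

```python
import math
import random


def dirichlet_via_gammas(concentrations, rng):
    """Return independent Gamma draws and their normalized (Dirichlet) form."""
    gammas = [rng.gammavariate(a, 1.0) for a in concentrations]
    total = sum(gammas)
    return gammas, [g / total for g in gammas]


def sample_poisson(rate, rng):
    """Knuth's multiplication method; adequate for small to moderate rates."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(1)
# Hypothetical setup: four subpopulations with equal prior concentration.
gammas, proportions = dirichlet_via_gammas([1.0, 1.0, 1.0, 1.0], rng)
depth = 20.0  # assumed scaling of the rates, standing in for sequencing depth
counts = [sample_poisson(depth * g, rng) for g in gammas]
```

Working with the unnormalized `gammas` instead of `proportions` is what makes the counts independent across components, which is the property that tends to simplify Gibbs updates.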


Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 948
Author(s):  
Stefano Cabras

The variable selection problem in general, and specifically for the ordinary linear regression model, is considered in the setup in which the number of covariates is large enough to prevent the exploration of all possible models. In this context, Gibbs sampling is needed to perform stochastic model exploration to estimate, for instance, the model inclusion probability. We show that, under a Bayesian non-parametric prior model for analyzing Gibbs-sampling output, the usual empirical estimator is just the asymptotic version of the expected posterior inclusion probability given the simulation output from Gibbs sampling. Other posterior conditional estimators of inclusion probabilities can also be considered as related to the latent probability distributions on the model space, which can be sampled given the observed Gibbs-sampling output. This paper also compares, in this large model space setup, the conventional prior approach against the non-local prior approach used to define the Bayes factors for model selection. The approach is illustrated with simulated samples and with an application to modeling travel and tourism factors worldwide.
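As a small illustration of the "usual empirical estimator" referred to above, the sketch below computes inclusion frequencies from a list of models visited by a sampler. The visited models are made-up toy data, and `empirical_inclusion_probabilities` is a hypothetical helper, not code from the paper.

```python
def empirical_inclusion_probabilities(visited_models):
    """Fraction of sampled models that include each covariate.

    Each model is a tuple of 0/1 indicators, one per covariate.
    """
    n_draws = len(visited_models)
    n_covariates = len(visited_models[0])
    return [
        sum(model[j] for model in visited_models) / n_draws
        for j in range(n_covariates)
    ]


# Toy Gibbs-sampling output over three covariates (illustrative only).
visited = [(1, 0, 1), (1, 1, 0), (1, 0, 0), (0, 0, 1)]
inclusion = empirical_inclusion_probabilities(visited)
# inclusion is [0.75, 0.25, 0.5]
```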


2020 ◽  
pp. 1471082X2093976
Author(s):  
Meredith A. Ray ◽  
Dale Bowman ◽  
Ryan Csontos ◽  
Roy B. Van Arsdale ◽  
Hongmei Zhang

Earthquakes are one of the deadliest natural disasters. Our study focuses on detecting temporal patterns of earthquakes occurring along intraplate faults in the New Madrid seismic zone (NMSZ) in the middle of the United States from 1996 to 2016. Based on the magnitude and location of each earthquake, we developed a Bayesian clustering method to group hypocentres such that each group shared the same temporal pattern of occurrence. We constructed a matrix-variate Dirichlet process prior to describe temporal trends in the space and to detect regions showing similar temporal patterns. Simulations were conducted to assess the accuracy and performance of the proposed method and to compare it with other commonly used clustering methods such as k-means, k-medians and partition-around-medoids. We applied the method to NMSZ data to identify clusters of temporal patterns, which represent areas of stress that are potentially migrating over time. This information can then be used to assist in the prediction of future earthquakes.
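The clustering behaviour that a Dirichlet process prior induces can be sketched with the Chinese restaurant process, in which each item joins an existing cluster with probability proportional to its size or opens a new one with probability proportional to the concentration parameter. This is the generic scalar construction, not the matrix-variate prior of the paper; the function name and parameter values are assumptions.

```python
import random


def chinese_restaurant_process(n_items, alpha, rng):
    """Sample a random partition of n_items under a DP with concentration alpha."""
    assignments = []
    cluster_sizes = []
    for _ in range(n_items):
        # Existing clusters are weighted by size; a new cluster by alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)  # open a new cluster
        else:
            cluster_sizes[k] += 1
        assignments.append(k)
    return assignments, cluster_sizes


rng = random.Random(2)
assignments, cluster_sizes = chinese_restaurant_process(n_items=50, alpha=1.5, rng=rng)
```

Unlike k-means or k-medians, the number of clusters is not fixed in advance here; it grows with the data at a rate controlled by `alpha`.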

