bayesian mixture model
Recently Published Documents


TOTAL DOCUMENTS

80
(FIVE YEARS 23)

H-INDEX

13
(FIVE YEARS 3)

2021 ◽  
Vol 15 (6) ◽  
pp. 1-21
Author(s):  
Huandong Wang ◽  
Yong Li ◽  
Mu Du ◽  
Zhenhui Li ◽  
Depeng Jin

Both app developers and service providers have strong motivations to understand when and where certain apps are used by users. However, it has been a challenging problem due to the highly skewed and noisy app usage data. Moreover, apps are regarded as independent items in existing studies, which fail to capture the hidden semantics in app usage traces. In this article, we propose App2Vec, a powerful representation learning model to learn the semantic embedding of apps with the consideration of spatio-temporal context. Based on the obtained semantic embeddings, we develop a probabilistic model based on the Bayesian mixture model and Dirichlet process to capture when, where, and what semantics of apps are used to predict the future usage. We evaluate our model using two different app usage datasets, which involve over 1.7 million users and 2,000+ apps. Evaluation results show that our proposed App2Vec algorithm outperforms the state-of-the-art algorithms in app usage prediction with a performance gap of over 17.0%.
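
The abstract above combines a Bayesian mixture model with a Dirichlet process, which lets the number of mixture components grow with the data. As a rough illustration of that idea only (this is not the App2Vec model), the sketch below draws cluster assignments from the Chinese restaurant process, a standard representation of the Dirichlet process prior. The function name, the concentration parameter `alpha`, and the `seed` argument are assumptions made for this example.

```python
import random


def chinese_restaurant_process(n_items, alpha, seed=0):
    """Draw cluster assignments from a Chinese restaurant process prior.

    n_items: number of observations to assign (hypothetical input).
    alpha:   concentration parameter; larger values favour more clusters.
    Returns (assignments, cluster_sizes).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    rng = random.Random(seed)
    assignments = []
    cluster_sizes = []
    for _ in range(n_items):
        # An existing cluster is chosen in proportion to its current size,
        # a brand-new cluster in proportion to alpha.
        weights = cluster_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(cluster_sizes):
            cluster_sizes.append(1)   # open a new cluster
        else:
            cluster_sizes[k] += 1     # join an existing cluster
        assignments.append(k)
    return assignments, cluster_sizes


assignments, sizes = chinese_restaurant_process(n_items=100, alpha=2.0)
print(len(sizes), "clusters for 100 items")
```

In a full Dirichlet process mixture, each cluster would also carry its own likelihood parameters and the assignments would be resampled given the data; the sketch shows only the prior over partitions.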


Author(s):  
Brina Miftahurrohmah ◽  
Catur Wulandari ◽  
Yogantara Setya Dharmawan

Background: Stock investment has been gaining momentum in the past years due to the development of technology. During the pandemic lockdown, people have invested more. On the one hand, stock investment has high potential profitability, but on the other, it is equally risky. Therefore, a value at risk (VaR) analysis is needed. One approach to calculate VaR is by using the Bayesian mixture model, which has been proven to be able to overcome heavy-tailed cases. Then, the VaR's accuracy needs to be tested, and one of the ways is by using backtesting, such as the Kupiec test. Objective: This study aims to determine the VaR model of PT NFC Indonesia Tbk (NFCX) return data using Bayesian mixture modelling and backtesting. On a practical level, this study can provide information about the potential risks of investing that is grounded in empirical evidence. Methods: The data used was NFCX data retrieved from Yahoo Finance, which was then modelled with a mixture model based on the normal and Laplace distributions. After that, the VaR accuracy was calculated and then tested by using backtesting. Results: The test results showed that the VaR with the mixture Laplace autoregressive (MLAR) approach (2;[2],[4]) was accurate at 5% and 1% quantiles, while the mixture normal autoregressive (MNAR) approach (2;[2],[2,4]) was only accurate at 5% quantiles. Conclusion: The better performing NFCX VaR model for this study, based on backtesting using the Kupiec test, is MLAR(2;[2],[4]).
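
The backtesting step mentioned above can be illustrated with the Kupiec proportion-of-failures test, which compares the observed share of VaR exceedances with the nominal tail probability through a likelihood-ratio statistic that is asymptotically chi-square with one degree of freedom. The following is a minimal sketch using only the Python standard library; the function name and the example counts are hypothetical and are not taken from the study.

```python
import math


def kupiec_pof(n_obs, n_exceed, p):
    """Kupiec proportion-of-failures likelihood-ratio test.

    n_obs:    number of days in the backtest window.
    n_exceed: number of days on which the loss exceeded the VaR forecast.
    p:        nominal tail probability of the VaR (e.g. 0.05 or 0.01).
    Returns (lr_statistic, p_value).
    """
    if n_obs <= 0 or not 0 <= n_exceed <= n_obs or not 0 < p < 1:
        raise ValueError("invalid backtest inputs")

    def loglik(q):
        # Binomial log-likelihood; terms with a zero count are skipped
        # so that 0 * log(0) is treated as 0.
        ll = 0.0
        if n_exceed > 0:
            ll += n_exceed * math.log(q)
        if n_obs - n_exceed > 0:
            ll += (n_obs - n_exceed) * math.log(1.0 - q)
        return ll

    p_hat = n_exceed / n_obs                        # observed exceedance rate
    lr = max(-2.0 * (loglik(p) - loglik(p_hat)), 0.0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


# Hypothetical counts: 12 exceedances in 250 days for a 5% VaR.
lr, p_value = kupiec_pof(n_obs=250, n_exceed=12, p=0.05)
print(f"LR = {lr:.3f}, p-value = {p_value:.3f}")
```

A large p-value means the observed exceedance rate is compatible with the nominal level, which is the sense in which a VaR model is called accurate at a given quantile.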


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Emily Roberts ◽  
Lili Zhao

In regression models, predictor variables with inherent ordering, such as ECOG performance status or novel biomarker expression levels, are commonly seen in medical settings. Statistically, it may be difficult to determine the functional form of an ordinal predictor variable. Often, such a variable is dichotomized based on whether it is above or below a certain cutoff. Other methods conveniently treat the ordinal predictor as a continuous variable and assume a linear relationship with the outcome. However, arbitrarily choosing a method may lead to inaccurate inference and treatment. In this paper, we propose a Bayesian mixture model to consider both dichotomous and linear forms for the variable. This allows for simultaneous assessment of the appropriate form of the predictor in regression models by considering the presence of a changepoint through the lens of a threshold detection problem. This method is applicable to continuous, binary, and survival outcomes, and it is easily amenable to penalized regression. We evaluate the proposed method using simulation studies and apply it to two real datasets. We provide JAGS code for easy implementation.


Author(s):  
Niko A. Kaciroti ◽  
Carey Lumeng ◽  
Vikas Parekh ◽  
Matthew L. Boulton

An outbreak of SARS-CoV-2 has led to a global pandemic affecting virtually every country. As of August 31, 2020, globally, there have been approximately 25,500,000 confirmed cases and 850,000 deaths; in the United States (50 states plus District of Columbia), there have been more than 6,000,000 confirmed cases and 183,000 deaths. We propose a Bayesian mixture model to predict and monitor COVID-19 mortality across the United States. The model captures skewed unimodal (prolonged recovery) or multimodal (multiple surges) curves. The results show that across all states, the first peak dates of mortality varied between April 4, 2020 for Alaska and June 18, 2020 for Arkansas. As of August 31, 2020, 31 states had a clear bimodal curve showing a strong second surge. The peak date for a second surge ranged from July 1, 2020 for Virginia to September 12, 2020 for Hawaii. The first peak for the United States occurred about April 16, 2020—dominated by New York and New Jersey—and a second peak on August 6, 2020—dominated by California, Texas, and Florida. Reliable models for predicting the COVID-19 pandemic are essential to informing resource allocation and intervention strategies. A Bayesian mixture model was able to more accurately predict the shape of the mortality curves across the United States than other models, including the timing of multiple peaks. However, given the dynamic nature of the pandemic, it is important that the results be updated regularly to identify and better monitor future waves, and characterize the epidemiology of the pandemic.
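
To make the unimodal versus multimodal distinction above concrete, the sketch below builds a curve as a weighted sum of bell-shaped components and counts its local maxima. It is an illustration only: the abstract does not specify the component distributions, so symmetric Gaussian bumps are used here as a stand-in, and the weights, peak days, and spreads are invented for the example.

```python
import math


def mixture_curve(day, components):
    """Value of a mixture curve on a given day.

    components: list of (weight, peak_day, spread) tuples, all hypothetical.
    """
    total = 0.0
    for weight, peak_day, spread in components:
        z = (day - peak_day) / spread
        total += weight * math.exp(-0.5 * z * z) / (spread * math.sqrt(2.0 * math.pi))
    return total


def find_peaks(values):
    """Indices that are strictly higher than both neighbours."""
    return [i for i in range(1, len(values) - 1)
            if values[i - 1] < values[i] > values[i + 1]]


# Two well-separated surges give a bimodal curve.
two_waves = [(0.6, 45, 12), (0.4, 130, 15)]
curve = [mixture_curve(day, two_waves) for day in range(200)]
peaks = find_peaks(curve)
print("peak days:", peaks)
```

When the components overlap heavily the local maxima merge, which is how a single mixture family can represent both a prolonged single wave and several distinct surges.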


2021 ◽  
Author(s):  
Sierra N. Merkes ◽  
Scotland Leman ◽  
Eric Smith ◽  
Aaron Defreitas ◽  
William N. Alexander ◽  
...  

2020 ◽  
Vol 117 (32) ◽  
pp. 19339-19346 ◽  
Author(s):  
Ammon Thompson ◽  
Michael R. May ◽  
Brian R. Moore ◽  
Artyom Kopp

Transcriptomes are key to understanding the relationship between genotype and phenotype. The ability to infer the expression state (active or inactive) of genes in the transcriptome offers unique benefits for addressing this issue. For example, qualitative changes in gene expression may underlie the origin of novel phenotypes, and expression states are readily comparable between tissues and species. However, inferring the expression state of genes is a surprisingly difficult problem, owing to the complex biological and technical processes that give rise to observed transcriptomic datasets. Here, we develop a hierarchical Bayesian mixture model that describes this complex process and allows us to infer expression state of genes from replicate transcriptomic libraries. We explore the statistical behavior of this method with analyses of simulated datasets, where we demonstrate its ability to correctly infer true (known) expression states, and of empirical-benchmark datasets, where we demonstrate that the expression states inferred from RNA-sequencing (RNA-seq) datasets using our method are consistent with those based on independent evidence. The power of our method to correctly infer expression states is generally high and, remarkably, approaches the maximum possible power for this inference problem. We present an empirical analysis of primate-brain transcriptomes, which identifies genes that have a unique expression state in humans. Our method is implemented in the freely available R package zigzag.
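
The core idea of classifying a gene as active or inactive with a mixture model can be sketched with Bayes' rule on a two-component mixture. The example below is a simplified illustration and not the hierarchical model implemented in zigzag: it assumes Gaussian components on log expression, and the component means, standard deviations, and mixing weight are made-up values.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def prob_active(log_expr, weight_active, inactive=(-4.0, 1.0), active=(2.0, 1.5)):
    """Posterior probability that a gene belongs to the active component.

    log_expr:      observed log expression level (hypothetical scale).
    weight_active: prior mixing weight of the active component.
    inactive, active: (mean, sd) of each component, assumed for illustration.
    """
    num = weight_active * normal_pdf(log_expr, *active)
    den = num + (1.0 - weight_active) * normal_pdf(log_expr, *inactive)
    return num / den


for value in (-5.0, -1.0, 3.0):
    print(value, round(prob_active(value, weight_active=0.5), 4))
```

Values near the overlap of the two components receive intermediate probabilities, which is where information shared across replicate libraries would matter most in a hierarchical version.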

