mixture model
Recently Published Documents





2022 ◽  
Stephen Coleman ◽  
Xaquin Castro Dopico ◽  
Gunilla B Karlsson Hedestam ◽  
Paul DW Kirk ◽  
Chris Wallace

Systematic differences between batches of samples present significant challenges when analysing biological data. Such batch effects are well studied and are liable to occur in any setting where multiple batches are assayed. Many existing methods for accounting for them have focused on high-dimensional data such as RNA-seq and make assumptions that reflect this. Here we focus on batch correction in low-dimensional classification problems. We propose a semi-supervised Bayesian generative classifier based on mixture models that jointly predicts class labels and models batch effects. Our model allows observations to be probabilistically assigned to classes in a way that incorporates uncertainty arising from batch effects. We explore two choices for the within-class densities: the multivariate normal (MVN) and the multivariate t (MVT). A simulation study demonstrates that our method performs well compared to popular off-the-shelf machine learning methods and is also fast: 15,000 iterations on a dataset of 500 samples with 2 measurements each take 7.3 seconds for the MVN mixture model and 11.9 seconds for the MVT mixture model. We apply our model to two datasets generated using the enzyme-linked immunosorbent assay (ELISA), a spectrophotometric assay often used to screen for antibodies. The examples we consider were collected in 2020 and measure seropositivity for SARS-CoV-2. We use our model to estimate seroprevalence in the populations studied. We implement the models in C++ using a Metropolis-within-Gibbs algorithm; the implementation is available as an R package at https://github.com/stcolema/BatchMixtureModel. Scripts to recreate our analysis are at https://github.com/stcolema/BatchClassifierPaper.
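
As a rough illustration of how a batch effect can change a probabilistic class assignment, the sketch below computes class probabilities for one observation under a Gaussian mixture. It assumes independent dimensions and a purely additive batch shift on each class mean; that form, the function names, and all parameter values are hypothetical choices for illustration and are not taken from the BatchMixtureModel package or the paper's parameterization.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def class_probabilities(x, batch, weights, class_means, class_vars, batch_shifts):
    """Posterior class probabilities for one observation.

    x            : list of measurements, one per dimension
    batch        : index of the batch the observation came from
    weights      : mixing proportion of each class
    class_means  : class_means[k][d], mean of class k in dimension d
    class_vars   : class_vars[k][d], variance of class k in dimension d
    batch_shifts : batch_shifts[b][d], additive shift of batch b (assumed form)
    """
    log_post = []
    for k, w in enumerate(weights):
        lp = math.log(w)
        for d, x_d in enumerate(x):
            # The batch moves the class mean; dimensions are treated as independent.
            shifted_mean = class_means[k][d] + batch_shifts[batch][d]
            lp += log_normal_pdf(x_d, shifted_mean, class_vars[k][d])
        log_post.append(lp)
    # Normalise with the log-sum-exp trick for numerical stability.
    m = max(log_post)
    total = sum(math.exp(lp - m) for lp in log_post)
    return [math.exp(lp - m) / total for lp in log_post]


# Hypothetical parameters: two classes, two measurements, two batches.
weights = [0.5, 0.5]
class_means = [[0.0, 0.0], [2.0, 2.0]]
class_vars = [[1.0, 1.0], [1.0, 1.0]]
batch_shifts = [[0.0, 0.0], [1.0, 1.0]]

x = [1.0, 1.0]
p_batch0 = class_probabilities(x, 0, weights, class_means, class_vars, batch_shifts)
p_batch1 = class_probabilities(x, 1, weights, class_means, class_vars, batch_shifts)
print(p_batch0, p_batch1)
```

The same measurement is ambiguous if it came from the unshifted batch but is assigned almost entirely to the first class if it came from the shifted batch, which is the kind of batch-dependent uncertainty the abstract describes. A full treatment would sample the class and batch parameters rather than fix them.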

2022 ◽  
Vol 31 ◽  
pp. 570
Benjamin Kane ◽  
Will Gantt ◽  
Aaron Steven White

We investigate which patterns of lexically triggered doxastic, bouletic, neg(ation)-raising, and veridicality inferences are (un)attested across clause-embedding verbs in English. To carry out this investigation, we use a multiview mixed effects mixture model to discover the inference patterns captured in three lexicon-scale inference judgment datasets: two existing datasets, MegaVeridicality and MegaNegRaising, which capture veridicality and neg-raising inferences across a wide swath of the English clause-embedding lexicon, and a new dataset, MegaIntensionality, which similarly captures doxastic and bouletic inferences. We focus in particular on inference patterns that are correlated with morphosyntactic distribution, as determined by how well those patterns predict the acceptability judgments in the MegaAcceptability dataset. We find that there are 15 such patterns attested. Similarities among these patterns suggest the possibility of underlying lexical semantic components that give rise to them. We use principal component analysis to discover these components and suggest generalizations that can be derived from them.

2022 ◽  
Xiaodong Zhang ◽  
Anand Natarajan

Uncertainty quantification is a necessary step in wind turbine design because the environmental loads are random; it allows the uncertainty of structural loads and responses under specific situations to be quantified. In particular, wind turbulence has a significant impact on the extreme and fatigue design envelope of the wind turbine. The wind parameters (mean and standard deviation of 10-minute wind speed) are usually not independent, and assuming that they are independent leads to biased results in structural reliability analysis or uncertainty quantification. A proper probabilistic model should therefore be established to capture the correlation among wind parameters. Compared to univariate distributions, theoretical multivariate distributions are limited in number and not flexible enough to model the wind parameters from different sites or direction sectors. Copula-based models are often used to describe correlation, but existing parametric copulas may not model the correlation among wind parameters well because of limitations of the copula structures. The Gaussian mixture model is widely applied for density estimation and clustering in many domains, but few studies have applied it in wind energy, and fewer still have used it for density estimation of wind parameters. In this paper, the Gaussian mixture model is used to model the joint distribution of the mean and standard deviation of 10-minute wind speed, which are calculated from 15 years of wind measurement time series data. The Nataf transformation (Gaussian copula) and the Gumbel copula are compared with the Gaussian mixture model in terms of the estimated marginal distributions and conditional distributions. The Gaussian mixture model is then adopted to estimate the extreme wind turbulence, which could be taken as an input to the design loads used in the ultimate design limit state of turbine structures. The wind turbulence associated with a 50-year return period computed from the Gaussian mixture model is compared with the value used in the design of wind turbines as given in IEC 61400-1.
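
To make the idea of a conditional distribution from a Gaussian mixture concrete, the sketch below conditions a two-component bivariate mixture over (mean wind speed u, standard deviation sigma) on a given u, using the standard conditioning formulas for bivariate normals. The component parameters are hypothetical placeholders, not values fitted to the measurements described above, and the function names are illustrative.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def conditional_sigma_given_u(u, components):
    """Condition a 2-D Gaussian mixture over (u, sigma) on the mean wind speed u.

    Each component is (weight, mean_u, mean_s, var_u, var_s, cov_us).
    Returns a list of (weight, mean, var) describing the 1-D mixture of sigma | u.
    """
    unnormalised = []
    for w, mu_u, mu_s, var_u, var_s, cov in components:
        # New weight: prior weight times the component's marginal density at u.
        marginal = w * normal_pdf(u, mu_u, var_u)
        # Standard bivariate-normal conditioning for the mean and variance.
        cond_mean = mu_s + cov / var_u * (u - mu_u)
        cond_var = var_s - cov ** 2 / var_u
        unnormalised.append((marginal, cond_mean, cond_var))
    total = sum(m for m, _, _ in unnormalised)
    return [(m / total, mean, var) for m, mean, var in unnormalised]


def mixture_mean(mixture):
    """Mean of a 1-D Gaussian mixture given as (weight, mean, var) triples."""
    return sum(w * mean for w, mean, _ in mixture)


# Hypothetical components (units: m/s); not fitted to any real site.
components = [
    (0.6, 6.0, 0.9, 4.0, 0.09, 0.3),
    (0.4, 12.0, 1.8, 9.0, 0.25, 0.9),
]

cond = conditional_sigma_given_u(10.0, components)
expected_sigma = mixture_mean(cond)
print(cond, expected_sigma)
```

Because the conditional is again a Gaussian mixture, quantiles of sigma given u can be obtained by numerically inverting its cumulative distribution; how the paper derives the 50-year turbulence value is not specified here.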

IEEE Access ◽  
2022 ◽  
pp. 1-1
Shahaf E. Finder ◽  
Eran Treister ◽  
Oren Freifeld

2022 ◽  
Vol 32 (1) ◽  
pp. 361-375
S. Markkandan ◽  
S. Sivasubramanian ◽  
Jaison Mulerikkal ◽  
Nazeer Shaik ◽  
Beulah Jackson ◽  
