finite mixture model
Recently Published Documents


TOTAL DOCUMENTS

167
(FIVE YEARS 39)

H-INDEX

18
(FIVE YEARS 2)

Entropy ◽  
2021 ◽  
Vol 23 (11) ◽  
pp. 1503
Author(s):  
Shunki Kyoya ◽  
Kenji Yamanishi

Finite mixture models are widely used for modeling and clustering data. When they are used for clustering, they are often interpreted by regarding each component as one cluster. However, this assumption may be invalid when the components overlap. It leads to the issue of analyzing such overlaps to correctly understand the models. The primary purpose of this paper is to establish a theoretical framework for interpreting the overlapping mixture models by estimating how they overlap, using measures of information such as entropy and mutual information. This is achieved by merging components to regard multiple components as one cluster and summarizing the merging results. First, we propose three conditions that any merging criterion should satisfy. Then, we investigate whether several existing merging criteria satisfy the conditions and modify them to fulfill more conditions. Second, we propose a novel concept named clustering summarization to evaluate the merging results. In it, we can quantify how overlapped and biased the clusters are, using mutual information-based criteria. Using artificial and real datasets, we empirically demonstrate that our methods of modifying criteria and summarizing results are effective for understanding the cluster structures. We therefore give a new view of interpretability/explainability for model-based clustering.
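
As a rough illustration of why overlapping components complicate the "one component, one cluster" reading, the sketch below computes the entropy of the posterior membership probabilities for a two-component Gaussian mixture. It is not the merging criterion or the clustering summarization proposed in the paper; the function names, the example data points, and the one-dimensional Gaussian components are all assumptions made for this illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def responsibilities(x, weights, means, sds):
    """Posterior probability that x was generated by each component."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(joint)
    return [j / total for j in joint]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def mean_assignment_entropy(data, weights, means, sds):
    """Average membership entropy over the data; larger means more overlap."""
    per_point = [entropy(responsibilities(x, weights, means, sds)) for x in data]
    return sum(per_point) / len(data)

# Hypothetical evaluation points, chosen only for illustration.
points = [-1.0, 0.0, 1.0, 9.0, 10.0, 11.0]

# Components centred at 0 and 1 overlap heavily; those at 0 and 10 barely do.
overlapping = mean_assignment_entropy(points, [0.5, 0.5], [0.0, 1.0], [1.0, 1.0])
separated = mean_assignment_entropy(points, [0.5, 0.5], [0.0, 10.0], [1.0, 1.0])
print(overlapping, separated)
```

When the average entropy is close to zero, each observation belongs almost entirely to one component, so treating components as clusters is reasonable. Larger values suggest that some components may describe the same cluster.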


Author(s):  
Seuk Yen Phoong ◽  
Seuk Wai Phoong

The mixture model is a form of model-based clustering used to model a mixture of unknown distributions. Clustering with a mixture model rests on four important criteria: the number of components in the mixture model, the clustering kernel (such as Gaussian mixture models, Dirichlet, etc.), the estimation method, and the dimensionality (Lai et al., 2019). The finite mixture model is a finite-dimensional hierarchical model. It is useful for modeling data that contain outliers, are not normally distributed, or have heavy tails. Furthermore, the finite mixture model is flexible when fitted to data that have multiple modes or a skewed distribution. This flexibility comes from the number of parameters, which grows with the number of components. The finite mixture model is a flexible model family that is widely applied to large heterogeneous datasets. In addition, it is a probabilistic model used to examine the presence of unobserved situations or groups and to estimate their distinct parameters or distributions. Situations such as trend, seasonality, crisis periods, and normal periods might affect the number of components needed for a probabilistic distribution. Furthermore, the finite mixture model is essential for time series data because these data exhibit nonlinearity and may have missing values or jump-diffusion behavior (Gensler, 2017; McLachlan and Lee, 2019). Keywords: Bayesian method; Finite Mixture Model; Maximum Likelihood Estimation; Prior distribution; Likelihood Function.
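
For reference, the standard form of a finite mixture density can be written as follows. The notation is an assumption introduced here and does not come from the abstract: $K$ is the number of components, $\pi_k$ are the mixing weights, and $f_k(\cdot \mid \theta_k)$ is the kernel of component $k$ with parameters $\theta_k$.

```latex
f(x \mid \pi, \theta) = \sum_{k=1}^{K} \pi_k \, f_k(x \mid \theta_k),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1 .
```

With Gaussian kernels, each $\theta_k$ consists of a mean and a variance (or covariance matrix), so the number of free parameters grows with $K$.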


2021 ◽  
Vol 58 (3) ◽  
pp. 794-804
Author(s):  
Ebrahim Amini-Seresht ◽  
Narayanaswamy Balakrishnan

In this paper we consider a new generalized finite mixture model formed by dependent and identically distributed (d.i.d.) components. We then establish results for the comparisons of lifetimes of two such generalized finite mixture models in two different cases: (i) when the two mixture models are formed from two random vectors $\textbf{X}$ and $\textbf{Y}$ but with the same weights, and (ii) when the two mixture models are formed with the same random vectors but with different weights. Because the lifetimes of k-out-of-n systems and coherent systems are special cases of the mixture model considered, we use the established results to compare the lifetimes of k-out-of-n systems and coherent systems with respect to the reversed hazard rate and hazard rate orderings.
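
The two orderings named above have standard textbook definitions, restated here as a reminder. The symbols are assumptions introduced for this note and are not taken from the paper: $S$ and $T$ are nonnegative lifetimes with distribution functions $F$ and $G$, survival functions $\bar F = 1 - F$ and $\bar G = 1 - G$, and densities $f$ and $g$.

```latex
% Hazard rate and reversed hazard rate of S
h_S(t) = \frac{f(t)}{\bar F(t)}, \qquad \tilde h_S(t) = \frac{f(t)}{F(t)} .

% Hazard rate order
S \le_{\mathrm{hr}} T \iff \frac{\bar G(t)}{\bar F(t)} \text{ is nondecreasing in } t .

% Reversed hazard rate order
S \le_{\mathrm{rh}} T \iff \frac{G(t)}{F(t)} \text{ is nondecreasing in } t .
```

Both orderings imply the usual stochastic order, $\bar F(t) \le \bar G(t)$ for all $t$.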


Signals ◽  
2021 ◽  
Vol 2 (3) ◽  
pp. 527-539
Author(s):  
Mahdi Rezapour ◽  
Khaled Ksaibati

Vulnerable traffic users, such as bikers and pedestrians, account for a significant number of fatalities on the roadways. Extensive research has been conducted in the literature to identify factors contributing to those crashes. Studying these factors is especially important in the Western state in the US, which has one of the highest fatality rates in the nation and unique geographic conditions. The first step in identifying factors in the severity of cyclist crashes is to find the underlying factors of that type of crash while accounting for the heterogeneity in the dataset. Various techniques, such as mixed parameter or mixed effect models, have been employed in the literature to account for the heterogeneity of the dataset. In the mixed effect model, the random effect parameter has often been assigned subjectively, based on some attributes and engineering intuition. Those assignments are expected to account for the heterogeneity in the dataset and to enhance the model fit. However, a question might arise as to whether those factors account for an optimum amount of the heterogeneity in the dataset. A more reasonable way might be to let an algorithm such as the finite mixture model (FMM) identify those clusters based on the parameters of the Gaussian model, that is, the means and covariance matrices of the dataset, and allocate each observation to its related cluster. Thus, in this study, to capture an optimum amount of heterogeneity, we first implemented the finite mixture model in the context of maximum likelihood, due to the label-switching issue of the method in the Bayesian context. After the parameters were assigned to the observations, the main method, Hamiltonian Monte Carlo (HMC) with random effect, was implemented. The results highlighted a significant improvement in the model fit in terms of the Widely Applicable Information Criterion (WAIC).
The results of this study highlighted that factors such as older biker age, an increased number of lanes, nighttime travelling, an increased posted speed limit, and driving under emotional conditions contribute to increased severity of bikers’ crashes. The methodological algorithms and the estimation of model parameters are discussed extensively.
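
The first stage described above, fitting a finite mixture by maximum likelihood and then allocating each observation to a cluster, can be sketched with a small expectation-maximization (EM) loop. This is a minimal illustration and not the authors' implementation: it assumes one-dimensional data, exactly two Gaussian components, and synthetic observations, and the function name `fit_two_component_em` is hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_component_em(data, n_iter=100):
    """Maximum likelihood fit of a two-component Gaussian mixture via EM."""
    n = len(data)
    grand_mean = sum(data) / n
    start_sd = math.sqrt(sum((x - grand_mean) ** 2 for x in data) / n)
    # Crude initialisation: means at the data extremes, shared spread.
    weights, means, sds = [0.5, 0.5], [min(data), max(data)], [start_sd, start_sd]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership probabilities under current parameters.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, sds)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: weighted updates of weights, means, and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            sds[k] = max(math.sqrt(var), 1e-6)  # guard against a collapsing component
    return weights, means, sds, resp

# Synthetic data standing in for real observations (an assumption).
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(200)]
data += [random.gauss(10.0, 1.0) for _ in range(200)]

weights, means, sds, resp = fit_two_component_em(data)
# Hard assignment: each observation goes to its most probable component.
labels = [0 if r[0] >= r[1] else 1 for r in resp]
print(weights, means, sds)
```

The resulting `labels` could then serve as the grouping variable for a random-effect model in a second stage. Fitting that second-stage model with HMC is outside the scope of this sketch.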


Risks ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 76
Author(s):  
Jackie Li ◽  
Atsuyuki Kogure

Although a large number of mortality projection models have been proposed in the literature, relatively little attention has been paid to a formal assessment of the effect of model uncertainty. In this paper, we construct a Bayesian framework for embedding more than one mortality projection model and utilise the finite mixture model concept to allow for the blending of model structures. Under this framework, the varying features of different model structures can be exploited jointly and coherently to have a more detailed description of the underlying mortality patterns. We show that the proposed Bayesian approach performs well in fitting and forecasting Japanese mortality.


Signals ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 41-52
Author(s):  
Mahdi Rezapour ◽  
Khaled Ksaibati

Various techniques have been proposed in the literature to account for the observed and unobserved heterogeneity in crash datasets. These include the finite mixture model (FMM) and hierarchical techniques. The FMM can provide a flexible framework by allowing different distributions for different individual observations. However, the shortcoming of the standard FMM is that it cannot account for the heterogeneity within a single model structure, so the data need to be disaggregated into the resulting subsamples. That results in a loss of information. A second plausible approach is to use a hierarchical technique to account for the data heterogeneity, with the hierarchy based on various explanatory variables and on engineering intuition. In the context of traffic safety, for instance, some researchers considered seasonality, while others considered highway systems or even gender. However, questions might arise: are the observations within the same hierarchy homogeneous? Are all the observations within different clusters heterogeneous? Additionally, what about other variables? Although the results in the literature highlighted that accounting for the structure of the dataset results in an acceptable intraclass correlation (ICC) and in a significant reduction in the deviance information criterion (DIC), there is no justification for using those specific hierarchies and rejecting others. A more reasonable approach is to let the algorithm come up with the best distributions based on the provided parameters and allocate observations to the related mixtures. In that approach, observations that belong to different subjective hierarchies, e.g., winter versus summer, but are found to be similar would be placed in the same cluster. That is why we proposed this methodology, which uses the FMM to build an objective hierarchy for the hierarchical technique.
Here, due to the label-switching problem of the FMM in the Bayesian context, the FMM was first fitted by maximum likelihood estimation, and the assigned observations were then used for the final analysis. The DIC results highlighted a significant improvement in the model fit compared with a subjectively assigned hierarchy based on highway system. Additionally, although the subjective model resulted in a very low ICC due to the high heterogeneity in the dataset, the implemented methodology resulted in an acceptable ICC (0.3), justifying the use of a hierarchy. This is one of the earliest applications of the Bayesian hierarchical finite mixture model (BHFMM) in traffic safety studies. The findings of this study have important implications for future studies that aim to account for more of the heterogeneity in crash datasets based on the distance of observations to each cluster.
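
To make the ICC figure above more concrete, the sketch below computes a one-way ANOVA estimate of the intraclass correlation for equally sized groups. This is a swapped-in classical estimator chosen for illustration, not the Bayesian variance-component estimate used in the study; the function name `icc_oneway` and the toy groups are assumptions.

```python
def icc_oneway(groups):
    """One-way ANOVA estimate of the ICC for equally sized groups.

    A value near 1 means observations within a group are very similar
    relative to the differences between groups.
    """
    k = len(groups[0])  # observations per group
    g = len(groups)     # number of groups
    if any(len(group) != k for group in groups):
        raise ValueError("this sketch assumes equally sized groups")
    grand_mean = sum(sum(group) for group in groups) / (g * k)
    group_means = [sum(group) / k for group in groups]
    # Mean square between groups.
    msb = k * sum((m - grand_mean) ** 2 for m in group_means) / (g - 1)
    # Mean square within groups.
    msw = sum(
        (x - m) ** 2 for group, m in zip(groups, group_means) for x in group
    ) / (g * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy groups: identical values within each group, different across groups.
tight_groups = [[1.0, 1.0, 1.0], [5.0, 5.0, 5.0]]
# Toy groups: same spread in each group and identical group means.
loose_groups = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]]
print(icc_oneway(tight_groups), icc_oneway(loose_groups))
```

A grouping that yields a higher ICC explains more of the total variation, which is the sense in which a data-driven hierarchy can justify a hierarchical model better than a subjective one.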


2021 ◽  
Vol 67 (1) ◽  
pp. 171-185
Author(s):  
Mohammad Masroor Ahmed ◽  
Saleh Al Shehri ◽  
Jawad Osman Arshed ◽  
Mahmood Ul Hassan ◽  
Muzammil Hussain ◽  
...  
