Learning Plackett-Luce Mixtures from Partial Preferences

Author(s):  
Ao Liu ◽  
Zhibing Zhao ◽  
Chao Liao ◽  
Pinyan Lu ◽  
Lirong Xia

We propose an EM-based framework for learning the Plackett-Luce model and its mixtures from partial orders. The core of our framework is the efficient sampling of linear extensions of partial orders under the Plackett-Luce model. We propose two Markov chain Monte Carlo (MCMC) samplers, a Gibbs sampler and the generalized repeated insertion method tuned by MCMC (GRIM-MCMC), and prove the efficiency of GRIM-MCMC for a large class of preferences. Experiments on synthetic data show that the algorithm with the Gibbs sampler outperforms the one with GRIM-MCMC. Experiments on real-world data show that the likelihood of the test dataset increases when (i) partial orders provide more information, or (ii) the number of components in the mixture of Plackett-Luce models increases.
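
The paper's contribution is sampling linear extensions of a partial order; as a minimal sketch of the unconstrained Plackett-Luce sampling step that such samplers build on (the utility vector `gamma` is hypothetical, and the partial-order constraint the paper actually handles is omitted here):

```python
import numpy as np

def sample_pl_ranking(gamma, rng):
    """Draw one full ranking from a Plackett-Luce model: positions are
    filled one by one, choosing each remaining item with probability
    proportional to its utility."""
    remaining = list(range(len(gamma)))
    ranking = []
    while remaining:
        w = gamma[remaining]
        pick = rng.choice(len(remaining), p=w / w.sum())
        ranking.append(remaining.pop(pick))
    return ranking

rng = np.random.default_rng(0)
gamma = np.array([4.0, 2.0, 1.0, 0.5])  # hypothetical item utilities
print(sample_pl_ranking(gamma, rng))
```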

2021 ◽  
Vol 15 (4) ◽  
pp. 1-46
Author(s):  
Kui Yu ◽  
Lin Liu ◽  
Jiuyong Li

In this article, we aim to develop a unified view of causal and non-causal feature selection methods. The unified view fills a gap in research on the relation between the two types of methods. Based on the Bayesian network framework and information theory, we first show that causal and non-causal feature selection methods share the same objective: to find the Markov blanket of a class attribute, the theoretically optimal feature set for classification. We then examine the assumptions made by causal and non-causal feature selection methods when searching for the optimal feature set, and unify the assumptions by mapping them to restrictions on the structure of the Bayesian network model of the studied problem. We further analyze in detail how the structural assumptions lead to the different levels of approximation employed by the methods in their search, which in turn result in approximations in the feature sets the methods find with respect to the optimal feature set. With the unified view, we can interpret the output of non-causal methods from a causal perspective and derive error bounds for both types of methods. Finally, we present a practical understanding of the relation between causal and non-causal methods using extensive experiments with synthetic data and various types of real-world data.
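
The article analyzes Markov blanket discovery abstractly rather than prescribing an implementation; purely as a minimal sketch of one such causal feature selection routine, a grow-shrink style search with a Gaussian (partial-correlation) conditional-independence test:

```python
import numpy as np
from scipy import stats

def ci_pvalue(x, y, Z):
    """Test x independent of y given Z via the correlation of
    linear-regression residuals (a standard Gaussian CI test)."""
    def resid(v):
        if Z.shape[1] == 0:
            return v - v.mean()
        A = np.column_stack([Z, np.ones(len(v))])
        beta, *_ = np.linalg.lstsq(A, v, rcond=None)
        return v - A @ beta
    _, p = stats.pearsonr(resid(x), resid(y))
    return p

def grow_shrink_mb(X, y, alpha=0.05):
    """Grow-shrink style search for the Markov blanket of y."""
    n, d = X.shape
    mb, changed = [], True
    while changed:                       # grow: add dependent variables
        changed = False
        for j in range(d):
            if j not in mb and ci_pvalue(X[:, j], y, X[:, mb]) < alpha:
                mb.append(j); changed = True
    for j in list(mb):                   # shrink: drop false positives
        rest = [k for k in mb if k != j]
        if ci_pvalue(X[:, j], y, X[:, rest]) >= alpha:
            mb.remove(j)
    return mb

rng = np.random.default_rng(1)
x1, x2, noise = (rng.normal(size=500) for _ in range(3))
y = x1 + 0.5 * x2 + 0.1 * rng.normal(size=500)   # x1, x2 are parents of y
x3 = y + 0.1 * rng.normal(size=500)              # x3 is a child of y
X = np.column_stack([x1, x2, x3, noise])
print(grow_shrink_mb(X, y))                      # expect [0, 1, 2]
```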


Order ◽  
2017 ◽  
Vol 35 (3) ◽  
pp. 403-420 ◽  
Author(s):  
Colin McDiarmid ◽  
David Penman ◽  
Vasileios Iliopoulos

2021 ◽  
Author(s):  
Timothy Sherry

An online convolutive blind source separation solution has been developed for use in reverberant environments with stationary sources. Results are presented for simulation and real-world data. The system achieves a separation SINR of 16.8 dB when operating on a two-source mixture with a total acoustic delay of 270 ms. This is on par with, and in many respects outperforms, various published algorithms [1], [2]. A number of instantaneous blind source separation algorithms have also been developed, including block-wise and recursive ICA algorithms and a clustering-based algorithm, able to obtain up to 110 dB SIR performance. The system has been realised in both Matlab and C, and is modular, allowing for easy replacement of the ICA algorithm at the core of the unmixing process.
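
The thesis's convolutive, reverberant-environment system is not reproduced here; a minimal sketch of the instantaneous ICA core that such systems build on, using scikit-learn's FastICA on a synthetic two-source mixture (the sources and mixing matrix below are illustrative only):

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 4000)
s1 = np.sin(2 * np.pi * 3 * t)              # illustrative source 1
s2 = np.sign(np.sin(2 * np.pi * 5 * t))     # illustrative source 2
S = np.column_stack([s1, s2])
A = np.array([[1.0, 0.6], [0.4, 1.0]])      # "unknown" mixing matrix
X = S @ A.T                                  # two-sensor instantaneous mix

ica = FastICA(n_components=2, random_state=0)
S_hat = ica.fit_transform(X)                 # recovered sources, up to
                                             # permutation and scale
```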


2019 ◽  
Vol 7 (1) ◽  
pp. 13-27
Author(s):  
Safaa K. Kadhem ◽  
Sadeq A. Kadhim

"This paper aims at the modeling the crashes count in Al Muthanna governance using finite mixture model. We use one of the most common MCMC method which is called the Gibbs sampler to implement the Bayesian inference for estimating the model parameters. We perform a simulation study, based on synthetic data, to check the ability of the sampler to find the best estimates of the model. We use the two well-known criteria, which are the AIC and BIC, to determine the best model fitted to the data. Finally, we apply our sampler to model the crashes count in Al Muthanna governance.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Yeongbae Choe ◽  
Daniel R. Fesenmaier

Purpose: The purpose of this paper is to describe the core of an advanced destination management system, which uses a series of data matching techniques and business analytics.
Design/methodology/approach: This study first proposes the conceptual framework for an advanced destination management system and then illustrates the core components of the proposed system using real-world data from Northern Indiana. In this study, search interests, devices used and other forms of website use derived from online clickstream data were merged with visitor demographic and tripographic information obtained from an online survey to develop an analytic model used to describe the core market structure.
Findings: Key demographic factors (e.g. gender, age and income), search interests, referred websites, the number of total sessions, and the temporal and spatial aspects of visitor travel provide essential information defining the structure and dynamics of visitor marketing in Northern Indiana.
Originality/value: The process and data used in this study provide a "proof of concept" for developing highly personalized marketing systems, which can substantially improve the competitiveness of a destination management organization.
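
The study's actual matching of clickstream and survey records is not published; purely as an illustration of the join-then-segment pattern it describes, with hypothetical visitor IDs and column names:

```python
import pandas as pd

# Hypothetical schemas: clickstream sessions keyed by a visitor_id that
# the online survey also captured; all column names are illustrative.
clicks = pd.DataFrame({
    "visitor_id": [101, 102, 103],
    "search_interest": ["beaches", "festivals", "dining"],
    "device": ["mobile", "desktop", "tablet"],
    "sessions": [3, 1, 5],
})
survey = pd.DataFrame({
    "visitor_id": [101, 102, 103],
    "age": [34, 52, 41],
    "income_band": ["50-75k", "75-100k", "25-50k"],
    "trip_length_days": [2, 4, 1],
})

# inner join: keep only visitors present in both data sources
matched = clicks.merge(survey, on="visitor_id", how="inner")
segments = matched.groupby(["device", "income_band"])["sessions"].mean()
print(segments)
```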


Stats ◽  
2020 ◽  
Vol 3 (2) ◽  
pp. 120-136
Author(s):  
Ersin Yılmaz ◽  
Syed Ejaz Ahmed ◽  
Dursun Aydın

This paper aims to solve the problem of fitting a nonparametric regression function with right-censored data. In the literature, censoring in the response variable is generally handled by a synthetic data transformation based on the Kaplan–Meier estimator. In the context of synthetic data, there have been various studies on the estimation of right-censored nonparametric regression models based on smoothing splines, regression splines, kernel smoothing, local polynomials, and so on. It should be emphasized that the synthetic data transformation manipulates the observations: it assigns zero values to censored data points and inflates the magnitudes of the uncensored ones. Thus, an irregularly distributed dataset is obtained. We claim that adaptive spline (A-spline) regression has the potential to deal with this irregular dataset more easily than the smoothing techniques mentioned here, due to the freedom to determine the degree of the spline as well as the number and location of the knots. The theoretical properties of A-splines with synthetic data are detailed in this paper. Additionally, we support our claim with numerical studies, including a simulation study and a real-world data example.
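
In its standard Koul–Susarla–Van Ryzin form, the synthetic data transformation the abstract refers to zeroes out censored responses and divides each uncensored response by the Kaplan–Meier estimate of the censoring survival function; a minimal sketch, assuming no tied observations:

```python
import numpy as np

def synthetic_response(y, delta):
    """Koul-Susarla-Van Ryzin transform y*_i = delta_i * y_i / G(y_i-),
    where G is the Kaplan-Meier survival estimate of the censoring time
    (censoring, delta == 0, is treated as the 'event')."""
    n = len(y)
    order = np.argsort(y)
    at_risk = n - np.arange(n)                       # n, n-1, ..., 1
    d_sorted = delta[order]
    G_sorted = np.cumprod(1 - (d_sorted == 0) / at_risk)
    G_left = np.concatenate([[1.0], G_sorted[:-1]])  # left limit G(t-)
    G_at_y = np.empty(n)
    G_at_y[order] = G_left
    return delta * y / np.clip(G_at_y, 1e-10, None)

rng = np.random.default_rng(0)
T = rng.exponential(2.0, size=200)      # true responses
C = rng.exponential(3.0, size=200)      # censoring times
y, delta = np.minimum(T, C), (T <= C).astype(float)
y_star = synthetic_response(y, delta)   # zeros at censored points
```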


2011 ◽  
Vol 22 (1) ◽  
pp. 31-38 ◽  
Author(s):  
Xiaochun Li ◽  
Changyu Shen

We review ideas, approaches, and progress in the field of record linkage. We point out that the latent class models used in probabilistic matching have been well developed and applied in the different context of diagnostic testing when the true disease status is unknown. The methodology developed in the diagnostic testing setting can potentially be translated and applied to record linkage. Although there are many methods for record linkage, a comprehensive evaluation of methods across a wide range of real-world data with different characteristics and with true match status is absent due to a lack of data sharing. However, the recent availability of generators of synthetic data with realistic characteristics renders such evaluations feasible.
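
As context for the probabilistic matching the review discusses, a minimal sketch of Fellegi–Sunter match scoring once per-field m- and u-probabilities are in hand (in practice, the latent class / EM machinery the authors mention estimates these without known match status; all numbers below are illustrative):

```python
import math

# Hypothetical per-field probabilities: m = P(field agrees | true match),
# u = P(field agrees | non-match). Values are illustrative only.
FIELDS = {
    "surname":    {"m": 0.95, "u": 0.01},
    "birth_year": {"m": 0.98, "u": 0.10},
    "zip_code":   {"m": 0.90, "u": 0.05},
}

def match_weight(agreements):
    """Fellegi-Sunter log-likelihood-ratio weight for a record pair.
    agreements maps field name -> True/False (agree / disagree)."""
    w = 0.0
    for f, p in FIELDS.items():
        if agreements[f]:
            w += math.log2(p["m"] / p["u"])
        else:
            w += math.log2((1 - p["m"]) / (1 - p["u"]))
    return w

# Pairs scoring above an upper threshold are linked, below a lower
# threshold rejected, and in between sent to clerical review.
print(match_weight({"surname": True, "birth_year": True, "zip_code": False}))
```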


2012 ◽  
Vol 529 ◽  
pp. 585-589
Author(s):  
Wei Shao ◽  
Guo Qing Zhao ◽  
Yu Jie Gai

The Gibbs sampler is widely used in Bayesian analysis. But it is often difficult to sample from the full conditional distributions, and this can severely weaken the efficiency of the Gibbs sampler. In this paper, we propose to use a mixture normal distribution within the Gibbs sampler. The mixture normal distribution can approximate the target distribution; because it carries more information about the target distribution, it greatly improves the efficiency of the Gibbs sampler. Furthermore, combined with the mixture normal method, the Hit-and-Run algorithm can also obtain more efficient sampling results. Simulation results show that the Gibbs sampler with a mixture normal distribution outperforms other sampling algorithms. The Gibbs sampler with a mixture normal distribution can also be applied to explore the surface of a single crystal.
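
The abstract does not spell out how the mixture normal approximation enters the sampler; one standard reading, sketched here, is an independence Metropolis–Hastings step inside the Gibbs sweep whose proposal is a normal mixture fitted to approximate an intractable full conditional (the bimodal target and proposal parameters below are toy choices):

```python
import numpy as np
rng = np.random.default_rng(0)

def mixture_normal_proposal(means, sds, weights):
    """Independence proposal q(x): a normal mixture fitted (offline) to
    approximate one full conditional of the Gibbs sweep."""
    def sample():
        k = rng.choice(len(weights), p=weights)
        return rng.normal(means[k], sds[k])
    def logpdf(x):
        comps = weights * np.exp(-0.5 * ((x - means) / sds) ** 2) \
                / (np.sqrt(2 * np.pi) * sds)
        return np.log(comps.sum())
    return sample, logpdf

def mh_within_gibbs_step(x, log_target, sample, logpdf):
    """One independence Metropolis-Hastings update for a coordinate whose
    full conditional log_target cannot be sampled directly."""
    x_new = sample()
    log_acc = (log_target(x_new) - log_target(x)
               + logpdf(x) - logpdf(x_new))
    return x_new if np.log(rng.random()) < log_acc else x

# Toy bimodal "full conditional", approximated by a two-component proposal.
log_target = lambda x: np.log(0.5 * np.exp(-0.5 * (x + 2) ** 2)
                              + 0.5 * np.exp(-0.5 * (x - 2) ** 2))
sample, logpdf = mixture_normal_proposal(
    np.array([-2.0, 2.0]), np.array([1.1, 1.1]), np.array([0.5, 0.5]))
x, draws = 0.0, []
for _ in range(5000):
    x = mh_within_gibbs_step(x, log_target, sample, logpdf)
    draws.append(x)
print(np.mean(draws), np.std(draws))
```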


Author(s):  
TIANHAO ZHANG ◽  
XUELONG LI ◽  
DACHENG TAO ◽  
JIE YANG

Manifold learning has been demonstrated as an effective way to represent the intrinsic geometrical structure of samples. In this paper, a new manifold learning approach, named Local Coordinates Alignment (LCA), is developed based on the alignment technique. LCA first obtains local coordinates as representations of a local neighborhood by preserving proximity relations on a patch, which is Euclidean. These extracted local coordinates are then aligned to yield the global embeddings. To solve the out-of-sample problem, a linearization of LCA (LLCA) is proposed. In addition, to handle the non-Euclidean structure of real-world data when building the locality, kernel techniques are utilized to represent the similarity of pairwise points on a local patch. Empirical studies on both synthetic data and face image sets show the effectiveness of the developed approaches.
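
LCA itself is not packaged in common libraries; as a runnable stand-in, the closely related alignment-based method LTSA (local coordinates from local PCA, followed by a global alignment step) is available in scikit-learn:

```python
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

# LTSA is used here purely as a stand-in for LCA: both extract local
# coordinates per neighborhood and align them into a global embedding.
X, _ = make_swiss_roll(n_samples=1500, random_state=0)
ltsa = LocallyLinearEmbedding(n_neighbors=12, n_components=2,
                              method="ltsa", random_state=0)
Y = ltsa.fit_transform(X)   # 2-D global embedding from aligned patches
print(Y.shape)
```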

