consensus partition

Author(s):  
Tao Sun ◽  
Saeed Mashdour ◽  
Mohammad Reza Mahmoudi

Clustering ensemble is a recent problem whose aim is to extract a single clustering from a pool of base clusterings. The pool of base clusterings is sometimes referred to as an ensemble. An ensemble is considered suitable if its members are diverse and each of them meets a minimum quality. The method that maps an ensemble to an output partition (also called the consensus partition) is called the consensus function. The consensus function should find a consensus partition with which all ensemble members agree as much as possible. In this paper, a novel clustering ensemble framework is introduced that guarantees the generation of a pool of base clusterings satisfying both conditions (diversity among ensemble members and high-quality members). Given its limitations, a novel consensus function is also introduced. We experimentally show that the proposed clustering ensemble framework is scalable, efficient and general. Comparing against different base clustering algorithms, we show that our improved base clustering algorithm performs better. Likewise, comparing against different consensus functions, we show the effectiveness of our consensus function. Finally, compared with the state of the art, the clustering ensemble framework is comparable or even better in terms of scalability and efficacy.
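To make the terms ensemble, consensus function and consensus partition concrete, here is a minimal sketch of one well-known generic consensus function: a co-association matrix thresholded into connected components. This is not the consensus function proposed in the paper above; the function names, the toy ensemble and the 0.5 threshold are all assumptions chosen for illustration.

```python
def co_association(ensemble):
    """Fraction of base clusterings that put each pair of points in the same cluster."""
    n = len(ensemble[0])
    counts = [[0] * n for _ in range(n)]
    for labels in ensemble:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(ensemble)
    return [[c / m for c in row] for row in counts]


def consensus_partition(ensemble, threshold=0.5):
    """Illustrative consensus function (an assumption, not the paper's method):
    link two points when more than `threshold` of the base clusterings agree
    on them, then label the connected components of that graph."""
    co = co_association(ensemble)
    n = len(co)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first search over strongly co-associated points
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Toy ensemble of three base clusterings over six points (hypothetical data).
# The second clustering equals the first up to a renaming of the labels.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]
co = co_association(ensemble)
consensus = consensus_partition(ensemble)
print(consensus)
```

Because the co-association matrix only records whether two points share a cluster, the result does not depend on how each base clustering names its clusters.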


2020 ◽  
Vol 176 (1) ◽  
pp. 79-102
Author(s):  
Chenyue Zhao ◽  
Hosein Alizadeh ◽  
Behrouz Minaei ◽  
Majid Mohamadpoor ◽  
Hamid Parvin ◽  
...  

This paper studies the cluster ensemble selection problem for unsupervised learning. Given a large ensemble of clustering solutions, our goal is to select a subset of solutions that forms a smaller yet better-performing cluster ensemble than one using all available solutions. The common way of aggregating the chosen solutions is to accumulate the information of the selected results into a similarity matrix. This paper suggests transforming the similarity matrix into a modularity matrix and then applying a new consensus function that optimizes the modularity measure on it. We represent the modularity maximization problem as a 0-1 quadratic program, which can be solved exactly for small datasets. We also establish a new greedy algorithm, called sum linkage, to optimize the objective function in a very short time, especially for large-scale datasets. We show that the proposed consensus partition gets much closer to the actual cluster structure than the partitions obtained from the direct application of common cluster ensemble methods. The promising results, compared with the most cited consensus functions, show the excellent efficiency of the proposed method.
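The step from a similarity matrix to a modularity matrix can be sketched as follows. The sketch assumes the standard Newman definition, B_ij = S_ij - k_i k_j / (2m), where k_i is the i-th row sum and 2m is the sum of all entries; the paper's exact transformation may differ. The sum linkage algorithm is not reproduced here, and the function names and the toy matrix are hypothetical.

```python
def modularity_matrix(S):
    """Newman-style modularity matrix of a symmetric similarity matrix S
    (assumed definition; the paper's transformation may differ)."""
    n = len(S)
    degrees = [sum(row) for row in S]
    total = sum(degrees)  # this is 2m in the usual notation
    return [[S[i][j] - degrees[i] * degrees[j] / total for j in range(n)]
            for i in range(n)]


def modularity(S, labels):
    """Modularity Q of a partition: sum of B_ij over same-cluster pairs, divided by 2m."""
    B = modularity_matrix(S)
    total = sum(sum(row) for row in S)
    n = len(S)
    within = sum(B[i][j] for i in range(n) for j in range(n)
                 if labels[i] == labels[j])
    return within / total


# Hypothetical similarity matrix: points 0 and 1 are similar, as are points 2 and 3.
S = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
q_split = modularity(S, [0, 0, 1, 1])   # partition that matches the block structure
q_single = modularity(S, [0, 0, 0, 0])  # everything in one cluster
print(q_split, q_single)
```

A consensus function built on this measure would search for the labeling that maximizes Q; here the block-matching partition scores higher than the single-cluster one.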


Author(s):  
Siwei Wang ◽  
Xinwang Liu ◽  
En Zhu ◽  
Chang Tang ◽  
Jiyuan Liu ◽  
...  

Multi-view clustering (MVC) optimally integrates complementary information from different views to improve clustering performance. Although existing methods demonstrate promising performance in many applications, we observe that most of them directly combine multiple views to learn an optimal similarity for clustering. These methods incur high computational complexity and overly complicated optimization. In this paper, we theoretically uncover the connection between existing k-means clustering and the alignment between base partitions and the consensus partition. Based on this observation, we propose a simple but effective multi-view algorithm termed Multi-view Clustering via Late Fusion Alignment Maximization (MVC-LFA). Specifically, MVC-LFA proposes to maximally align the consensus partition with the weighted base partitions. Such a criterion significantly reduces the computational complexity and simplifies the optimization procedure. Furthermore, we design a three-step iterative algorithm to solve the resultant optimization problem with theoretically guaranteed convergence. Extensive experiments on five multi-view benchmark datasets demonstrate the effectiveness and efficiency of the proposed MVC-LFA.


2017 ◽  
Vol 26 (04) ◽  
pp. 1750018
Author(s):  
Mohamed Ali Zoghlami ◽  
Minyar Sassi Hidri ◽  
Rahma Ben Ayed

Consensus clustering is used in data analysis to generate stable results out of a set of partitions delivered by stochastic methods. Typically, the goal is to search for the so-called median (or consensus) partition, i.e. the partition that is most similar, on average, to all the input partitions. In this paper we address the problem of combining multiple fuzzy clusterings without access to the underlying features of the data, relying instead on inter-cluster similarity. We are concerned with top-down and bottom-up consensus-driven fuzzy clustering that splits and merges the worst clusters. The objective is to reconcile a structure developed for patterns in some dataset with the structural findings already available for other related ones. The proposed classifiers consider dispersion and dissimilarity between the partitions as well as the corresponding fuzzy proximity matrices. Several illustrative numerical examples, using both synthetic data and data from available machine learning repositories, are also included. The experimental component of the study shows the efficiency of the proposed classifiers in terms of quality and runtime.
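The notion of a median partition can be illustrated with a small sketch for crisp (non-fuzzy) partitions. It assumes pair-counting disagreement as the dissimilarity and restricts the search to the input partitions themselves, a simple heuristic sometimes called best-of-k. It does not implement the fuzzy split-and-merge approach described above; the function names and data are hypothetical.

```python
from itertools import combinations


def pair_disagreements(p, q):
    """Number of point pairs that one partition groups together and the other separates."""
    return sum((p[i] == p[j]) != (q[i] == q[j])
               for i, j in combinations(range(len(p)), 2))


def best_of_k(partitions):
    """Return the input partition with the smallest total disagreement to all inputs.
    This only approximates the median partition, since the true median need not
    be one of the inputs."""
    return min(partitions,
               key=lambda c: sum(pair_disagreements(c, p) for p in partitions))


# Three hypothetical crisp partitions of four points.
p1 = [0, 0, 1, 1]
p2 = [1, 1, 0, 0]  # same grouping as p1 with the labels swapped
p3 = [0, 1, 1, 1]
median = best_of_k([p1, p2, p3])
print(median)
```

Since p1 and p2 describe the same grouping, they agree on every pair, and either one is closer on average to the whole set than p3 is.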


Author(s):  
Zhiqiang Tao ◽  
Hongfu Liu ◽  
Sheng Li ◽  
Zhengming Ding ◽  
Yun Fu

Multi-View Clustering (MVC) aims to find the cluster structure shared by multiple views of a particular dataset. Existing MVC methods mainly integrate the raw data from different views, while ignoring the high-level information. Thus, their performance may degrade due to the conflict between heterogeneous features and the noise present in each individual view. To overcome this problem, we propose a novel Multi-View Ensemble Clustering (MVEC) framework that solves MVC in an Ensemble Clustering (EC) way: it generates Basic Partitions (BPs) for each view individually and seeks a consensus partition among all the BPs. In this way, we naturally leverage the complementary information of multi-view data in the same partition space. Instead of directly fusing BPs, we employ a low-rank and sparse decomposition to explicitly consider the connection between different views and detect the noise in each view. Moreover, the spectral ensemble clustering task is also incorporated into our framework through a carefully designed constraint, making MVEC a unified optimization framework for obtaining the final consensus partition. Experimental results on six real-world datasets show the efficacy of our approach compared with both MVC and EC methods.


2013 ◽  
Vol 10 (81) ◽  
pp. 20120990 ◽  
Author(s):  
Basel Abu-Jamous ◽  
Rui Fa ◽  
David J. Roberts ◽  
Asoke K. Nandi

The binarization of consensus partition matrices (Bi-CoPaM) method has, among its unique features, the ability to perform ensemble clustering over the same set of genes from multiple microarray datasets by using various clustering methods in order to generate tunable tight clusters. Therefore, we have applied the Bi-CoPaM method to the 500 most synchronized cell-cycle-regulated yeast genes from different microarray datasets to produce four tight, specific and exclusive clusters of co-expressed genes. We found that 19 genes formed the tightest of the four clusters, and this included the gene CMR1/YDL156W, which was an uncharacterized gene at the time of our investigations. Two very recent proteomic and biochemical studies have independently revealed many facets of the CMR1 protein, although the precise functions of the protein remain to be elucidated. Our computational results complement these biological results and add more evidence to their recent findings of CMR1 as potentially participating in many of the DNA-metabolism processes such as replication, repair and transcription. Interestingly, our results demonstrate the close co-expression of CMR1 with the replication protein A (RPA), the cohesion complex and the DNA polymerases α, δ and ɛ, and suggest functional relationships between CMR1 and the respective proteins. In addition, the analysis provides further substantial evidence that the expression of the CMR1 gene could be regulated by the MBF complex. In summary, the application of a novel analytic technique to large biological datasets has provided supporting evidence for a gene of previously unknown function, further hypotheses to test, and a more general demonstration of the value of sophisticated methods to explore new large datasets now so readily generated in biological experiments.


PLoS ONE ◽  
2013 ◽  
Vol 8 (2) ◽  
pp. e56432 ◽  
Author(s):  
Basel Abu-Jamous ◽  
Rui Fa ◽  
David J. Roberts ◽  
Asoke K. Nandi
