Ensemble Clustering Data Mining and Databases
Standard clustering algorithms employ fixed assumptions about data structure. For instance, the k-means algorithm is applicable for spherical and linearly separable data clouds. When the data come from multidimensional normal distribution – so-called EM algorithm can be applied. But in practice the assumptions underlying given set of observations are too complex to fit into a single assumption. We can split these assumptions into manageable hypothesis justifying the use of particular clustering algorithms. Then we must aggregate partial results into a meaningful description of our data. The consensus clustering do this task. In this article we clarify the idea of consensus clustering, and we present conceptual frames for such a compound analysis. Next the basic approaches to implement consensus procedure are given. Finally, some new directions in this field are mentioned.