Choosing the Number of Clusters, Subset Selection of Variables, and Outlier Detection in the Standard Mixture-Model Cluster Analysis

Author(s):  
Hamparsum Bozdogan
1990 ◽  
Vol 29 (03) ◽  
pp. 200-204 ◽  
Author(s):  
J. A. Koziol

Abstract. A basic problem of cluster analysis is the determination or selection of the number of clusters evinced in any set of data. We address this issue with multinomial data using Akaike’s information criterion and demonstrate its utility in identifying an appropriate number of clusters of tumor types with similar profiles of cell surface antigens.
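
As a rough illustration of how Akaike's information criterion can rank candidate numbers of clusters, the sketch below computes AIC = -2 log L + 2p for each candidate and keeps the smallest. It is not the procedure from the paper: the parameter count assumes a mixture in which each cluster has its own probability vector over a single categorical variable, the function names are hypothetical, and the log-likelihood values are made-up inputs standing in for the output of a fitted model.

```python
def n_params_multinomial_mixture(k, n_categories):
    """Free parameters of a k-component multinomial mixture (assumed model):
    (k - 1) mixing weights plus k * (n_categories - 1) category probabilities."""
    return (k - 1) + k * (n_categories - 1)


def aic(log_likelihood, n_params):
    """Akaike's information criterion: lower is better."""
    return -2.0 * log_likelihood + 2.0 * n_params


def select_k_by_aic(log_likelihoods, n_categories):
    """log_likelihoods maps each candidate k to its maximized log-likelihood.
    Returns the k with the smallest AIC and the full table of scores."""
    scores = {
        k: aic(ll, n_params_multinomial_mixture(k, n_categories))
        for k, ll in log_likelihoods.items()
    }
    best_k = min(scores, key=scores.get)
    return best_k, scores


# Illustrative (made-up) maximized log-likelihoods for k = 1..4 clusters.
example_loglik = {1: -520.0, 2: -480.0, 3: -476.0, 4: -474.5}
best_k, scores = select_k_by_aic(example_loglik, n_categories=5)
print(best_k, scores)
```

With these inputs the likelihood gain from a third or fourth cluster is too small to offset the penalty for the extra parameters, so the two-cluster model has the lowest AIC.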


2013 ◽  
Vol 40 (2) ◽  
pp. 320-336 ◽  
Author(s):  
J. Andrew Howe ◽  
Hamparsum Bozdogan

2019 ◽  
Vol 4 (1) ◽  
pp. 64-67
Author(s):  
Pavel Kim

One of the fundamental tasks of cluster analysis is partitioning multidimensional data samples into clusters: groups of objects that are close in the sense of some given measure of similarity. In some problems the number of clusters is set a priori, but more often it must be determined in the course of clustering. With a large number of clusters, especially if the data are “noisy,” the result becomes difficult for experts to analyze, so the number of clusters under consideration is artificially reduced. Formal means of merging “neighboring” clusters are considered, creating the basis for parameterizing the number of significant clusters in the “natural” clustering model [1].


2007 ◽  
Vol 11 (2) ◽  
pp. 155-173 ◽  
Author(s):  
Jaime R.S. Fonseca ◽  
Margarida G.M.S. Cardoso

2017 ◽  
Vol 13 (2) ◽  
pp. 1-12 ◽  
Author(s):  
Jungmok Ma

One of the major obstacles in applying the k-means clustering algorithm is the selection of the number of clusters k. A multi-attribute utility theory (MAUT)-based k-means clustering algorithm is proposed to tackle the problem by incorporating user preferences. Using MAUT, the decision maker's value structure for the number of clusters and other attributes can be modeled quantitatively and used as the objective function of k-means. A target clustering problem from the military targeting process is used to demonstrate the MAUT-based k-means and to provide a comparative study. The results show that existing clustering algorithms do not necessarily reflect user preferences, while the MAUT-based k-means provides a systematic framework for preference modeling in cluster analysis.
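
To make the idea of scoring candidate values of k by user preferences concrete, here is a minimal sketch of an additive multi-attribute utility, U = sum of w_i * u_i(x_i), evaluated after the fact for each candidate k. It is not the algorithm from the paper, which builds the utility into the k-means objective itself. The attribute names (`wcss`, `k`), the weights, the ranges, the linear single-attribute utilities, and the numbers are all hypothetical.

```python
def linear_utility(value, worst, best):
    """Map a raw attribute value to [0, 1]: `best` scores 1, `worst` scores 0."""
    if best == worst:
        return 1.0
    u = (value - worst) / (best - worst)
    return max(0.0, min(1.0, u))  # clamp values outside the stated range


def additive_utility(attributes, weights, ranges):
    """Weighted sum of single-attribute utilities (additive MAUT form)."""
    return sum(
        weights[name] * linear_utility(attributes[name], *ranges[name])
        for name in weights
    )


def choose_k(candidates, weights, ranges):
    """Return the candidate k with the highest overall utility."""
    utilities = {
        k: additive_utility(attrs, weights, ranges)
        for k, attrs in candidates.items()
    }
    return max(utilities, key=utilities.get), utilities


# Hypothetical inputs: within-cluster sum of squares (lower is better)
# and the number of clusters itself (fewer preferred by this decision maker).
candidates = {
    2: {"wcss": 80.0, "k": 2},
    3: {"wcss": 40.0, "k": 3},
    4: {"wcss": 30.0, "k": 4},
    5: {"wcss": 26.0, "k": 5},
}
weights = {"wcss": 0.6, "k": 0.4}                  # assumed to sum to 1
ranges = {"wcss": (100.0, 0.0), "k": (6.0, 2.0)}   # (worst, best) per attribute
best_k, utilities = choose_k(candidates, weights, ranges)
print(best_k, utilities)
```

Changing the weights shifts the preferred k, which is the point of the approach: the trade-off between compactness and the number of clusters is stated explicitly rather than left to a fixed heuristic.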


1995 ◽  
Vol 12 (1) ◽  
pp. 113-136 ◽  
Author(s):  
R. Gnanadesikan ◽  
J. R. Kettenring ◽  
S. L. Tsao

2008 ◽  
Vol 2 (1) ◽  
pp. 65-70 ◽  
Author(s):  
M. Cabello ◽  
J. A. G. Orza ◽  
V. Galiano ◽  
G. Ruiz

Abstract. Backtrajectory differences and clustering sensitivity to the meteorological input data are studied. Trajectories arriving in Southeast Spain (Elche) at 3000, 1500 and 500 m for the 7-year period 2000–2006 have been computed using two widely used meteorological data sets: the NCEP/NCAR Reanalysis and the FNL data sets. Differences between trajectories grow linearly at least up to 48 h and grow faster after 72 h. A k-means cluster analysis performed on each set of trajectories shows differences in the identified clusters (main flows), partly because the number of clusters in each clustering solution differs for the trajectories arriving at 3000 and 1500 m. Trajectory membership in the identified flows is in general more sensitive to the input meteorological data than to the initial selection of cluster centroids.

