A new validity clustering index-based on finding new centroid positions using the mean of clustered data to determine the optimum number of clusters

This paper presents a method for optimizing a competitive neural network (CNN). During cluster analysis, the optimum number of output neurons is fixed according to the change in the DB value, and the connection weights are then adjusted through increase, division, and deletion operations. The learning rate of each neuron follows its own trend, according to the change in the probability of that neuron. The optimization makes classification more accurate. Simulation results show that the optimized network structure adjusts the number of clusters dynamically and classifies well.
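As an illustration of selecting a cluster count by "DB value", here is a minimal sketch that assumes the term refers to the Davies-Bouldin index (the abstract does not spell it out). The function name `davies_bouldin` and the toy data are hypothetical; this is not the paper's network, only the index it appears to monitor.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a hard clustering (lower means better separated)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("the index needs at least two clusters")

    # Centroid of each cluster: coordinate-wise mean of its members.
    centroids = {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in clusters.items()
    }
    # Scatter of each cluster: mean distance of its members to the centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }

    total = 0.0
    for i in clusters:
        # Worst-case similarity between cluster i and any other cluster.
        # Assumes distinct clusters have distinct centroids.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = davies_bouldin(pts, [0, 0, 1, 1])  # compact, well-separated grouping
bad = davies_bouldin(pts, [0, 1, 0, 1])   # grouping that mixes the two blobs
```

A procedure like the one described would evaluate this index for several candidate numbers of output neurons and keep the count with the lowest value.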


Mapping similarities in temporal parking occupancy behavior based on city-wide parking meter data

Proceedings of the ICA ◽

10.5194/ica-proc-1-12-2018 ◽

2018 ◽

Vol 1 ◽

pp. 1-5

Author(s):

Fabian Bock ◽

Karen Xia ◽

Monika Sester

Keyword(s):

Similarity Measure ◽

Historical Data ◽

Parking Space ◽

Number Of Clusters ◽

Occupancy Patterns ◽

Silhouette Index ◽

The Mean ◽

Occupancy Behavior ◽

Maximum Occupancy ◽

Repetitive Pattern

The search for a parking space is a severe and stressful problem for drivers in many cities. The provision of maps with parking space occupancy information assists drivers in avoiding the most crowded roads at certain times. Since parking occupancy reveals a repetitive pattern per day and per week, typical parking occupancy patterns can be extracted from historical data.

In this paper, we analyze city-wide parking meter data from Hannover, Germany, for a full year. We describe an approach of clustering these parking meters to reduce the complexity of this parking occupancy information and to reveal areas with similar parking behavior. The parking occupancy at every parking meter is derived from a timestamp of ticket payment and the validity period of the parking tickets. The similarity of the parking meters is computed as the mean-squared deviation of the average daily patterns in parking occupancy at the parking meters. Based on this similarity measure, a hierarchical clustering is applied. The number of clusters is determined with the Davies-Bouldin Index and the Silhouette Index.

Results show that, after extensive data cleansing, the clustering leads to three clusters representing typical parking occupancy day patterns. Those clusters differ mainly in the hour of the maximum occupancy. In addition, the locations of parking meter clusters, computed only based on temporal similarity, also show clear spatial distinctions from other clusters.
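The two computational steps named in the abstract, deriving occupancy from ticket payments and comparing meters by mean-squared deviation, can be sketched as follows. The hourly resolution, the `(start_hour, duration_hours)` ticket format, and the function names are assumptions made for illustration. The authors' actual preprocessing, including the averaging over days, is not described in enough detail to reproduce here.

```python
def hourly_occupancy(tickets, hours=24):
    """Count the tickets valid during each hour slot of one day.

    tickets: iterable of (start_hour, duration_hours), e.g. (9.0, 2.0)
    for a ticket paid at 09:00 that is valid for two hours.
    """
    profile = [0] * hours
    for start, duration in tickets:
        end = start + duration
        for h in range(hours):
            # Slot h covers [h, h + 1); count the ticket if the intervals overlap.
            if start < h + 1 and end > h:
                profile[h] += 1
    return profile


def mean_squared_deviation(profile_a, profile_b):
    """Dissimilarity of two equally long occupancy profiles (0 means identical)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)) / len(profile_a)


meter_a = hourly_occupancy([(9.0, 2.0), (10.5, 1.0)])
meter_b = hourly_occupancy([(9.0, 2.0)])
dissimilarity = mean_squared_deviation(meter_a, meter_b)
```

A matrix of such pairwise deviations between all meters is the kind of input a hierarchical clustering routine expects.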


Catalogue of Clusters that are Members of Superclusters

Symposium - International Astronomical Union ◽

10.1017/s0074180900038924 ◽

1983 ◽

Vol 104 ◽

pp. 185-186

Author(s):

M. Kalinkov ◽

K. Stavrev ◽

I. Kuneva

Keyword(s):

Standard Deviation ◽

Correlation Coefficient ◽

The Other ◽

Clusters Of Galaxies ◽

Number Of Clusters ◽

Abell Clusters ◽

The Mean ◽

Abell Cluster ◽

Image Position

An attempt is made to establish the membership of Abell clusters in superclusters of galaxies. The relation is used to calibrate the distances to the clusters of galaxies with two redshift estimates. One is $m_{10}$, the magnitude of the tenth-ranked galaxy, and the other is the “mean population,” $P$, defined by: where $p = 40, 65, 105, \ldots$ galaxies for richness groups 0, 1, 2, …, and $r$ is the apparent radius in degrees given by: The first iteration for redshift, $z_1$, is obtained from $m_{10}$ alone: The standard deviation for Eq. (1) is 0.105, the number of clusters with known velocities is 342 and the correlation coefficient between observed and fitted values is 0.921. With $z_i$ from Eq. (1), we define Cartesian galactic coordinates $X_i = R_i h^{-1} \cos B_i \cos L_i$, $Y_i = R_i h^{-1} \cos B_i \sin L_i$, $Z_i = R_i h^{-1} \sin B_i$ for each Abell cluster, $i = 1, \ldots, 2712$, where $R_i$ is the distance to the cluster (Mpc), and $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$.
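The coordinate definition at the end of the abstract can be sketched in a few lines. This assumes that B and L are galactic latitude and longitude in degrees and that the distance is scaled by 1/h as written. The function name and the sample values are hypothetical.

```python
import math


def galactic_cartesian(r_mpc, b_deg, l_deg, h=1.0):
    """Convert distance R (Mpc), latitude B and longitude L (degrees) to (X, Y, Z).

    Implements X = R/h cos B cos L, Y = R/h cos B sin L, Z = R/h sin B.
    """
    b = math.radians(b_deg)
    l = math.radians(l_deg)
    d = r_mpc / h
    return (
        d * math.cos(b) * math.cos(l),
        d * math.cos(b) * math.sin(l),
        d * math.sin(b),
    )


x, y, z = galactic_cartesian(100.0, 30.0, 45.0)
```

The Euclidean norm of the result equals R/h for any angles, which gives a quick sanity check on the conversion.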


Generating Optimum Number of Clusters Using Median Search and Projection Algorithms

2010 IEEE 24th International Conference on Advanced Information Networking and Applications Workshops ◽

10.1109/waina.2010.196 ◽

2010 ◽

Author(s):

Suresh L. ◽

Jay B. Simha ◽

Rajappa Veluru

Keyword(s):

Optimum Number ◽

Projection Algorithms ◽

Number Of Clusters


Practical and Effective Approaches to Dealing With Clustered Data

Political Science Research and Methods ◽

10.1017/psrm.2017.42 ◽

2018 ◽

Vol 7 (3) ◽

pp. 541-559 ◽

Cited By ~ 20

Author(s):

Justin Esarey ◽

Andrew Menger

Keyword(s):

Recent Work ◽

Clustered Data ◽

Standard Errors ◽

Data Sets ◽

Number Of Clusters ◽

Uncertainty Measures ◽

R Packages ◽

Robust Standard Errors

Cluster-robust standard errors (as implemented by the eponymous cluster option in Stata) can produce misleading inferences when the number of clusters G is small, even if the model is consistent and there are many observations in each cluster. Nevertheless, political scientists commonly employ this method in data sets with few clusters. The contributions of this paper are: (a) developing new and easy-to-use Stata and R packages that implement alternative uncertainty measures robust to small G, and (b) explaining and providing evidence for the advantages of these alternatives, especially cluster-adjusted t-statistics based on Ibragimov and Müller. To illustrate these advantages, we reanalyze recent work where results are based on cluster-robust standard errors.
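The cluster-adjusted t-statistic mentioned above can be illustrated with a minimal sketch. It follows the Ibragimov and Müller idea as commonly described: estimate the coefficient separately within each cluster, then run a one-sample t-test on the G estimates with G - 1 degrees of freedom. This is an assumption-laden illustration using a bivariate OLS slope, not the authors' Stata or R packages, and all names are hypothetical.

```python
import math
from statistics import mean, stdev


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx


def cluster_adjusted_t(clusters):
    """t-statistic and degrees of freedom from per-cluster slope estimates.

    clusters: list of (xs, ys) pairs, one pair per cluster.
    """
    slopes = [ols_slope(xs, ys) for xs, ys in clusters]
    g = len(slopes)
    if g < 2:
        raise ValueError("need at least two clusters")
    # Treat the G cluster-level estimates as the sample for a one-sample t-test.
    t_stat = mean(slopes) / (stdev(slopes) / math.sqrt(g))
    return t_stat, g - 1


data = [
    ([0, 1, 2], [0, 1, 2]),  # slope 1
    ([0, 1, 2], [0, 2, 4]),  # slope 2
    ([0, 1, 2], [1, 4, 7]),  # slope 3
]
t_stat, dof = cluster_adjusted_t(data)
```

The resulting statistic would be compared against a Student t distribution with `dof` degrees of freedom, which is what makes the approach usable when G is small.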


Fuzzy C-Means Based Liver CT Image Segmentation with Optimum Number of Clusters

Advances in Intelligent Systems and Computing - Proceedings of the Fifth International Conference on Innovations in Bio-Inspired Computing and Applications IBICA 2014 ◽

10.1007/978-3-319-08156-4_14 ◽

2014 ◽

pp. 131-139 ◽

Cited By ~ 3

Author(s):

Abder-Rahman Ali ◽

Micael Couceiro ◽

Aboul Ella Hassanien ◽

Mohamed F. Tolba ◽

Václav Snášel

Keyword(s):

Image Segmentation ◽

Optimum Number ◽

Ct Image ◽

Number Of Clusters ◽

Liver Ct ◽

Fuzzy C Means


Baffin Bay/Nares Strait surface (seafloor) sediment mineralogy: further investigations and methods to elucidate spatial variations in provenance

Canadian Journal of Earth Sciences ◽

10.1139/cjes-2018-0207 ◽

2019 ◽

Vol 56 (8) ◽

pp. 814-828 ◽

Cited By ~ 3

Author(s):

John T. Andrews

Keyword(s):

Strong Association ◽

Optimum Number ◽

Baffin Island ◽

Number Of Clusters ◽

Data Transformations ◽

West Greenland ◽

Weight Percentage ◽

Nares Strait ◽

Significant Difference ◽

Sediment Mineralogy

The goal of the paper is to ascertain whether there are significant regional variations in sediment mineral composition that might be used to elucidate ice sheet histories. The weight percentages of nonclay and clay minerals were determined by quantitative X-ray diffraction. Cluster analysis, an unsupervised learning approach, is used to group sediment mineralogy of 263 seafloor/core top samples between ∼80°N and 62°N. The optimum number of clusters, based on 30 indexes, was three for the weight percentage data but varied with data transformations. Maps of the distribution of the three mineral clusters or facies indicate a significant difference in weight percentages between samples from the West Greenland and Baffin Island shelves. However, several indexes support a larger number of clusters and similar analyses of the spatial distribution and defining minerals of nine mineral facies indicated a strong association with the original three clusters and with broad geographic designations (i.e., West Greenland shelf, Baffin Island fiords, etc). Classification Decision Tree analysis indicates that this difference is primarily controlled by the percentages of plagioclase feldspars versus alkali feldspars.


Generating Optimum Number of Clusters Using Median Search and Projection Algorithms

2010 International Conference on Advances in Computer Engineering ◽

10.1109/ace.2010.95 ◽

2010 ◽

Author(s):

Suresh L. ◽

Jay B. Simha ◽

Rajappa Veluru

Keyword(s):

Optimum Number ◽

Projection Algorithms ◽

Number Of Clusters


Implementasi Metode Improved K-Means dengan Algoritma Dbscan untuk Pengelompokan Film [Implementation of the Improved K-Means Method with the DBSCAN Algorithm for Film Clustering]

Unisda Journal of Mathematics and Computer Science (UJMC) ◽

10.52166/ujmc.v6i01.1923 ◽

2020 ◽

Vol 6 (01) ◽

pp. 1-8

Author(s):

Muhammad Muhajir ◽

Annisa Ayunda Permata Sari

Keyword(s):

Film Industry ◽

Optimal Number ◽

Optimum Number ◽

Number Of Clusters ◽

The Past ◽

Box Office ◽

Dbscan Algorithm ◽

The People ◽

The World ◽

Optimal Number Of Clusters

The Indonesian film industry continues to grow, as seen in the number of films now shown in theaters, with the box office increasing by 28 percent each year over the past four years. The Internet Movie Database (IMDb) is a website that provides information about films from around the world, including the people involved in them, from actors, directors, and writers to makeup artists and soundtracks. This study examines the characteristics of films and the factors that lead a film to be included in the IMDb Top 250. The data were scraped from the website. The methods used are the non-hierarchical clustering methods k-means and DBSCAN: the DBSCAN algorithm determines the optimum number of clusters, and the data are then grouped by centroid with the k-means algorithm. The analysis found that the factors that could influence a film's inclusion in the IMDb Top 250 were duration, number of votes, and direction by Rajkumar Hirani, and that the optimal number of clusters obtained with the DBSCAN algorithm was six. With the improved k-means algorithm, the accuracy of the cluster results is 87.2%.
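The two-stage procedure described above, DBSCAN to fix the number of clusters and k-means to group by centroid, can be sketched in plain Python. Seeding k-means with the means of the DBSCAN clusters is an assumption of this sketch, as are the function names, the `eps` and `min_pts` values, and the toy data. The abstract does not say how the two stages were connected.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: 0..k-1 for clusters, -1 for noise."""
    labels = [None] * len(points)  # None marks a point not yet visited

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
        cluster += 1
    return labels


def nearest(point, centroids):
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm from given starting centroids."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids, [nearest(p, centroids) for p in points]


def dbscan_then_kmeans(points, eps, min_pts):
    labels = dbscan(points, eps, min_pts)
    k = max(labels) + 1
    if k == 0:
        raise ValueError("DBSCAN found no clusters; adjust eps or min_pts")
    seeds = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        seeds.append(tuple(sum(col) / len(members) for col in zip(*members)))
    centroids, assignment = kmeans(points, seeds)
    return k, centroids, assignment


films = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (13, 13)]
k, centroids, assignment = dbscan_then_kmeans(films, eps=1.5, min_pts=3)
```

In this toy data the last point is noise for DBSCAN, so it does not affect the cluster count, but k-means still assigns it to the nearest centroid.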


Analisis Cluster Data Interkomparasi Anak Timbangan dengan Algoritma Self Organizing Maps [Cluster Analysis of Weight Intercomparison Data with the Self Organizing Maps Algorithm]

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v7i2.3698 ◽

2021 ◽

Vol 7 (2) ◽

Author(s):

Arif Fajar Solikin ◽

Kusrini Kusrini ◽

Ferry Wahyu Wibowo

Keyword(s):

Data Mining ◽

Statistical Test ◽

Optimum Number ◽

Data Normalization ◽

Self Organizing Map ◽

Self Organizing Maps ◽

Number Of Clusters ◽

Map Algorithm ◽

Cluster Data ◽

Self Organizing

Intercomparisons are conducted to determine the ability and performance of a laboratory. Intercomparison results are usually expressed as a range of En ratio values (|En| ≤ 1), which express the equivalence of one laboratory with the others. If a laboratory is declared not equivalent, it must identify the source of the problem itself. Clustering, a data mining technique, can make this easier. Clustering is done by applying a self-organizing map algorithm in the KNIME (Konstanz Information Miner) analytics tool. Several experiments were carried out, varying the layer size and the data normalization status from one experiment to the next. The results were analyzed with the pseudo F statistic test and the icdrate test. The largest pseudo F statistic was obtained in the 8th experiment (layer size 2x2, without data normalization): 167.53 for 1 kg artifacts and 104.86 for 200 g artifacts, where the optimum number of clusters is 4. The smallest icdrate value was obtained in the 5th experiment (layer size 2x3, without data normalization): 0.0713 for 1 kg artifacts and 0.2889 for 200 g artifacts, with the best number of clusters being 6. The 12 laboratories can be grouped into 6 groups, where each group shares the same identification. Groups 1, 3, and 6 have 1 member each, while groups 2, 4, and 5 have 3 members each.
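For readers unfamiliar with the pseudo F statistic used to compare the experiments, here is a minimal sketch. It assumes the term refers to the Calinski-Harabasz ratio of between-cluster to within-cluster variance, which the abstract does not state explicitly. The function name and the toy data are hypothetical.

```python
def pseudo_f(points, labels):
    """Pseudo F statistic: (SSB / (k - 1)) / (SSW / (n - k)); larger is better."""
    n = len(points)
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    k = len(groups)
    if not 1 < k < n:
        raise ValueError("need more than one cluster and fewer clusters than points")

    def centroid(members):
        return [sum(col) / len(members) for col in zip(*members)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    # Between-cluster sum of squares, weighted by cluster size.
    ssb = sum(
        len(members) * squared_distance(centroid(members), overall)
        for members in groups.values()
    )
    # Within-cluster sum of squares; zero (and a division error) if every
    # cluster consists of identical points.
    ssw = sum(
        squared_distance(p, centroid(members))
        for members in groups.values()
        for p in members
    )
    return (ssb / (k - 1)) / (ssw / (n - k))


measurements = [(0.0,), (2.0,), (10.0,), (12.0,)]
f_value = pseudo_f(measurements, [0, 0, 1, 1])
```

Comparing this value across clusterings with different numbers of groups, and preferring the largest, mirrors how the experiments above were ranked.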
