A Novel K-Harmonic Means Clustering Based on Multiple Initial Centers

2013 ◽  
Vol 321-324 ◽  
pp. 1947-1950
Author(s):  
Lei Gu ◽  
Xian Ling Lu

In the initialization of traditional k-harmonic means clustering, the initial centers are generated randomly and their number equals the number of clusters. Although k-harmonic means clustering is insensitive to the initial centers, this initialization method cannot improve clustering performance. In this paper, a novel k-harmonic means clustering based on multiple initial centers is proposed. In the new method, the number of initial centers exceeds the number of clusters. With multiple initial centers, the method divides the whole data set into multiple groups and then combines these groups into the final solution. Experiments show that the presented algorithm achieves higher clustering accuracy than the traditional k-means and k-harmonic means methods.
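For orientation, the baseline that the abstract builds on can be sketched as follows. This is a minimal sketch of the standard k-harmonic means center update only, not the authors' multi-center variant: the abstract does not say how the extra groups are combined, so that step is left out. The function name `khm`, the exponent `p=3.5`, the iteration count, and the `eps` floor are all illustrative assumptions.

```python
import math

def khm(points, centers, p=3.5, iters=50, eps=1e-8):
    """Standard k-harmonic means updates (hypothetical helper, not the paper's method)."""
    centers = [list(c) for c in centers]
    dim = len(points[0])
    for _ in range(iters):
        num = [[0.0] * dim for _ in centers]
        den = [0.0] * len(centers)
        for x in points:
            # Distance to every center, floored at eps to avoid division by zero.
            d = [max(math.dist(x, c), eps) for c in centers]
            b = sum(di ** (-p) for di in d)
            for j, dj in enumerate(d):
                # Soft membership times point weight simplifies to
                # d_j^(-p-2) / (sum_l d_l^(-p))^2.
                q = dj ** (-p - 2) / (b * b)
                den[j] += q
                for t in range(dim):
                    num[j][t] += q * x[t]
        # Each center becomes the weighted mean of all points.
        centers = [[n / den[j] for n in num[j]] for j in range(len(centers))]
    return centers
```

Every point contributes to every center, which is why the result depends less on the starting positions than hard-assignment k-means does.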

1992 ◽  
Vol 26 (9-11) ◽  
pp. 2345-2348 ◽  
Author(s):  
C. N. Haas

A new method for the quantitative analysis of multiple toxicity data is described and illustrated using a data set on metal exposure to copepods. Positive interactions are observed for Ni-Pb and Pb-Cr, with weak negative interactions observed for Ni-Cr.


Author(s):  
Fred L. Bookstein

Abstract: A matrix manipulation new to the quantitative study of developmental stability reveals unexpected morphometric patterns in a classic data set of landmark-based calvarial growth. There are implications for evolutionary studies. Among organismal biology’s fundamental postulates is the assumption that most aspects of any higher animal’s growth trajectories are dynamically stable, resilient against the types of small but functionally pertinent transient perturbations that may have originated in genotype, morphogenesis, or ecophenotypy. We need an operationalization of this axiom for landmark data sets arising from longitudinal data designs. The present paper introduces a multivariate approach toward that goal: a method for identification and interpretation of patterns of dynamical stability in longitudinally collected landmark data. The new method is based on an application of eigenanalysis unfamiliar to most organismal biologists: analysis of a covariance matrix of Boas coordinates (Procrustes coordinates without the size standardization) against their changes over time. These eigenanalyses may yield complex eigenvalues and eigenvectors (terms involving $$i=\sqrt{-1}$$); the paper carefully explains how these are to be scattered, gridded, and interpreted by their real and imaginary canonical vectors. For the Vilmann neurocranial octagons, the classic morphometric data set used as the running example here, there result new empirical findings that offer a pattern analysis of the ways perturbations of growth are attenuated or otherwise modified over the course of developmental time. The main finding, dominance of a generalized version of dynamical stability (negative autoregressions, as announced by the negative real parts of their eigenvalues, often combined with shearing and rotation in a helpful canonical plane), is surprising in its strength and consistency.
A closing discussion explores some implications of this novel pattern analysis of growth regulation. It differs in many respects from the usual way covariance matrices are wielded in geometric morphometrics, differences relevant to a variety of study designs for comparisons of development across species.


2021 ◽  
Vol 6 (2) ◽  
pp. 48
Author(s):  
Solmin Paembonan ◽  
Hisma Abduh

This study uses the k-means method, which can be used to group several similar drugs into a particular data cluster. One way to determine how similar data items are is to calculate the distance between them. The smaller the distance between data items, the higher their similarity; conversely, the larger the distance, the lower their similarity. The ultimate goal of clustering is to identify groups in a set of unlabeled data. Because clustering is an unsupervised method and there is no prior condition on the number of clusters that may form in a data set, the clustering results need to be evaluated. The evaluation of the clustering results gave a silhouette coefficient of 0.4854.
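The silhouette coefficient reported above can be computed directly from pairwise distances. The following is a minimal sketch assuming Euclidean distance, at least two clusters, and points given as tuples; the function name `silhouette` is illustrative and is not taken from the study.

```python
import math

def silhouette(points, labels):
    """Mean silhouette coefficient over all points (needs at least two clusters)."""
    clusters = {}
    for x, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(x)
    total = 0.0
    for x, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton cluster contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster
        # (the distance from x to itself is 0, so dividing by n - 1 is correct).
        a = sum(math.dist(x, y) for y in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(math.dist(x, y) for y in other) / len(other)
                for lab2, other in clusters.items() if lab2 != lab)
        total += (b - a) / max(a, b)
    return total / len(points)
```

Values close to 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters, and negative values suggest that points sit in the wrong cluster.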


2020 ◽  
Vol 11 (3) ◽  
pp. 42-67
Author(s):  
Soumeya Zerabi ◽  
Souham Meshoul ◽  
Samia Chikhi Boucherkha

Cluster validation aims both to evaluate the results of clustering algorithms and to predict the number of clusters. It is usually achieved using several indexes. Traditional internal clustering validation indexes (CVIs) are mainly based on computing pairwise distances, which results in quadratic complexity of the related algorithms. The existing CVIs cannot handle large data sets properly and need to be revisited to take account of ever-increasing data set volumes. Therefore, parallel and distributed solutions that implement these indexes are required. To address this issue, the authors propose two parallel and distributed models for internal CVIs, namely the Silhouette and Dunn indexes, using the MapReduce framework under Hadoop. The proposed models, termed MR_Silhouette and MR_Dunn, have been tested on both evaluating the clustering results and identifying the optimal number of clusters. The results of the experimental study are very promising and show that the proposed parallel and distributed models achieve the expected tasks successfully.
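To make the quadratic cost concrete, here is a minimal single-machine sketch of one common form of the Dunn index: the smallest distance between points in different clusters divided by the largest distance between points in the same cluster. It is not the authors' MR_Dunn model; the function name `dunn_index` and the choice of Euclidean distance are assumptions.

```python
import math
from itertools import combinations

def dunn_index(points, labels):
    """Dunn index via a full scan of all point pairs, hence O(n^2) distance evaluations."""
    min_between = math.inf
    max_within = 0.0
    for (x, lx), (y, ly) in combinations(zip(points, labels), 2):
        d = math.dist(x, y)
        if lx == ly:
            max_within = max(max_within, d)    # candidate cluster diameter
        else:
            min_between = min(min_between, d)  # candidate inter-cluster gap
    # Assumes at least one cluster holds two distinct points; otherwise max_within is 0.
    return min_between / max_within
```

The loop visits n(n-1)/2 pairs, which is the bottleneck that motivates a parallel and distributed formulation.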


2018 ◽  
Vol 2018 ◽  
pp. 1-10
Author(s):  
Siyu Ji ◽  
Chenglin Wen

A neural network is a data-driven algorithm; building the network model requires a large amount of training data, so a significant amount of time is spent training the model parameters. However, system modal updates occur from time to time, and prediction using the original model parameters then causes the model output to deviate greatly from the true value. Traditional methods such as gradient descent and least squares are all centralized, which makes it difficult to update model parameters adaptively as the system changes. First, in order to update the network parameters adaptively, this paper introduces an evaluation function and gives a new method to evaluate the parameters of the function. Without changing the other parameters of the model, the new method updates some of the model parameters in real time to preserve the accuracy of the model. Then, based on the evaluation function, the Mean Impact Value (MIV) algorithm is used to calculate the weights of the features, and the weighted data are fed into the established fault diagnosis model. Finally, the validity of the algorithm is verified by simulation on a standard data set, the UCI Combined Cycle Power Plant (UCI-ccpp) data set.


1994 ◽  
Vol 158 ◽  
pp. 197-200
Author(s):  
J.-L. Monin ◽  
N. Ageorges ◽  
L. Desbat ◽  
C. Perrier

A new method to reconstruct the phase of bidimensional interferograms obtained through pupil-plane interferometry is presented. We compute the average complex phasor components of the cross-spectrum on a data set to reconstruct the original unperturbed phase. We present preliminary results on simulated images whose visibility phases are distorted using a model of atmospherically perturbed wavefronts.


2018 ◽  
Vol 7 (2.5) ◽  
pp. 1
Author(s):  
Khalil Khan ◽  
Nasir Ahmad ◽  
Irfan Uddin ◽  
Muhammad Ehsan Mazhar ◽  
Rehan Ullah Khan

Background and objective: A novel face parsing method is proposed in this paper that partitions a facial image into six semantic classes. Unlike previous approaches, which segmented a facial image into three or four classes, we extend the class labels to six. Materials and Methods: A data set of 464 images taken from the FEI, MIT-CBCL, Pointing’04 and SiblingsDB databases was annotated. A discriminative model was trained by extracting features from square patches. The built model was tested with two different semantic segmentation approaches: pixel-based and super-pixel-based semantic segmentation (PB_SS and SPB_SS). Results: Pixel labeling accuracies (PLA) of 94.68% and 90.35% were obtained with the PB_SS and SPB_SS methods, respectively, on frontal images. Conclusions: A new method for face parts parsing was proposed that efficiently segments a facial image into its constituent parts.


1860 ◽  
Vol 10 ◽  
pp. 473-475

I found my method on the known principle, that the geometric mean between two quantities is also a geometric mean between the arithmetic and harmonic means of those quantities. We may therefore approximate to the geometric mean of two quantities in this way:—Take their arithmetic and harmonic means; then take the arithmetic and harmonic means of those means; then of these last means again, and so on, as far as we please. If the ratio of the original quantities lies within the ratio of 1 : 2, the approximation proceeds with extraordinary rapidity, so that, in obtaining a fraction nearly equal to √2 by this method, we obtain a result true to eleven places of decimals at the fourth mean. I name this merely to show the rate of approximation. The real application of the method is to the integration of functions embracing a radical of the square root.
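The iteration described above can be checked with exact rational arithmetic. The sketch below repeatedly replaces a pair of quantities by their arithmetic and harmonic means; the function name `ahm_pair` is illustrative. Because the product of the two means equals the product of the original quantities, both values close in on the geometric mean.

```python
from fractions import Fraction

def ahm_pair(a, b, steps):
    """Apply the arithmetic/harmonic mean step `steps` times, in exact fractions."""
    a, b = Fraction(a), Fraction(b)
    for _ in range(steps):
        # New pair: (arithmetic mean, harmonic mean). Their product stays a * b.
        a, b = (a + b) / 2, 2 * a * b / (a + b)
    return a, b

# Starting from 1 and 2, whose geometric mean is the square root of 2.
arith, harm = ahm_pair(1, 2, 4)
print(arith, float(arith))  # fourth arithmetic mean: 665857/470832
```

The fourth arithmetic mean, 665857/470832, differs from √2 by less than 2 × 10⁻¹², consistent with the eleven decimal places stated in the text.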


2014 ◽  
Vol 2 (1) ◽  
pp. 97-104 ◽  
Author(s):  
S. Hergarten ◽  
J. Robl ◽  
K. Stüwe

Abstract. We present a new method to extend the widely used geomorphic technique of swath profiles towards curved geomorphic structures such as river valleys. In contrast to the established method that hinges on stacking parallel cross sections, our approach does not refer to any individual profile lines, but uses the signed distance from a given baseline (for example, a valley floor) as the profile coordinate. The method can be implemented easily for arbitrary polygonal baselines and for rastered digital elevation models as well as for irregular point clouds such as laser scanner data. Furthermore, it does not require any smoothness of the baseline and avoids over- and undersampling due to the curvature of the baseline. The versatility of the new method is illustrated by its application to topographic profiles across valleys, a large subduction zone, and the rim of an impact crater. Like the ordinary swath profile method, the new method is not restricted to analyzing surface elevations themselves, but can aid the quantitative description of topography by analyzing other geomorphic features such as slope or local relief. It is not even constrained to geomorphic data, but can be applied to any two-dimensional data set such as temperature, precipitation or ages of rocks.
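The core idea, using the signed distance from a polygonal baseline as the profile coordinate, can be illustrated roughly as follows. This is a simplified sketch, not the authors' implementation: the function names, the sign convention (positive to the left of the baseline's direction of travel), the binning into fixed-width strips, and the handling of points nearest to a vertex are all assumptions. Consecutive baseline vertices are assumed to be distinct.

```python
import math

def signed_distance(p, baseline):
    """Signed distance from point p to a polyline given as a list of (x, y) vertices."""
    best = None
    for (ax, ay), (bx, by) in zip(baseline, baseline[1:]):
        dx, dy = bx - ax, by - ay
        # Parameter of the closest point on this segment, clamped to [0, 1].
        t = ((p[0] - ax) * dx + (p[1] - ay) * dy) / (dx * dx + dy * dy)
        t = min(1.0, max(0.0, t))
        dist = math.hypot(p[0] - (ax + t * dx), p[1] - (ay + t * dy))
        # The cross product tells on which side of the segment p lies.
        cross = dx * (p[1] - ay) - dy * (p[0] - ax)
        if best is None or dist < best[0]:
            best = (dist, cross)
    return math.copysign(best[0], best[1])

def swath_profile(samples, baseline, bin_width):
    """Group (x, y, z) samples by signed distance; return {bin center: (min, mean, max)}."""
    bins = {}
    for x, y, z in samples:
        k = math.floor(signed_distance((x, y), baseline) / bin_width)
        bins.setdefault(k, []).append(z)
    return {(k + 0.5) * bin_width: (min(v), sum(v) / len(v), max(v))
            for k, v in sorted(bins.items())}
```

Because each sample is assigned by its own distance to the baseline, the samples can come from a raster or from an irregular point cloud, and `z` can be any quantity defined on the plane, such as elevation, slope, or precipitation.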

