A hierarchical clustering method for random intervals based on a similarity measure

Author(s):  
Ana Belén Ramos-Guajardo

Abstract: A new clustering method is provided for random intervals that are measured in the same units over the same group of individuals. It takes into account the degree of similarity between the expected values of the random intervals, which can be analyzed by means of a two-sample similarity bootstrap test. The expectations of each pair of random intervals are compared through that test, yielding a p-value matrix. The suggested clustering algorithm works on this matrix, in which each p-value can also be read as a kind of similarity between two random intervals. The algorithm is iterative and includes an objective stopping criterion that leads to clusters that are statistically similar internally and different from each other. Simulations show the empirical performance of the proposal, and the approach is applied to two real-life situations.
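
The abstract does not spell out the merging rule or the stopping criterion, so the following is only a minimal sketch under stated assumptions: clusters are merged greedily, the similarity of two clusters is taken to be the smallest p-value between their members, and merging stops once no pair of clusters exceeds a significance level `alpha`. The function name, the value of `alpha`, and the p-value matrix `P` are hypothetical and not taken from the paper.

```python
def cluster_by_pvalues(pvals, alpha=0.05):
    """Greedy agglomerative clustering on a symmetric p-value matrix.

    Assumption (not from the paper): the similarity of two clusters is the
    smallest p-value between their members, and two clusters may merge only
    if that value exceeds alpha.
    """
    clusters = [[i] for i in range(len(pvals))]
    while len(clusters) > 1:
        best, best_pair = alpha, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = min(pvals[i][j] for i in clusters[a] for j in clusters[b])
                if sim > best:
                    best, best_pair = sim, (a, b)
        if best_pair is None:
            # Assumed stopping rule: no remaining pair is similar at level alpha.
            break
        a, b = best_pair
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Hypothetical p-value matrix for four random intervals.
P = [
    [1.00, 0.80, 0.01, 0.02],
    [0.80, 1.00, 0.03, 0.01],
    [0.01, 0.03, 1.00, 0.60],
    [0.02, 0.01, 0.60, 1.00],
]
result = cluster_by_pvalues(P)
print(result)
```

With this illustrative matrix the first two and the last two intervals end up in separate clusters, because every cross-cluster p-value lies below `alpha`.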

Author(s):  
Mohana Priya K ◽  
Pooja Ragavi S ◽  
Krishna Priya G

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels to improve accuracy. A POS tagger and the Porter stemmer are used for tagging and stemming. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, with the mean as threshold. This paper incorporates many parameters for finding the similarity between words. To disambiguate words, sense identification is performed for the adjectives and the results are compared. The SemCor and machine learning datasets are employed. Compared with previous results for word sense disambiguation (WSD), our work improves considerably, reaching 91.2%.
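
As a rough illustration of grouping by similarity with a mean threshold, the sketch below links two sentences whenever their cosine similarity exceeds the mean pairwise similarity. It is not the authors' pipeline: plain bag-of-words cosine similarity stands in for the WordNet-based Jiang-Conrath measure, no tagging or stemming is done, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def group_sentences(sentences):
    """Group sentence indices whose similarity exceeds the mean similarity."""
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    sims = {
        (i, j): cosine(bags[i], bags[j]) for i in range(n) for j in range(i + 1, n)
    }
    if not sims:
        return [[i] for i in range(n)]
    threshold = sum(sims.values()) / len(sims)  # mean pairwise similarity

    # Union-find: sentences linked by an above-threshold pair share a group.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), s in sims.items():
        if s > threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical example sentences.
sentences = [
    "bank approved loan",
    "bank rejected loan",
    "river flooded field",
    "river flooded town",
]
groups = group_sentences(sentences)
print(groups)
```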


2019 ◽  
Vol 11 (9) ◽  
pp. 2560
Author(s):  
Hyun Ahn ◽  
Tai-Woo Chang

As the adoption of information technologies increases in the manufacturing industry, manufacturing companies should efficiently manage their data and manufacturing processes in order to enhance their manufacturing competency. Because smart factories acquire processing data from connected machines, the business process management (BPM) approach can enrich the capability of manufacturing operations management. Manufacturing companies could benefit from the well-defined methodologies and process-centric engineering practices of the BPM approach for optimizing their manufacturing processes. Based on this approach, this paper proposes a similarity-based hierarchical clustering method for manufacturing processes. To this end, we first describe process modeling based on the BPM-compliant standard so that the manufacturing processes can be controlled by BPM systems. Second, we present similarity measures for manufacturing process models that serve as a criterion for the hierarchical clustering. Then, we formulate the hierarchical clustering problem and describe an agglomerative clustering algorithm using the measured similarities. Our contribution assumes that a manufacturing company adopts the BPM approach and operates various manufacturing processes. We expect that our method enables manufacturing companies to design and manage a vast number of manufacturing processes at a coarser level, and that it can also be applied to various process (re)engineering problems.
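
The agglomerative step described above can be sketched as follows. The abstract does not define its similarity measures, so this sketch makes its own assumptions: each process model is reduced to a set of activity names, Jaccard similarity stands in for the paper's measures, and clusters are compared by average linkage. The process names `P1` to `P4` and their activities are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two activity sets (stand-in measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def average_similarity(models, c1, c2):
    """Average-linkage similarity between two clusters of model names."""
    total = sum(jaccard(models[x], models[y]) for x in c1 for y in c2)
    return total / (len(c1) * len(c2))


def agglomerate(models):
    """Merge the most similar pair of clusters until one cluster remains.

    Returns the merge history as (cluster_a, cluster_b, similarity) tuples.
    """
    clusters = [(name,) for name in sorted(models)]
    merges = []
    while len(clusters) > 1:
        c1, c2 = max(
            combinations(clusters, 2),
            key=lambda pair: average_similarity(models, *pair),
        )
        merges.append((c1, c2, round(average_similarity(models, c1, c2), 3)))
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(tuple(sorted(c1 + c2)))
    return merges


# Hypothetical process models, each reduced to its set of activities.
models = {
    "P1": {"cut", "drill", "paint"},
    "P2": {"cut", "drill", "polish"},
    "P3": {"mold", "cool", "pack"},
    "P4": {"mold", "cool", "inspect", "pack"},
}
merges = agglomerate(models)
for step in merges:
    print(step)
```

The merge history plays the role of a dendrogram: cutting it at a chosen similarity level gives the coarser-level grouping of processes.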


2013 ◽  
Vol 380-384 ◽  
pp. 1488-1494
Author(s):  
Wang Wei ◽  
Jin Yue Peng

In the research and development of intelligent systems, clustering analysis is a very important problem. Using the new direct clustering algorithm presented in this paper, which takes a similarity measure of Vague sets as its evaluation criterion, the Vague direct clustering method is analyzed under different similarity measures of Vague sets. The experimental results show that the direct clustering method based on the similarity of Vague sets is effective, and that the clusterings obtained with different similarity measures are basically the same but differ in the clustering steps. Different algorithms should therefore be selected according to the conditions of the actual application.


2010 ◽  
Vol 439-440 ◽  
pp. 1306-1311
Author(s):  
Fang Li ◽  
Qun Xiong Zhu

The hierarchical agglomerative clustering algorithm based on latent semantic indexing (LSI) is studied. To address the problems of the LSI-based method, a hierarchical clustering method based on non-negative matrix factorization (NMF) is proposed and analyzed. Two ways of implementing the NMF-based method are introduced. Finally, the results of two groups of experiments on the TanCorp document corpus show that the proposed method is effective.


2016 ◽  
Vol 16 (6) ◽  
pp. 711-731 ◽  
Author(s):  
Yun-Lai Zhou ◽  
Nuno M.M. Maia ◽  
Rui P.C. Sampaio ◽  
Magd Abdel Wahab

Maintenance and repair of long-serving structures, such as pipelines and bridges, make structural damage detection indispensable, since unanticipated damage may give rise to a disaster and huge economic loss. A new approach for detecting structural damage using transmissibility together with hierarchical clustering and similarity analysis is proposed in this study. Transmissibility is derived from the structural dynamic responses and characterizes the structural state. First, for damage detection, hierarchical clustering analysis is adopted to discriminate the damaged scenarios from an unsupervised perspective, taking transmissibility as the feature for separating damaged patterns from undamaged ones. This differs from predicting structural damage directly from the manifestation of the indicators, which can be vague because of the small difference between damaged scenarios and the intact baseline. For comparison, a cosine similarity measure and a distance measure are also adopted to derive sensitive indicators, which are then used to distinguish damaged patterns from the intact baseline. Finally, for verification, simulations on a 10-floor structure and experimental tests on a free-free beam are carried out to check the suitability of the proposed approach. The results of both studies indicate good performance in detecting damage, which suggests potential application in real-life engineering.
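
A cosine-similarity indicator of the kind mentioned above might look like the sketch below. It assumes that transmissibility is taken as the ratio of response amplitudes at two measurement points, frequency line by frequency line, and that the indicator is one minus the cosine similarity to the intact baseline. Both the indicator definition and the response values are illustrative assumptions, not the authors' formulation or data.

```python
import math


def transmissibility(resp_i, resp_j):
    """Amplitude ratio of two response spectra, one value per frequency line."""
    return [abs(a) / abs(b) for a, b in zip(resp_i, resp_j)]


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def damage_indicator(baseline, measured):
    """Assumed indicator: 0 for an unchanged pattern, larger when it deviates."""
    return 1.0 - cosine_similarity(baseline, measured)


# Hypothetical response amplitudes at three frequency lines.
baseline = transmissibility([1.0, 2.0, 4.0], [2.0, 2.0, 2.0])
intact = transmissibility([2.0, 4.0, 8.0], [4.0, 4.0, 4.0])  # same shape, scaled load
damaged = transmissibility([1.0, 3.0, 2.0], [2.0, 2.0, 2.0])

indicator_intact = damage_indicator(baseline, intact)
indicator_damaged = damage_indicator(baseline, damaged)
print(indicator_intact, indicator_damaged)
```

Because transmissibility is a ratio, scaling both responses by the same load leaves the feature unchanged, so the intact case gives an indicator near zero while the altered pattern does not.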


2013 ◽  
Vol 3 (4) ◽  
pp. 1-14 ◽  
Author(s):  
S. Sampath ◽  
B. Ramya

Cluster analysis is a branch of data mining that plays a vital role in bringing out hidden information in databases. Clustering algorithms help medical researchers identify natural subgroups in a data set. Different types of clustering algorithms are available in the literature, the most popular being k-means clustering. Although k-means is widely used, its application requires knowledge of the number of clusters present in the given data set. Several solutions are available in the literature to overcome this limitation. The k-means clustering method creates a disjoint and exhaustive partition of the data set. However, in some situations one can come across objects that belong to more than one cluster. In this paper, a clustering algorithm is proposed that produces rough clusters automatically, without requiring the user to supply the number of clusters as input. The efficiency of the algorithm in detecting the number of clusters present in the data set has been studied with the help of some real-life data sets. Further, a nonparametric statistical analysis of the experimental results has been carried out to assess the efficiency of the proposed algorithm in automatically detecting the number of clusters, using a rough version of the Davies-Bouldin index.
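
For orientation, the sketch below computes the classical (crisp) Davies-Bouldin index, where lower values indicate more compact and better separated clusters. It is not the rough version used in the paper, whose definition the abstract does not give. The sample points and the two candidate partitions are hypothetical.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def davies_bouldin(clusters):
    """Classical Davies-Bouldin index for a partition with at least two clusters."""
    cents = [centroid(c) for c in clusters]
    # Scatter: average distance of each cluster's points to its centroid.
    scatter = [
        sum(math.dist(p, cents[i]) for p in c) / len(c)
        for i, c in enumerate(clusters)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Hypothetical data: two tight pairs of points that lie far apart.
good = [[(0, 0), (0, 2)], [(10, 0), (10, 2)]]
bad = [[(0, 0), (10, 0)], [(0, 2), (10, 2)]]

db_good = davies_bouldin(good)
db_bad = davies_bouldin(bad)
print(db_good, db_bad)
```

Comparing the index across candidate partitions, and keeping the one with the smallest value, is one common way to choose the number of clusters.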


2012 ◽  
Vol 588-589 ◽  
pp. 364-367
Author(s):  
Tao Wang ◽  
Heng Zhou ◽  
Pan Zou

A power network partitioning model based on a weighted local similarity measure is presented in this paper, taking into account the regional decoupling characteristics of reactive power. A weighted graph model of the reactive power network is established, and a new local similarity measure based on the weighted graph is defined. To partition the reactive power network with this similarity measure, a partitioning algorithm based on the generalized Ward hierarchical clustering method is proposed. The algorithm can ensure the balance of reactive power inside each partition. Applying the proposed algorithm to the IEEE 39-bus system shows that it is feasible and effective.


Author(s):  
Shahana Bano ◽  
K.Rajasekara Rao

In this paper we propose a method that avoids natural language processing tools such as POS taggers and parsers, which reduces the processing overhead. Moreover, we suggest a structure for quickly creating a large-scale corpus annotated with disease names, which can be used to train our probabilistic model. In the proposed work, a context-rank-based hierarchical clustering method is applied to different medical disease datasets, namely Colon, Leukemia, and MLL. An optimal rule filtering algorithm is applied to these datasets to remove unwanted special characters for gene/protein identification. Finally, experimental results show that the proposed method outperforms existing methods in terms of time and cluster space.

