Hierarchical kt jet clustering for parallel architectures

2017 ◽  
Vol 9 (2) ◽  
pp. 195-213
Author(s):  
Richárd Forster ◽  
Ágnes Fülöp

Abstract The reconstruction and analysis of measured data play an important role in high energy particle physics research, leading to new results in both experimental and theoretical physics. This work requires algorithmic improvements and substantial computing capacity. Clustering algorithms make it possible to determine the jet structure more accurately. A more fine-grained parallelization of the kt clustering algorithm was explored by combining it with the hierarchical clustering methods used in network evaluation. The kt method makes it possible to follow the development of particles produced in high-energy nucleus-nucleus collisions. Hierarchical clustering algorithms work on graphs, so the particle information used by the standard kt algorithm was first transformed into an appropriate graph representing the network of particles. Testing was done using data samples from the ALICE offline library, which contains the modules required to simulate the ALICE detector, a dedicated Pb-Pb detector. The proposed algorithm was compared with the FastJet toolkit's standard longitudinally invariant kt implementation. Parallelizing the standard non-optimized version of this algorithm on the available CPU architecture proved to be 1.6 times faster than the standard implementation, while the solution proposed in this paper achieved a 12 times faster computing performance and is scalable enough to run efficiently on GPUs.
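For orientation, the sequential baseline that the abstract refers to can be sketched as follows. This is a minimal, naive O(N^3) illustration of inclusive longitudinally invariant kt clustering, not the authors' parallel or graph-based method and not FastJet's implementation. It assumes massless input particles, E-scheme recombination (four-momenta are summed), and the standard distance measures d_ij = min(pt_i^2, pt_j^2) * dR_ij^2 / R^2 and d_iB = pt_i^2. The function names `from_pt_y_phi` and `kt_cluster` are hypothetical.

```python
import math

def from_pt_y_phi(pt, y, phi):
    """Massless four-momentum (px, py, pz, E) from pt, rapidity and azimuth."""
    return (pt * math.cos(phi), pt * math.sin(phi),
            pt * math.sinh(y), pt * math.cosh(y))

def kt_cluster(particles, R=1.0):
    """Naive inclusive kt clustering over (px, py, pz, E) tuples; returns the jets."""
    def pt2(p):
        return p[0] ** 2 + p[1] ** 2

    def rapidity(p):
        return 0.5 * math.log((p[3] + p[2]) / (p[3] - p[2]))

    def phi(p):
        return math.atan2(p[1], p[0])

    def d_ij(a, b):
        dphi = abs(phi(a) - phi(b))
        if dphi > math.pi:          # wrap the azimuthal difference into [0, pi]
            dphi = 2 * math.pi - dphi
        dr2 = (rapidity(a) - rapidity(b)) ** 2 + dphi ** 2
        return min(pt2(a), pt2(b)) * dr2 / R ** 2

    pseudojets = [tuple(p) for p in particles]
    jets = []
    while pseudojets:
        best_d, best_i, best_j = None, None, None
        for i, a in enumerate(pseudojets):
            d = pt2(a)              # beam distance d_iB
            if best_d is None or d < best_d:
                best_d, best_i, best_j = d, i, None
            for j in range(i + 1, len(pseudojets)):
                d = d_ij(a, pseudojets[j])
                if d < best_d:
                    best_d, best_i, best_j = d, i, j
        if best_j is None:
            # The beam distance is the smallest: promote the pseudojet to a jet.
            jets.append(pseudojets.pop(best_i))
        else:
            # A pair distance is the smallest: merge the pair (pop j first, j > i).
            b = pseudojets.pop(best_j)
            a = pseudojets.pop(best_i)
            pseudojets.append(tuple(x + y for x, y in zip(a, b)))
    return jets

event = [from_pt_y_phi(10, 0.0, 0.0), from_pt_y_phi(5, 0.1, 0.05),
         from_pt_y_phi(8, 0.0, 3.0), from_pt_y_phi(4, -0.1, 3.1)]
jets = kt_cluster(event, R=1.0)
print(len(jets))  # two back-to-back jets
```

The full rescan of all pair distances in every iteration is what makes this version slow; it is that search for the minimum distance that parallel implementations distribute across cores.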

2018 ◽  
Vol 10 (1) ◽  
pp. 86-109
Author(s):  
Richárd Forster ◽  
Ágnes Fülöp

Abstract Following up on our previous study on applying hierarchical clustering algorithms to high energy particle physics, this paper explores the possibility of using deep learning to generate models capable of performing the clustering themselves. The technique chosen for training is reinforcement learning, which allows the system to evolve based on interactions between the model and the underlying graph. The result is a model that, by learning on a modest dataset of 10,000 nodes over 70 epochs, can reach 83.77% precision for hierarchical and 86.33% for high energy jet physics datasets in predicting the appropriate clusters.


2017 ◽  
Vol 9 (1) ◽  
pp. 49-64
Author(s):  
Richárd Forster ◽  
Ágnes Fülöp

Abstract Numerical simulation makes it possible to study high energy particle physics. It plays an important role in the reconstruction and analysis of experimental and theoretical research, and it requires a computing background with large capacity. Jet physics is an intensively researched area, where the factorization process can be solved by algorithmic solutions. We studied the parallelization of the kt clustering algorithm. This method makes it possible to follow the development of particles produced in high-energy nucleus-nucleus collisions. The ALICE offline library contains the modules required to simulate the ALICE detector, a dedicated Pb-Pb detector. Using this simulation we can generate input particles, which can then be analyzed further by clustering them and reconstructing their jet structure. The FastJet toolkit is an efficient C++ implementation of the most widely used jet clustering algorithms, among them kt clustering. By parallelizing the standard non-optimized version of this algorithm on the available CPU architecture, a 1.6 times faster runtime was achieved, paving the way to a drastic performance increase on many-core architectures.


2019 ◽  
Vol 488 (1) ◽  
pp. 1377-1386 ◽  
Author(s):  
V Carruba ◽  
S Aljbaae ◽  
A Lucchini

ABSTRACT Asteroid families are groups of asteroids that share a common origin. They can be the outcome of a collision or the result of the rotational failure of a parent body or its satellites. Collisional asteroid families have been identified for several decades using hierarchical clustering methods (HCMs) in proper element domains. In this method, the distance of an asteroid from a reference body is computed, and, if it is less than a critical value, the asteroid is added to the family list. The process is then repeated with the new object as a reference, until no new family members are found. Recently, new machine-learning clustering algorithms have been introduced for the purpose of cluster classification. Here, we apply supervised-learning hierarchical clustering algorithms to asteroid family identification. The accuracy, precision, and recall values of the results obtained with the new method, when compared with classical HCM, show that this approach is able to find family members with an accuracy above 89.5 per cent, and that all asteroids previously identified as family members by traditional methods are consistently retrieved. The values of the area under the curve coefficients for Receiver Operating Characteristic curves are also optimal, consistently above 85 per cent. Overall, we identify 6 new families and 13 new clumps, in regions where the method can be applied, that appear to be consistent and homogeneous in terms of physical and taxonomic properties. Machine-learning clustering algorithms can, therefore, be very efficient and fast tools for the problem of asteroid family identification.
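The classical HCM procedure described above (add every body closer than a critical value to the current reference, then repeat with each new member as reference) can be sketched as a breadth-first chaining search. This is only an illustration under stated assumptions: the function name `hcm_family` is hypothetical, and plain Euclidean distance stands in for the proper-element distance metric, which the abstract does not specify.

```python
from collections import deque
import math

def hcm_family(elements, seed, cutoff, distance=math.dist):
    """Grow a family from `seed` by chaining bodies closer than `cutoff`.

    elements: list of coordinate tuples (placeholder for proper elements).
    distance: placeholder metric; Euclidean distance is an assumption here.
    Returns the sorted indices of the family members.
    """
    family = {seed}
    queue = deque([seed])
    while queue:
        ref = queue.popleft()        # current reference body
        for idx, body in enumerate(elements):
            if idx not in family and distance(elements[ref], body) < cutoff:
                family.add(idx)      # new member found
                queue.append(idx)    # it later serves as a reference itself
    return sorted(family)

bodies = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(hcm_family(bodies, seed=0, cutoff=0.6))  # → [0, 1, 2]
```

Body 2 lies farther than the cutoff from the seed but is still reached through body 1, which is the chaining behaviour the method relies on.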


2010 ◽  
Vol 439-440 ◽  
pp. 1306-1311
Author(s):  
Fang Li ◽  
Qun Xiong Zhu

The LSI-based hierarchical agglomerative clustering algorithm is studied. To address the problems of the LSI-based hierarchical agglomerative clustering method, an NMF-based hierarchical clustering method is proposed and analyzed. Two ways of implementing the NMF-based method are introduced. Finally, the results of two groups of experiments on the TanCorp document corpus show that the proposed method is effective.


2019 ◽  
Vol 2019 ◽  
pp. 1-10
Author(s):  
Yaohui Liu ◽  
Dong Liu ◽  
Fang Yu ◽  
Zhengming Ma

Clustering is widely used in data analysis, and density-based methods have developed rapidly over the past 10 years. Although state-of-the-art density peak clustering algorithms are efficient and can detect clusters of arbitrary shape, they are essentially centroid-based methods of a nonspherical type. In this paper, a novel local density hierarchical clustering algorithm based on reverse nearest neighbors, RNN-LDH, is proposed. By constructing and using a reverse nearest neighbor graph, the extended core regions are found and used as initial clusters. Then, a new local density metric is defined to calculate the density of each object; meanwhile, the density hierarchical relationships among the objects are built according to their densities and neighbor relations. Finally, each unclustered object is assigned to one of the initial clusters or classified as noise. Results of experiments on synthetic and real data sets show that RNN-LDH outperforms current clustering methods based on density peaks or reverse nearest neighbors.
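As a small illustration of the building block named above, the sketch below computes reverse k-nearest-neighbor lists by brute force: for each point, it collects the points that count it among their own k nearest neighbors. It does not reproduce RNN-LDH itself, whose core-region and density definitions are not given in the abstract; the function name `reverse_knn` and the use of Euclidean distance are assumptions.

```python
import math

def reverse_knn(points, k):
    """For each point, list the indices of points that have it among their k nearest neighbors."""
    n = len(points)
    reverse = [[] for _ in range(n)]
    for i in range(n):
        # Indices of all other points, ordered by distance from point i.
        others = sorted((j for j in range(n) if j != i),
                        key=lambda j: math.dist(points[i], points[j]))
        for j in others[:k]:
            reverse[j].append(i)     # i points to j, so i is a reverse neighbor of j
    return reverse

pts = [(0.0, 0.0), (1.0, 0.0), (2.5, 0.0), (10.0, 0.0)]
print(reverse_knn(pts, k=1))  # → [[1], [0, 2], [3], []]
```

The isolated point at x = 10 has an empty reverse-neighbor list, which hints at why reverse neighbor counts are useful for separating dense regions from outliers.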


2021 ◽  
Vol 20 ◽  
pp. 177-184
Author(s):  
Ozer Ozdemir ◽  
Simgenur Cerman

In data mining, clustering is one of the most commonly used techniques. Clustering can be performed by different algorithms, such as hierarchical, partitioning, grid-based, density-based and graph-based algorithms. This study first explains the concept of data mining, then describes the aims and application areas of data mining, and then explains clustering and the clustering algorithms used in data mining theoretically. Within the scope of this study, partitional and hierarchical clustering algorithms were applied to the "Mall Customers" data set taken from the Kaggle database, with the aim of separating customers into clusters according to their features. In the clusters obtained by the partitional clustering algorithms, the similarity within each cluster is maximal and the similarity between clusters is minimal. The hierarchical clustering algorithms are based on gathering similar features together, or vice versa. The partitional clustering algorithms used were k-means and PAM; the hierarchical clustering algorithms used were AGNES and DIANA. The R statistical programming language was used to apply the algorithms. At the end of the study, the clustering algorithms were run on the data set and the results of the analysis were interpreted.


Author(s):  
Marek Sobolewski ◽  
Małgorzata Markowska

During the conference Spatial Econometrics and Regional Economic Analyses, which took place in Lodz in 2014, it was proposed to introduce a spatial coherence property into the Ward method, which is applied to group administrative units [1]. At each stage of agglomeration in the modified Ward method, only aggregates that are adjacent to each other are merged. This work extends the concept to other methods of hierarchical clustering, in particular the single and complete linkage methods. The study highlights the benefits of clustering methods with the coherence property, while also emphasizing the limitations of these procedures. First of all, introducing the restricting condition into the hierarchical clustering procedure reduces the homogeneity of the isolated clusters. Spatial constraints may also lead to a situation where the distance between clusters linked at a later stage is smaller than at an earlier stage (graphically, a dendrogram “backflow”). The complete linkage method, where the distance between clusters is defined as the maximum distance between their elements, is free of this aberration. The modified clustering algorithm was implemented as an extension of the STATISTICA software. Example applications of the hierarchical clustering method with the coherence property concern sector changes in the European regional space. The aim of the analysis was to isolate spatially coherent areas that demonstrate a similar direction and intensity of structural change in selected areas of the labour market.
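The adjacency restriction and the dendrogram backflow described above can be illustrated with a small sketch. It is not the authors' STATISTICA extension: the function name `constrained_agglomeration`, the one-dimensional feature values and the adjacency encoding are all assumptions made for the example. Passing `link=min` gives single linkage and `link=max` gives complete linkage.

```python
def constrained_agglomeration(values, adjacency, n_clusters=1, link=min):
    """Agglomerate units described by 1-D `values`, merging only adjacent clusters.

    adjacency: set of frozenset({i, j}) pairs of neighboring units.
    link: min for single linkage, max for complete linkage.
    Returns (clusters, merge_distances).
    """
    clusters = [{i} for i in range(len(values))]
    merge_distances = []

    def adjacent(a, b):
        return any(frozenset((i, j)) in adjacency for i in a for j in b)

    def linkage(a, b):
        return link(abs(values[i] - values[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        candidates = [(linkage(a, b), ia, ib)
                      for ia, a in enumerate(clusters)
                      for ib, b in enumerate(clusters)
                      if ia < ib and adjacent(a, b)]
        if not candidates:
            break                    # no contiguous clusters left to merge
        d, ia, ib = min(candidates)
        merge_distances.append(d)
        clusters[ia] |= clusters[ib]
        del clusters[ib]
    return clusters, merge_distances

# Three units in a row (0-1-2); units 0 and 2 are similar but not neighbors.
values = [0, 10, 1]
borders = {frozenset((0, 1)), frozenset((1, 2))}
print(constrained_agglomeration(values, borders, link=min)[1])  # → [9, 1]
print(constrained_agglomeration(values, borders, link=max)[1])  # → [9, 10]
```

With single linkage the second merge happens at a smaller distance than the first (9, then 1), which is the backflow; with complete linkage the merge distances stay non-decreasing in this example (9, then 10).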


Author(s):  
E.D. Wolf

Most microelectronics devices and circuits operate faster, consume less power, execute more functions and cost less per circuit function when the feature-sizes internal to the devices and circuits are made smaller. This is part of the stimulus for the Very High-Speed Integrated Circuits (VHSIC) program. There is also a need for smaller, more sensitive sensors in a wide range of disciplines that includes electrochemistry, neurophysiology and ultra-high pressure solid state research. There is often fundamental new science (and sometimes new technology) to be revealed (and used) when a basic parameter such as size is extended to new dimensions, as is evident at the two extremes of smallness and largeness, high energy particle physics and cosmology, respectively. However, there is also a very important intermediate domain of size that spans from the diameter of a small cluster of atoms up to near one micrometer which may also have just as profound effects on society as “big” physics.


Author(s):  
Mohana Priya K ◽  
Pooja Ragavi S ◽  
Krishna Priya G

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. For tagging, a POS tagger and the Porter stemmer are used. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, with a mean threshold. This paper incorporates many parameters for finding the similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and a comparison is made. SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows a considerable improvement, reaching 91.2%.

