Evaluation of Clustering Algorithms on HPC Platforms

Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2156
Author(s):  
Juan M. Cebrian ◽  
Baldomero Imbernón ◽  
Jesús Soto ◽  
José M. Cecilia

Clustering algorithms are among the most widely used kernels for generating knowledge from large datasets. These algorithms group a set of data elements (i.e., images, points, patterns, etc.) into clusters to identify patterns or common features of a sample. However, they are computationally expensive, as they often involve fitness functions that must be evaluated for every point in the dataset. This cost is even higher for fuzzy methods, where each data point may belong to more than one cluster. In this paper, we evaluate different parallelisation strategies on heterogeneous platforms for state-of-the-art fuzzy clustering algorithms such as Fuzzy C-means (FCM), Gustafson–Kessel FCM (GK-FCM) and Fuzzy Minimals (FM). The experimental evaluation includes performance and energy trade-offs. Our results show that, depending on each algorithm's computational pattern, its mathematical foundation and the amount of data to be processed, each algorithm performs best on a different platform.
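To make the cost pattern concrete, here is a minimal NumPy sketch of the textbook Fuzzy C-means loop (our illustration, not the authors' parallel implementations); every iteration scores every point against every centroid, which is the workload the paper maps onto heterogeneous platforms:

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook FCM: alternate centroid and membership updates until stable."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)             # each point's memberships sum to 1
    for _ in range(n_iter):
        Um = U ** m                               # fuzzified memberships
        C = (Um.T @ X) / Um.sum(axis=0)[:, None]  # fuzzy-weighted centroids
        # Distance of every point to every centroid: the all-points cost
        # the abstract refers to, repeated at every iteration.
        d = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)                     # guard against zero distances
        # u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))
        U_new = 1.0 / ((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))).sum(axis=2)
        if np.abs(U_new - U).max() < tol:
            break
        U = U_new
    return C, U_new
```

The membership update is the step where each point receives a weight for every cluster, which is precisely what distinguishes fuzzy methods from hard clustering and drives the extra cost noted above.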

2020 ◽  
Vol 34 (04) ◽  
pp. 5700-5708 ◽  
Author(s):  
Jianghao Shen ◽  
Yue Wang ◽  
Pengfei Xu ◽  
Yonggan Fu ◽  
Zhangyang Wang ◽  
...  

While increasingly deep networks are generally desired for achieving state-of-the-art performance, for many specific inputs a simpler network may already suffice. Existing works exploit this observation by learning to skip convolutional layers in an input-dependent manner. However, we argue that their binary decision scheme, i.e., either fully executing or completely bypassing one layer for a specific input, can be enhanced by introducing finer-grained, “softer” decisions. We therefore propose a Dynamic Fractional Skipping (DFS) framework. The core idea of DFS is to hypothesize layer-wise quantization (to different bitwidths) as intermediate “soft” choices between fully utilizing and skipping a layer. For each input, DFS dynamically assigns a bitwidth to both the weights and activations of each layer, where full execution and skipping can be viewed as the two “extremes” (i.e., full bitwidth and zero bitwidth). In this way, DFS can “fractionally” exploit a layer's expressive power during input-adaptive inference, enabling finer-grained accuracy-computational cost trade-offs. It presents a unified view linking input-adaptive layer skipping and input-adaptive hybrid quantization. Extensive experimental results demonstrate the superior trade-off between computational cost and model expressive power (accuracy) achieved by DFS. Visualizations further indicate a smooth and consistent transition in DFS behaviors, especially in the learned choices between layer skipping and different quantizations as the total computational budget varies, validating our hypothesis that layer quantization can be viewed as an intermediate variant of layer skipping. Our source code and supplementary material are available at https://github.com/Torment123/DFS.
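As a reading aid, the fractional-skipping idea can be sketched in PyTorch as follows; the gate design, bitwidth choices, and fake-quantizer here are our assumptions, not the released DFS code (see the repository linked above):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quantize(x, bits):
    """Uniform symmetric fake-quantization; a stand-in for the paper's scheme."""
    if bits >= 32:
        return x
    scale = x.abs().max().clamp(min=1e-8)
    levels = 2 ** (bits - 1) - 1
    return torch.round(x / scale * levels) / levels * scale

class FractionalSkipBlock(nn.Module):
    """Hypothetical residual block whose gate picks a bitwidth per input:
    0 = skip entirely, 32 = full execution, anything between = "fractional"."""
    def __init__(self, channels, choices=(0, 4, 8, 32)):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.choices = choices
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(channels, len(choices)))

    def forward(self, x):
        # Hard per-input choice; one decision for the whole batch for simplicity.
        # (Training such discrete gates typically needs Gumbel-softmax or RL.)
        bits = self.choices[self.gate(x).argmax(dim=1)[0].item()]
        if bits == 0:
            return x                              # zero bitwidth: bypass the layer
        y = F.conv2d(fake_quantize(x, bits),
                     fake_quantize(self.conv.weight, bits),
                     self.conv.bias, padding=1)
        return x + y                              # residual connection
```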


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 511
Author(s):  
Syed Mohammad Minhaz Hossain ◽  
Kaushik Deb ◽  
Pranab Kumar Dhar ◽  
Takeshi Koshiba

Proper plant leaf disease (PLD) detection is challenging in complex backgrounds and under different capture conditions. For this reason, modified adaptive centroid-based segmentation (ACS) is first used to trace the proper region of interest (ROI). Automatic initialization of the number of clusters (K) using modified ACS before recognition increases the scalability of ROI tracing, even for symmetrical features across various plants. Convolutional neural network (CNN)-based PLD recognition models achieve adequate accuracy to some extent; however, their memory requirements (large-scale parameters) and high computational cost are pressing issues for memory-restricted mobile and IoT devices. Therefore, after tracing ROIs, three proposed depth-wise separable convolutional PLD (DSCPLD) models, namely segmented modified DSCPLD (S-modified MobileNet), segmented reduced DSCPLD (S-reduced MobileNet), and segmented extended DSCPLD (S-extended MobileNet), are used to strike a constructive trade-off among accuracy, model size, and computational latency. Moreover, we compare our proposed DSCPLD recognition models with state-of-the-art models such as MobileNet, VGG16, VGG19, and AlexNet. Among the segmented DSCPLD models, S-modified MobileNet achieves the best accuracy of 99.55% and F1-score of 97.07%. We also evaluate our DSCPLD models on both full and segmented plant leaf images and conclude that, after applying modified ACS, all models improve in accuracy and F1-score. Furthermore, a new plant leaf dataset containing 6580 images of eight plants was used to experiment with several depth-wise separable convolution models.
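The model-size savings in the DSCPLD variants stem from the standard depthwise separable factorization popularized by MobileNet; a minimal PyTorch sketch of such a block (ours, not the paper's exact architecture):

```python
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """MobileNet-style block: per-channel 3x3 conv, then 1x1 channel mixing."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # groups=in_ch gives each input channel its own 3x3 spatial filter.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride,
                                   padding=1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(in_ch)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU6(inplace=True)

    def forward(self, x):
        x = self.act(self.bn1(self.depthwise(x)))
        return self.act(self.bn2(self.pointwise(x)))
```

For a 3x3 convolution with 256 input and 256 output channels, the standard form needs about 9·256·256 ≈ 590K weights, while the factorized form needs 9·256 + 256·256 ≈ 68K, roughly a 9x reduction, which is why such blocks suit memory-restricted devices.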


2021 ◽  
Vol 14 (5) ◽  
pp. 785-798
Author(s):  
Daokun Hu ◽  
Zhiwen Chen ◽  
Jianbing Wu ◽  
Jianhua Sun ◽  
Hao Chen

Persistent memory (PM) is increasingly being leveraged to build hash-based indexing structures featuring cheap persistence, high performance, and instant recovery, especially with the recent release of Intel Optane DC Persistent Memory Modules. However, most of these structures are evaluated on DRAM-based emulators under unrealistic assumptions, or focus on specific metrics while sidestepping important properties. It is therefore essential to understand how well the proposed hash indexes perform on real PM and how they differ from each other when a wider range of performance metrics is considered. To this end, this paper provides a comprehensive evaluation of persistent hash tables. In particular, we focus on the evaluation of six state-of-the-art hash tables, including Level hashing, CCEH, Dash, PCLHT, Clevel, and SOFT, on real PM hardware. Our evaluation was conducted using a unified benchmarking framework and representative workloads. Besides characterizing common performance properties, we also explore how hardware configurations (such as PM bandwidth, CPU instructions, and NUMA) affect the performance of PM-based hash tables. With our in-depth analysis, we identify design trade-offs and sound paradigms in prior art, and suggest desirable optimizations and directions for the future development of PM-based hash tables.
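The paper's benchmarking framework is not reproduced here, but the shape of such a unified driver can be sketched as follows; this toy Python version runs against any dict-like index and, unlike the real experiments, does not model PM-specific costs:

```python
import random
import time

def run_workload(table, n_ops=1_000_000, insert_ratio=0.5,
                 key_space=10_000_000, seed=0):
    """YCSB-style mixed driver against any dict-like index.

    Note: a DRAM dict ignores what the paper actually measures on Optane,
    e.g. cache-line flushes (clwb), fences, and NUMA placement.
    """
    rng = random.Random(seed)
    start = time.perf_counter()
    for _ in range(n_ops):
        key = rng.randrange(key_space)
        if rng.random() < insert_ratio:
            table[key] = key        # insert/update path
        else:
            table.get(key)          # search path (may miss)
    return n_ops / (time.perf_counter() - start)   # throughput in ops/sec

if __name__ == "__main__":
    for ratio in (0.0, 0.5, 1.0):   # search-only, mixed, insert-only
        print(f"insert_ratio={ratio}: {run_workload({}, insert_ratio=ratio):,.0f} ops/s")
```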


2018 ◽  
Vol 11 (10) ◽  
pp. 4155-4174 ◽  
Author(s):  
Benjamin Brown-Steiner ◽  
Noelle E. Selin ◽  
Ronald Prinn ◽  
Simone Tilmes ◽  
Louisa Emmons ◽  
...  

Abstract. While state-of-the-art complex chemical mechanisms expand our understanding of atmospheric chemistry, their sheer size and computational requirements often limit simulations to short lengths or ensembles to only a few members. Here we present and compare three 25-year present-day offline simulations with chemical mechanisms of different levels of complexity using the Community Earth System Model (CESM) Version 1.2 CAM-chem (CAM4): the Model for Ozone and Related Chemical Tracers, version 4 (MOZART-4) mechanism, the Reduced Hydrocarbon mechanism, and the Super-Fast mechanism. We show that, for most regions and time periods, differences in simulated ozone chemistry between these three mechanisms are smaller than the model–observation differences themselves. The MOZART-4 and Reduced Hydrocarbon mechanisms are in close agreement in their representation of ozone throughout the troposphere during all time periods (annual, seasonal, and diurnal). While the Super-Fast mechanism tends to have higher simulated ozone variability and differs from the MOZART-4 mechanism over regions of high biogenic emissions, it is surprisingly capable of simulating ozone adequately given its simplicity. We explore the trade-offs between chemical mechanism complexity and computational cost by identifying regions where the simpler mechanisms are comparable to the MOZART-4 mechanism and regions where they are not. The Super-Fast mechanism is 3 times as fast as the MOZART-4 mechanism, which allows for longer simulations or ensembles with more members that may not be feasible with the MOZART-4 mechanism given limited computational resources.
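The practical consequence of the 3x speedup can be seen with back-of-envelope arithmetic; the per-run cost and budget below are hypothetical placeholders, and only the speedup ratio comes from the paper:

```python
# Back-of-envelope ensemble sizing under a fixed compute budget.
mozart4_cost = 90_000                 # core-hours per 25-year run (assumed)
superfast_cost = mozart4_cost / 3     # Super-Fast is ~3x faster (from the paper)

budget = 270_000                      # fixed allocation in core-hours (assumed)
print(int(budget // mozart4_cost))    # -> 3 MOZART-4 ensemble members
print(int(budget // superfast_cost))  # -> 9 Super-Fast ensemble members
```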


2021 ◽  
Author(s):  
Shikha Suman ◽  
Ashutosh Karna ◽  
Karina Gibert

Hierarchical clustering is one of the preferred choices for understanding the underlying structure of a dataset and defining typologies, with multiple applications in real life. Among existing clustering algorithms, the hierarchical family is one of the most popular, as it reveals the inner structure of the dataset and yields the number of clusters as an output, unlike popular methods such as k-means, and the granularity of the final clustering can be adjusted to the goals of the analysis. The number of clusters in a hierarchical method relies on the analysis of the resulting dendrogram. Experts have criteria to visually inspect the dendrogram and determine the number of clusters, but finding automatic criteria that imitate experts in this task is still an open problem, and dependence on the expert to cut the tree is a limitation in real applications such as Industry 4.0 and additive manufacturing. This paper analyses several cluster validity indexes in the context of determining the suitable number of clusters in hierarchical clustering. A new Cluster Validity Index (CVI) is proposed that properly captures the implicit criteria used by experts when analyzing dendrograms. The proposal has been applied to a range of datasets and validated against expert ground truth, outperforming the state of the art while significantly reducing the computational cost.
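As an illustration of the task a CVI automates, the following sketch scores candidate dendrogram cuts with an off-the-shelf index; the silhouette score stands in for the paper's proposed CVI, which is not reproduced here:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.metrics import silhouette_score

def best_k_by_cvi(X, k_range=range(2, 11), method="ward"):
    """Score each candidate cut of the dendrogram and keep the best one."""
    Z = linkage(X, method=method)                        # build the tree once
    scores = {k: silhouette_score(X, fcluster(Z, t=k, criterion="maxclust"))
              for k in k_range}
    return max(scores, key=scores.get), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three well-separated blobs; the selected cut should yield k = 3.
    X = np.vstack([rng.normal(c, 0.3, size=(50, 2)) for c in (0, 4, 8)])
    print("selected k =", best_k_by_cvi(X)[0])
```

Replacing the scoring function with an index that mimics how experts read dendrogram merge heights is, in essence, what the proposed CVI aims for.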


Author(s):  
Yan Bai ◽  
Yihang Lou ◽  
Yongxing Dai ◽  
Jun Liu ◽  
Ziqian Chen ◽  
...  

Vehicle Re-Identification (ReID) has attracted considerable research effort due to its great significance to public security. In vehicle ReID, we aim to learn features that are powerful in discriminating the subtle differences between visually similar vehicles and also robust to different orientations of the same vehicle. However, these two characteristics are hard to encapsulate in a single feature representation with unified supervision. Here we propose a Disentangled Feature Learning Network (DFLNet) to learn orientation-specific and common features concurrently, which are discriminative at the level of details and invariant to orientation, respectively. Moreover, to effectively use these two types of features for ReID, we further design a feature metric alignment scheme to ensure the consistency of the metric scales. Experiments show the effectiveness of our method, which achieves state-of-the-art performance on three challenging datasets.
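The abstract gives no architectural details, but the two-branch idea can be caricatured as follows; everything in this sketch (dimensions, the orientation classifier, L2 normalization as a stand-in for metric alignment) is our assumption:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchReIDHead(nn.Module):
    """Caricature of the disentangling idea: one branch learns orientation-
    specific detail cues, the other orientation-invariant (common) cues."""
    def __init__(self, backbone_dim=2048, embed_dim=256, n_orientations=8):
        super().__init__()
        self.specific = nn.Linear(backbone_dim, embed_dim)
        self.common = nn.Linear(backbone_dim, embed_dim)
        # Auxiliary head supervising the specific branch with orientation labels.
        self.orient_cls = nn.Linear(embed_dim, n_orientations)

    def forward(self, feats):                  # feats: (batch, backbone_dim)
        f_spec = F.normalize(self.specific(feats), dim=1)
        f_comm = F.normalize(self.common(feats), dim=1)
        # Unit-norm embeddings keep the two distance scales comparable,
        # a crude stand-in for the paper's metric alignment scheme.
        return torch.cat([f_spec, f_comm], dim=1), self.orient_cls(f_spec)
```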

