The Complete Gradient Clustering Algorithm: properties in practical applications

2012 ◽  
Vol 39 (6) ◽  
pp. 1211-1224 ◽  
Author(s):  
Piotr Kulczycki ◽  
Malgorzata Charytanowicz ◽  
Piotr A. Kowalski ◽  
Szymon Lukasik
2019 ◽  
Vol 5 (11) ◽  
pp. 85 ◽  
Author(s):  
Ayan Chatterjee ◽  
Peter W. T. Yuen

This paper proposes a simple yet effective method for improving the efficiency of sparse coding dictionary learning (DL), with the implication of enhancing the usefulness of compressive sensing (CS) technology for practical applications such as hyperspectral imaging (HSI) scene reconstruction. CS is a technique that allows sparse signals to be decomposed into a sparse representation “a” over a dictionary D_u. The quality of the learnt dictionary directly affects the quality of the end results, e.g., the HSI scene reconstructions. This paper proposes constructing a concise and comprehensive dictionary from the cluster centres of the input dataset, and then adopting a greedy approach to learn all elements within this dictionary. The proposed method couples an unsupervised clustering algorithm (K-Means) with an advanced sparse coding dictionary (SCD) method, such as the basis pursuit algorithm (orthogonal matching pursuit, OMP), for the dictionary learning. The effectiveness of the proposed K-Means Sparse Coding Dictionary (KMSCD) is illustrated through the reconstruction of several publicly available HSI scenes. The results show that the proposed KMSCD achieves ~40% greater accuracy, converges 5 times faster and is twice as robust as the classic Sparse Coding Dictionary (C-SCD) method, which adopts random sampling of data for the dictionary learning. Over the five data sets employed in this study, the proposed KMSCD reconstructs the scenes with mean accuracies approximately 20–500% better than all competing algorithms adopted in this work. Furthermore, the reconstruction efficiency of trace materials in the scene has been assessed: the KMSCD recovers them ~12% better than the C-SCD.
These results suggest that the proposed DL, which uses a simple clustering method to construct the dictionary, enhances the scene reconstruction substantially. When the proposed KMSCD is combined with the fast non-negative orthogonal matching pursuit (FNNOMP) to constrain the maximum number of materials coexisting in a pixel to four, experiments show that it performs approximately ten times better than when the constraint is imposed using the widely employed TMM algorithm. This may suggest that the proposed DL method, using KMSCD together with FNNOMP, is more suitable as the material allocation module of HSI scene simulators like the CameoSim package.
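The idea of seeding a dictionary with cluster centres and then coding signals greedily can be sketched in a few lines. This is only an illustration under stated assumptions: the toy three-band "spectra", the function names and the deterministic seeding are hypothetical, and the greedy step is plain matching pursuit, a simpler relative of the OMP and FNNOMP algorithms named in the abstract, not the authors' implementation.

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means; the first k points seed the centres (an assumption)."""
    centres = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centres[j]))
            groups[nearest].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old centre if a cluster ends up empty
                centres[j] = [sum(col) / len(g) for col in zip(*g)]
    return centres

def normalise(v):
    """Scale a non-zero vector to unit length so it can serve as a dictionary atom."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def matching_pursuit(signal, atoms, max_iters=4, tol=1e-9):
    """Greedy sparse coding: repeatedly pick the atom most correlated with the residual."""
    residual = list(signal)
    coeffs = {}
    for _ in range(max_iters):
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        if abs(scores[best]) < tol:
            break  # nothing left to explain
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual

# Hypothetical toy pixels from two "materials".
pixels = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.9, 0.0, 0.1], [0.1, 0.9, 0.0]]
atoms = [normalise(c) for c in kmeans(pixels, k=2)]

# A pixel made purely of the first material should need a single atom.
signal = [2.0 * x for x in atoms[0]]
coeffs, residual = matching_pursuit(signal, atoms)
```

Here the dictionary has one atom per cluster centre, so a signal that matches a centre is represented by a single coefficient; a randomly sampled dictionary offers no such guarantee.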


Algorithms ◽  
2020 ◽  
Vol 13 (7) ◽  
pp. 158
Author(s):  
Tran Dinh Khang ◽  
Nguyen Duc Vuong ◽  
Manh-Kien Tran ◽  
Michael Fowler

Clustering is an unsupervised machine learning technique with many practical applications that has gathered extensive research interest. Alongside deterministic and probabilistic techniques, fuzzy C-means clustering (FCM) is a common clustering technique. Since the advent of the FCM method, many improvements have been made to increase clustering efficiency. These improvements focus on adjusting the membership representation of elements in the clusters, on fuzzifying and defuzzifying techniques, or on the distance function between elements. This study proposes a novel fuzzy clustering algorithm that uses multiple different fuzzification coefficients depending on the characteristics of each data sample. The proposed fuzzy clustering method has calculation steps similar to those of FCM, with some modifications. The formulas are derived to ensure convergence. The main contribution of this approach is the use of multiple fuzzification coefficients, as opposed to the single coefficient of the original FCM algorithm. The new algorithm is then evaluated with experiments on several common datasets, and the results show that the proposed algorithm is more efficient than the original FCM as well as other clustering methods.
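To make the role of the fuzzification coefficient concrete, the sketch below computes the standard FCM membership of one point for two different coefficient values. It shows the textbook single-coefficient update only; calling it with a different coefficient per sample merely hints at the idea in the abstract and is not the formula derived in the paper. The function name and the sample data are hypothetical.

```python
import math

def fcm_memberships(point, centres, m):
    """Standard FCM membership of one point in each cluster, for a fuzzifier m > 1."""
    dists = [math.dist(point, c) for c in centres]
    if 0.0 in dists:  # the point coincides with a centre: full membership there
        hit = dists.index(0.0)
        return [1.0 if j == hit else 0.0 for j in range(len(centres))]
    power = 2.0 / (m - 1.0)
    # u_j = 1 / sum_k (d_j / d_k)^(2 / (m - 1))
    return [1.0 / sum((dj / dk) ** power for dk in dists) for dj in dists]

centres = [[0.0, 0.0], [4.0, 0.0]]
point = [1.0, 0.0]  # distances 1 and 3 to the two centres

crisp = fcm_memberships(point, centres, m=1.5)  # small m: close to a hard assignment
soft = fcm_memberships(point, centres, m=3.0)   # large m: memberships spread out
```

A coefficient close to 1 pushes the memberships towards a hard assignment, while a larger value spreads them out, which is why letting the coefficient vary per sample changes how firmly each sample is tied to its cluster.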


Algorithms ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 258
Author(s):  
Tran Dinh Khang ◽  
Manh-Kien Tran ◽  
Michael Fowler

Clustering is an unsupervised machine learning method with many practical applications that has gathered extensive research interest. It is a technique for dividing data elements into clusters such that elements in the same cluster are similar. Clustering belongs to the group of unsupervised machine learning techniques, meaning that there is no information about the labels of the elements. However, when some information about the data points is known in advance, it is beneficial to use a semi-supervised algorithm. Among the many clustering techniques available, fuzzy C-means clustering (FCM) is a common one. To make the FCM algorithm a semi-supervised method, the literature has proposed using an auxiliary matrix to adjust the membership grades of the elements, forcing them into certain clusters during the computation. In this study, instead of using the auxiliary matrix, we propose using multiple fuzzification coefficients to implement the semi-supervision component. After deriving the proposed semi-supervised fuzzy C-means clustering algorithm with multiple fuzzification coefficients (sSMC-FCM), we demonstrate the convergence of the algorithm and validate the efficiency of the method through a numerical example.


Author(s):  
Behzad Bahrami ◽  
Masoud Shafiee

In recent years, singular systems and fuzzy descriptor models have attracted a lot of researchers' attention due to their wide practical applications for modeling complex phenomena. In this study, one proposed approach is a fuzzy clustering algorithm based on linear structures to identify neuro-fuzzy local linear models. Additionally, fuzzy descriptor models, a recently proposed neuro-fuzzy interpretation of locally linear models, are implemented because of their promise for intuitive incremental learning algorithms, e.g. Generalized Fuzzy Clustering Variety (GFCV). The results from the fuzzy descriptor models are compared to the results of several other methods. An efficient technique, based on the error indices of multiple validation sets, is used to optimize the number of neurons and prevent the algorithm from overfitting. The scope of this work is to reveal the advantages of fuzzy descriptor models and compare them to the most successful neural and neuro-fuzzy approaches based on prediction accuracy, generalization, and computational complexity. The proposed solution is shown to accurately forecast seismic time series, outperforming several other methods.


Author(s):  
Yang Xindi ◽  
Du Huanran

The intelligent scheduling algorithm for hierarchical data migration is a key issue in data management. On mass media content platforms, the discovery of content object usage patterns is the basis for scheduling data migration. We add QPop, the dimensionality reduction result of media content usage logs, as content objects for discovering usage patterns. On this basis, a clustering algorithm QPop is proposed to increase the time segmentation, thereby improving the mining performance. We adopted the standard C-means algorithm as the clustering core and used segmentation to conduct an experimental mining process to collect the ted QPop increments in practical applications. The results show that the improved algorithm has good robustness in cluster cohesion and other indicators, slightly better than the basic model.


Author(s):  
Praphula Jain ◽  
Mani Shankar Bajpai ◽  
Rajendra Pamula

Anomaly detection concerns identifying anomalous observations or patterns that deviate from the dataset's expected behaviour. The detection of anomalies has significant practical applications in several domains such as public health, finance, Information Technology (IT), security, medicine, energy, and climate studies. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a density-based clustering algorithm capable of identifying anomalous data. In this paper, a modified DBSCAN algorithm is proposed for anomaly detection in time-series data with seasonality. For experimental evaluation, a monthly temperature dataset was employed, and the analysis set forth the advantages of the modified DBSCAN over the standard DBSCAN algorithm for seasonal datasets. From the result analysis, we may conclude that DBSCAN finds anomalies in a dataset but fails to find local anomalies in seasonal data. The proposed modified DBSCAN approach helps to find both the global and local anomalies in the seasonal data. Using standard DBSCAN we obtain 19 (2.16%) anomaly points, while using the modified approach we obtain 42 (4.79%) anomaly points. In comparison, we can say that we obtain 2.11% more anomalies using the modified DBSCAN approach. Hence, the proposed modified DBSCAN algorithm outperforms the standard DBSCAN algorithm in finding local anomalies.
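The difference between global and local anomalies in seasonal data can be illustrated with a small, self-contained DBSCAN on one-dimensional readings. The abstract does not describe how the modification works, so running DBSCAN separately per season is purely an assumption made for this sketch, as are the temperature values, `eps` and `min_pts`.

```python
def dbscan(values, eps, min_pts):
    """Return one label per value: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    labels = [None] * n

    def neighbours(i):
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            nearby = neighbours(j)
            if len(nearby) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nearby)
    return labels

def anomalies(values, eps=2.0, min_pts=3):
    """Values that DBSCAN labels as noise."""
    return [v for v, lab in zip(values, dbscan(values, eps, min_pts)) if lab == -1]

# Hypothetical temperatures: 19.5 is ordinary in July but odd in January.
january = [0.5, 1.0, 0.0, 1.5, 19.5]
july = [20.0, 19.0, 21.0, 20.5, 45.0]

global_anoms = anomalies(january + july)                # one run over all readings
seasonal_anoms = anomalies(january) + anomalies(july)   # one run per season (assumption)
```

In this toy data the single run over all readings flags only the extreme value, because the warm January reading sits inside the dense July range, whereas the per-season runs also flag it.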


Author(s):  
Chao Zhao ◽  
Hongling Yang ◽  
Xiaoqian Li ◽  
Rui Li ◽  
ShouCun Zheng

The intelligent scheduling algorithm for hierarchical data migration is a key issue in data management. On mass media content platforms, the discovery of content object usage patterns is the basis for scheduling data migration. We add QPop, the dimensionality reduction result of media content usage logs, as content objects for discovering usage patterns. On this basis, a clustering algorithm QPop is proposed to increase the time segmentation, thereby improving the mining performance. We adopted the standard C-means algorithm as the clustering core and used segmentation to conduct an experimental mining process to collect the ted QPop increments in practical applications. The results show that the improved algorithm has good robustness in cluster cohesion and other indicators, slightly better than the basic model.


2019 ◽  
Vol 28 (04) ◽  
pp. 1950065 ◽  
Author(s):  
Wei Zhang ◽  
Gongxuan Zhang ◽  
Xiaohui Chen ◽  
Yueqi Liu ◽  
Xiumin Zhou ◽  
...  

Hierarchical clustering is a classical method that provides a hierarchical representation for the purpose of data analysis. However, in practical applications, it struggles with massive datasets because of its high computational complexity. To overcome this challenge, this paper presents a novel distributed storage and computation hierarchical clustering algorithm, which has a lower time complexity than the standard hierarchical clustering algorithms. Our proposed approach is suitable for hierarchical clustering on massive datasets and has the following advantages. First, the algorithm is able to store massive datasets exceeding the main memory space by using distributed storage nodes. Second, the algorithm is able to process nearest neighbor searches efficiently in parallel by using distributed computation at each node. Extensive experiments are carried out to validate the effectiveness of the proposed distributed hierarchical clustering (DHC) algorithm. Experimental results demonstrate that the algorithm is 10 times faster than the standard hierarchical clustering algorithm, making it an effective and flexible distributed algorithm for hierarchical clustering of massive datasets.
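The cost that motivates a distributed design is visible in a naive single-machine version of agglomerative clustering: every merge step searches all pairs of clusters for the closest one. The sketch below is that standard baseline with single linkage on one-dimensional data, not the DHC algorithm from the abstract; the function name and data are hypothetical.

```python
def single_linkage(points, n_clusters):
    """Naive agglomerative clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        # Nearest-pair search over all cluster pairs: the step that dominates the run time.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(abs(x - y) for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # b > a, so index a is unaffected by the pop
    return clusters

groups = single_linkage([1.0, 1.2, 5.0, 5.1, 9.0], n_clusters=3)
```

Because each merge repeats the full pairwise search, the run time grows quickly with the number of points, and the whole dataset must fit in memory; these are the two limits the abstract says distribution addresses.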


2021 ◽  
Author(s):  
Lida Huang ◽  
Panpan Shi ◽  
Haichao Zhu ◽  
Tao Chen

Emergency events need early detection, quick response, and accurate recovery. In the era of big data, social media users can be seen as social sensors for monitoring emergency events in real time. This paper proposes an integrated approach for the early detection of all four kinds of emergency events: natural disasters, man-made accidents, public health events and social security events. First, the BERT-Att-BiLSTM model is used to detect emergency-related posts among massive amounts of irrelevant data. Then, the 3W attribute information (What, Where and When) of the emergency event is extracted. With the 3W attribute information, we create an unsupervised dynamic event clustering algorithm based on text similarity and combine it with a supervised logistic regression model to cluster posts into different events. The experiments on Sina Weibo data demonstrate the superiority of the proposed framework. Case studies on some real emergency events show that the proposed framework has good performance and high timeliness. Practical applications of the framework are also discussed, followed by some future directions for improvement.
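Grouping posts into events by text similarity can be sketched as a single pass in which each post joins the most similar existing event or starts a new one. This is a minimal illustration only: the Jaccard measure on word sets, the threshold of 0.3 and the sample posts are all assumptions, and the sketch leaves out the 3W attributes and the logistic regression model that the abstract describes.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_posts(posts, threshold=0.3):
    """Assign each post to the most similar existing event, or open a new event."""
    events = []  # each event: {"tokens": set of words, "posts": list of texts}
    for post in posts:
        tokens = set(post.lower().split())
        best = max(events, key=lambda e: jaccard(tokens, e["tokens"]), default=None)
        if best is not None and jaccard(tokens, best["tokens"]) >= threshold:
            best["posts"].append(post)
            best["tokens"] |= tokens  # the event vocabulary grows as posts arrive
        else:
            events.append({"tokens": tokens, "posts": [post]})
    return events

posts = [
    "earthquake hits city centre",
    "strong earthquake hits the city",
    "flood warning river rising",
    "river flood warning issued",
]
events = cluster_posts(posts)
```

Processing posts one at a time means new events can appear as the stream evolves, which is the sense in which such clustering is dynamic.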


2014 ◽  
Vol 701-702 ◽  
pp. 312-315 ◽  
Author(s):  
Qi Chen ◽  
Xing Ben Yang ◽  
Yun Hong Chen ◽  
Dan Dan Li

Image segmentation plays an important role in computer vision and image processing for interpreting and analyzing an acquired image. Separation of objects or image regions is usually required for high-level image comprehension in practical applications involving visual inspection. In this paper, a novel automatic image segmentation method is proposed. To extract the foreground of the image automatically, we combine a saliency model based on superpixels with the affinity propagation clustering algorithm in an unsupervised manner, and use the graph cut method to obtain the segmentation results.

