K-means Clustering Adopting rbf-Kernel

Author(s):  
ABM Shawkat Ali

Clustering in data mining has received significant attention from the machine learning community in the last few years as one of the fundamental research areas. Among the vast range of clustering algorithms, K-means is one of the most popular. In this research we extend the K-means algorithm by adding the well-known radial basis function (rbf) kernel and find better performance than with the classical K-means algorithm. A critical issue for the rbf kernel is how to select a unique parameter value for the optimum clustering task. This chapter provides a statistics-based solution to this issue. The best parameter selection is considered on the basis of prior information about the data, using the Maximum Likelihood (ML) method and the Nelder-Mead (N-M) simplex method. A rule-based meta-learning approach is then proposed for automatic rbf kernel parameter selection. We consider 112 supervised data sets and measure their statistical characteristics using basic statistics, central tendency measures and an entropy-based approach. We split these data characteristics using the well-known decision tree approach to generate the rules. Finally, we use the generated rules to select the unique parameter value for the rbf kernel and then adopt it in the K-means algorithm. The experiments were conducted on 112 problems with 10-fold cross validation. The proposed algorithm can solve any clustering task very quickly with optimum performance.
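The idea of running K-means with an rbf kernel can be illustrated with a minimal sketch. This is a generic kernel K-means written for illustration, not the chapter's actual implementation: the function names `rbf_kernel` and `kernel_kmeans`, the parameter name `gamma`, and the round-robin initial assignment are all assumptions, and `gamma` is simply passed in by hand rather than chosen by the rule-based meta-learning step the abstract describes.

```python
import math


def rbf_kernel(x, y, gamma):
    """rbf kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def kernel_kmeans(points, k, gamma, n_iter=50):
    """Generic kernel K-means (illustrative sketch, not the chapter's code).

    Distances to cluster centres are computed in the kernel-induced feature
    space, so the centres themselves are never formed explicitly.
    """
    n = len(points)
    # Precompute the full kernel (Gram) matrix.
    K = [[rbf_kernel(p, q, gamma) for q in points] for p in points]
    # Assumption: deterministic round-robin initial assignment.
    labels = [i % k for i in range(n)]

    for _ in range(n_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster (None if empty).
        within = [
            sum(K[a][b] for a in m for b in m) / len(m) ** 2 if m else None
            for m in clusters
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], math.inf
            for c, members in enumerate(clusters):
                if not members:
                    continue  # an empty cluster cannot attract points
                cross = sum(K[i][j] for j in members) / len(members)
                # Squared feature-space distance from point i to centre c.
                dist = K[i][i] - 2.0 * cross + within[c]
                if dist < best_d:
                    best_c, best_d = c, dist
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(kernel_kmeans(data, k=2, gamma=1.0))
```

The result depends strongly on `gamma`, which is why the abstract treats the choice of this single parameter as the critical issue.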

Author(s):  
Ali Smith ◽  
Kate A. Smith

The most critical component of kernel based learning algorithms is the choice of an appropriate kernel and its optimal parameters. In this paper we propose a rule based meta-learning approach for automatic radial basis function (rbf) kernel and its parameter selection for Support Vector Machine (SVM) classification. First, the best parameter selection is considered on the basis of prior information of the data with the help of Maximum Likelihood (ML) method and Nelder-Mead (N-M) simplex method. Then the new rule based meta-learning approach is constructed and tested on different sizes of 112 datasets with binary class as well as multi class classification problems. We observe that our rule based methodology provides significant improvement of computational time as well as accuracy in some specific cases.


2008 ◽  
pp. 3308-3323
Author(s):  
Shawkat Ali ◽  
Kate A. Smith

The most critical component of kernel based learning algorithms is the choice of an appropriate kernel and its optimal parameters. In this paper we propose a rule based meta-learning approach for automatic radial basis function (rbf) kernel and its parameter selection for Support Vector Machine (SVM) classification. First, the best parameter selection is considered on the basis of prior information of the data with the help of Maximum Likelihood (ML) method and Nelder-Mead (N-M) simplex method. Then the new rule based meta-learning approach is constructed and tested on different sizes of 112 datasets with binary class as well as multi class classification problems. We observe that our rule based methodology provides significant improvement of computational time as well as accuracy in some specific cases.


2021 ◽  
Author(s):  
Lisa Ferguson ◽  
Victor C. Rentes ◽  
Lauren McCarthy ◽  
Alexandra H. Vinson

2021 ◽  
Vol 12 (4) ◽  
pp. 169-185
Author(s):  
Saida Ishak Boushaki ◽  
Omar Bendjeghaba ◽  
Nadjet Kamel

Clustering is an important unsupervised analysis technique for big data mining. It finds application in several domains, including biomedical documents of the MEDLINE database. Document clustering algorithms based on metaheuristics are an active research area. However, these algorithms tend to get trapped in local optima and need many parameters to be adjusted, and the documents must be indexed by a high-dimensional matrix using the traditional vector space model. To overcome these limitations, this paper proposes a new document clustering algorithm (ASOS-LSI) with no parameters. It is based on the recent symbiotic organisms search (SOS) metaheuristic and enhanced by an acceleration technique. Furthermore, the documents are represented by semantic indexing based on the well-known latent semantic indexing (LSI). Experiments conducted on well-known biomedical document datasets show the significant superiority of ASOS-LSI over five well-known algorithms in terms of compactness, f-measure, purity, misclassified documents, entropy, and runtime.
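Of the evaluation measures listed above, purity is the simplest to state: each cluster is credited with its most frequent true class, and the credited counts are divided by the total number of documents. The sketch below uses the standard textbook definition; the function name `purity` and the toy labels are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter


def purity(true_labels, cluster_labels):
    """Fraction of items that belong to the majority class of their cluster."""
    per_cluster = {}
    for truth, cluster in zip(true_labels, cluster_labels):
        per_cluster.setdefault(cluster, Counter())[truth] += 1
    # Credit each cluster with the size of its most common true class.
    majority_total = sum(c.most_common(1)[0][1] for c in per_cluster.values())
    return majority_total / len(true_labels)


if __name__ == "__main__":
    truth = ["a", "a", "a", "b", "b", "b"]   # hypothetical true classes
    found = [0, 0, 1, 1, 1, 1]               # hypothetical cluster ids
    print(purity(truth, found))
```

In this toy case one "a" document lands in the cluster dominated by "b", so five of the six documents are credited.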


Author(s):  
Yingying Shang

Using server log data to predict the URLs that a user is likely to visit is an important research area in user behavior prediction. In this paper, a predictive model (called LAR) for predicting the URL is proposed, based on the long short-term memory (LSTM) attention network and the reciprocal-nearest-neighbors supported clustering algorithm (RSC). First, the LSTM-attention network is used to predict the URL categories a user might visit, and the RSC algorithm is then used to cluster users. Subsequently, the URLs belonging to the same category are determined from the user clusters to predict the URLs that the user might visit. The proposed LAR model considers the time sequence of the URLs a user accesses and the relationship between a single user and groups of users, which effectively improves the prediction accuracy. The experimental results demonstrate that the LAR model is feasible and effective for user behavior prediction. The mean absolute error and root mean square error of the LAR model are better than those of the other models compared in this study.


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2949
Author(s):  
Changpeng Li ◽  
Tianhao Peng ◽  
Yanmin Zhu

During operation, the acoustic signal of the drum shearer contains a wealth of information. The monitoring or diagnosis system based on acoustic signal has obvious advantages. However, the signal is challenging to extract and recognize. Therefore, this paper proposes an approach for acoustic signal processing of a shearer based on the parameter optimized variational mode decomposition (VMD) method and a clustering algorithm. First, the particle swarm optimization (PSO) algorithm searched for the best parameter combination of the VMD. According to the results, the approach determined the number of modes and penalty parameters for VMD. Then the improved VMD algorithm decomposed the acoustic signal. It selected the ideal component through the minimum envelope entropy. The PSO was designed to optimize the clustering analysis, and the minimum envelope entropy of the acoustic signal was regarded as the feature for classification. We then use a shearer simulation platform to collect the acoustic signal and use the approach proposed in this paper to process and classify the signal. The experimental results show that the approach proposed can effectively extract the features of the acoustic signal of the shearer. The recognition accuracy of the acoustic signal was high, which has practical application value.


2002 ◽  
pp. 145-152 ◽  
Author(s):  
Fiona Fui-Hoon Nah

The explosive expansion of the World Wide Web (WWW) is the biggest event on the Internet. Since its public introduction in 1991, the WWW has become an important channel for electronic commerce, information access, and publication. However, the long waiting time for accessing web pages has become a critical issue, especially with the popularity of multimedia technology and the exponential increase in the number of Web users. Although various technologies and techniques have been implemented to alleviate the situation and to comfort impatient users, there is still a need to carry out fundamental research to investigate what constitutes an acceptable waiting time for a typical WWW user. This research not only evaluates Nielsen’s hypothesis of 15 seconds as the maximum waiting time of WWW users, but also provides approximate distributions of waiting time of WWW users.


Author(s):  
Ying Wang ◽  
Weifeng Jiang

To improve the learning effect of online learning, an online learning target automatic classification and clustering analysis algorithm based on cognitive thinking was proposed. It was applied to a multi-dimensional learning community. A new form of virtual learning community concept was proposed. The design ideas of its multi-dimensional learning environment were elaborated. Ontology technology was used to collect student learning process data. A cognitive diagnostic model for assessing student learning status was generated. Finally, through the cluster analysis technology, the registered students in the curriculum center were automatically divided into different levels of community groups. The results showed that the proposed algorithm for automatic classification and clustering of online learning targets had a good application effect in the learning community. Therefore, this method has practical application value.


Sensors ◽  
2019 ◽  
Vol 19 (5) ◽  
pp. 1033 ◽  
Author(s):  
Peng Zhou ◽  
Decheng Zuo ◽  
Kun Hou ◽  
Zhan Zhang ◽  
Jian Dong ◽  
...  

Cyber Physical Systems (CPS) have been a popular research area in the last decade. The dependability of CPS is still a critical issue, and few surveys have been published in this domain. A CPS is a dynamic complex system that involves various multidisciplinary technologies. To avoid human errors and to simplify management, self-management CPS (SCPS) is a wise choice. To achieve dependable self-management, systematic solutions are necessary to verify the design and to guarantee the safety of self-adaptation decisions, as well as to maintain the health of SCPS. This survey first recalls the concepts of dependability, proposes a generic environment-in-loop processing flow of self-management CPS, and then analyzes the error sources and challenges of self-management through the formal feedback flow. Focusing on reducing the complexity, we first survey the self-adaptive architecture approaches and applied dependability means, then we introduce a hybrid multi-role self-adaptive architecture and discuss the supporting technologies for dependable self-management at the architecture level. Focusing on dependable environment-centered adaptation, we investigate the verification and validation (V&V) methods for making safe self-adaptation decisions and the solutions for processing decisions dependably. For system-centered adaptation, the comprehensive self-healing methods are summarized. Finally, we analyze the missing pieces of the technology puzzle and the future directions. In this survey, the technical trends for dependable CPS design and maintenance are discussed, and an all-in-one solution is proposed to integrate these technologies and build a dependable organic SCPS. To the best of our knowledge, this is the first comprehensive survey on dependable SCPS building and evaluation.


2001 ◽  
Vol 12 (03) ◽  
pp. 307-324
Author(s):  
WEISONG SHI ◽  
ZHIMIN TANG

Load balancing is a critical issue for achieving good performance in parallel and distributed systems. However, this issue has been neglected in the research area of software DSMs in the past decade. Based on the observation that scientific applications can be classified into two categories, iterative and non-iterative, we propose in this paper a dynamic scheduling scheme for each of the two cases. For iterative scientific applications, a dynamic task migration technique is proposed that is characterized by integrating computation migration and data migration. For non-iterative scientific applications, an affinity-based self scheduling (ABS) scheme is proposed, which takes both the static and dynamic processor affinity into consideration when scheduling. The target experiment platform is a state-of-the-art home-based DSM system named JIAJIA. Performance evaluation results show that the novel task migration scheme improves performance by 36% to 50% compared with a static task allocation scheme in a metacomputing environment, and performs better than the traditional task (computation-only) migration approach by about 12.5% for MAT and 37.5% for SOR and EM3D. Higher resource utilization is also achieved via the new task migration scheme. In comparison with other loop scheduling schemes, the ABS achieves the best performance in a metacomputing environment because of the reduction of synchronization overhead and the great improvement in waiting time resulting from load imbalance.

