A data clustering algorithm for stratified data partitioning in artificial neural network

2012 ◽  
Vol 39 (8) ◽  
pp. 7004-7014 ◽  
Author(s):  
Ajit K. Sahoo ◽  
Ming J. Zuo ◽  
M.K. Tiwari
2019 ◽  
Vol 23 (1) ◽  
pp. 67-77 ◽  
Author(s):  
Yao Yevenyo Ziggah ◽  
Hu Youjian ◽  
Alfonso Rodrigo Tierra ◽  
Prosper Basommi Laari

The Artificial Neural Network (ANN) methodology has grown in popularity across a wide variety of areas in geodesy and the geospatial sciences. Its ability to perform coordinate transformation between different datums is well documented in the literature. In applying ANN methods to coordinate transformation, usually only the train-test (hold-out cross-validation) approach has been used to evaluate performance. Here, the data set is divided into two disjoint subsets: training (model building) and testing (model validation). However, one major drawback of the hold-out cross-validation procedure is inappropriate data partitioning. An improper split of the data could lead to high variance and bias in the results. Moreover, hold-out cross-validation is not suitable for sparse datasets. For these reasons, the K-fold cross-validation approach has been recommended. Consequently, this study explored, for the first time, the potential of the K-fold cross-validation method for assessing the performance of the radial basis function neural network (RBFNN) and the Bursa-Wolf model under data-insufficient conditions in the Ghana geodetic reference network. Statistical analysis of the results revealed that incorrect data partitioning could lead to false reporting of the predictive performance of the transformation model. The RBFNN and the Bursa-Wolf model produced transformation accuracies of 0.229 m and 0.469 m, respectively, and maximum horizontal errors of 0.881 m and 2.131 m, respectively. The results obtained are applicable under the cadastral surveying and plan production requirement set by the Ghana Survey and Mapping Division.
This study will contribute to the use of the K-fold cross-validation approach in developing countries that face the same sparse-dataset situation as Ghana, as well as in the geodetic sciences, where ANN users seldom apply this statistical resampling technique.
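
The K-fold partitioning described in this abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the procedure used in the study: the function name `k_fold_indices`, the sample count, the value of `k`, and the seed are all assumptions chosen for the example. Each sample lands in the test set exactly once, which is what makes the approach attractive for sparse datasets.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) index lists for K-fold cross-validation.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the partition is reproducible.
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first (n_samples % k)
    # folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Assumed toy setting: 10 samples split into 5 folds.
folds = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in folds:
    print(sorted(test_idx), "held out;", len(train_idx), "used for training")
```

In practice a model would be fitted on each `train_idx` and scored on the matching `test_idx`, and the K scores averaged to give the performance estimate.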


Author(s):  
Se-Hoon Jung ◽  
Jong-Chan Kim ◽  
Chun-Bo Sim

The volume of derivative information generated through mobile devices and social networking sites (SNSs) has been increasing exponentially, and the information technologies that use it have also been developing rapidly. Technologies to classify and analyze such information are as important as data generation. This study concentrates on data clustering through principal component analysis and K-means algorithms to analyze and classify user data efficiently. We propose a technique that replaces the cluster choice made before cluster processing in existing K-means practice with a variable cluster choice obtained through principal component analysis, thereby expanding the scope of data clustering. The technique also applies an artificial neural network learning model for user recommendation and prediction from the clustered data. The proposed processing model for predicted data produced results that improved on the existing artificial neural network–based data clustering and learning model by approximately 9.25%.


2018 ◽  
Vol 27 (2) ◽  
pp. 135-147 ◽  
Author(s):  
Rafath Samrin ◽  
Devara Vasumathi

Despite rapid developments in data technology, intruders remain among the most prominent threats to security. Network intrusion detection systems are now a typical constituent of network security structures. In this paper, we present an intrusion identification scheme that combines a weighted K-means clustering algorithm with an artificial neural network (WKMC+ANN). The scheme comprises two modules: clustering and intrusion detection. In the clustering module, the input dataset is grouped into clusters using WKMC. In the intrusion detection module, an ANN is trained on the clustered data and its structure is stored. In the testing process, each test sample is classified by the most suitable ANN classifier, namely the one corresponding to the cluster closest to the test data according to distance or similarity measures. For experimental evaluation, we used the benchmark database, and the results clearly demonstrated that the proposed technique outperformed the existing technique by achieving better accuracy.
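
The test-time routing step in this abstract (pick the classifier tied to the closest cluster) can be sketched as below. This is only an illustration under stated assumptions: the centroids are made-up values, Euclidean distance stands in for the unspecified "distance or similarity measures", and the two lambda functions are hypothetical placeholders for trained per-cluster ANNs, which the abstract does not describe in detail.

```python
import math


def nearest_cluster(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


def route_and_predict(point, centroids, classifiers):
    """Select the classifier associated with the nearest cluster and apply it."""
    idx = nearest_cluster(point, centroids)
    return idx, classifiers[idx](point)


# Assumed toy centroids; in the described scheme they would come from WKMC.
centroids = [(0.0, 0.0), (10.0, 10.0)]

# Hypothetical stand-ins for the per-cluster ANN classifiers.
classifiers = [
    lambda p: "normal" if sum(p) < 3.0 else "intrusion",
    lambda p: "intrusion" if sum(p) > 25.0 else "normal",
]

cluster_id, label = route_and_predict((9.0, 11.0), centroids, classifiers)
print(cluster_id, label)
```

Keeping the routing separate from the classifiers means any trained model with a callable interface could be dropped in per cluster without changing the selection logic.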


2004 ◽  
Vol 21 (12) ◽  
pp. 2360-2368 ◽  
Author(s):  
Josephine L. P. Soh ◽  
Fei Chen ◽  
Celine V. Liew ◽  
Daming Shi ◽  
Paul W. S. Heng

Author(s):  
Suneetha Chittinen ◽  
Dr. Raveendra Babu Bhogapathi

In this paper, a fuzzy c-means algorithm that uses a neural network algorithm is presented. In pattern recognition, fuzzy clustering algorithms have demonstrated an advantage over crisp clustering algorithms in grouping high-dimensional data into clusters. The proposed work involves two steps. First, a recently developed Enhanced K-means Fast Learning Artificial Neural Network (KFLANN) framework is used to determine cluster centers. Second, fuzzy c-means uses these cluster centers to generate fuzzy membership functions. Enhanced KFLANN is an algorithm that produces consistent classification of vectors into the same clusters regardless of the data presentation sequence. Experiments are conducted on two artificial data sets, Iris and New Thyroid. The results show that Enhanced KFLANN generates consistent cluster centers faster and uses them to elicit efficient fuzzy memberships.
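
The second step, generating fuzzy memberships from already-determined cluster centers, can be sketched with the standard fuzzy c-means membership formula, u_i = 1 / sum_k (d_i / d_k)^(2/(m-1)). This is a minimal illustration, not the paper's implementation: the centers below are made-up values rather than KFLANN output, and the fuzzifier `m = 2.0` and the function name `fuzzy_memberships` are assumptions.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Return the fuzzy c-means membership of `point` in each cluster.

    Uses the standard update u_i = 1 / sum_k (d_i / d_k) ** (2 / (m - 1)),
    where d_i is the Euclidean distance from `point` to center i.
    """
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")
    distances = [math.dist(point, c) for c in centers]
    # If the point coincides with one or more centers, share full
    # membership among them to avoid dividing by zero.
    zero = [i for i, d in enumerate(distances) if d == 0.0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0
                for i in range(len(centers))]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


# Assumed toy centers; in the described method they would come from KFLANN.
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fuzzy_memberships((1.0, 0.0), centers)
print(u)  # memberships sum to 1; the nearer center gets the larger share
```

With distances 1 and 3 and `m = 2.0`, the memberships work out to 0.9 and 0.1, so a point contributes to both clusters in proportion to its closeness rather than being assigned to only one.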

