Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering

2016 ◽  
Vol 19 (5) ◽  
pp. 1585-1602 ◽  
Author(s):  
Le Hoang Son ◽  
Nguyen Dang Tien
Author(s):  
B. K. Tripathy ◽  
Hari Seetha ◽  
M. N. Murty

Data clustering plays a very important role in data mining, machine learning, and image processing. Because modern databases carry inherent uncertainties, many uncertainty-based data clustering algorithms have been developed. These include fuzzy c-means, rough c-means, and intuitionistic fuzzy c-means, as well as algorithms based on hybrid models, such as rough fuzzy c-means and rough intuitionistic fuzzy c-means. Many variants improve these algorithms in different directions, for example kernelised, possibilistic, and possibilistic kernelised versions. However, none of these algorithms is effective on big data, for various reasons. Researchers have therefore been trying for the past few years to improve them so that they can be applied to cluster big data. Such algorithms remain few in comparison with those for datasets of reasonable size. Our aim in this chapter is to present the uncertainty-based clustering algorithms developed so far and to propose a few new algorithms that can be developed further.
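For orientation, the baseline fuzzy c-means algorithm that the abstract above builds on can be sketched in a few lines. This is a minimal illustration of the standard alternating update (memberships, then centers), not any of the chapter's proposed algorithms. The function name `fuzzy_c_means`, its parameters, and the random choice of initial centers are assumptions made for the example.

```python
import math
import random

def fuzzy_c_means(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration: returns (centers, memberships),
    where memberships[i][k] is the degree to which point i belongs to cluster k.
    """
    rng = random.Random(seed)
    # Assumption: start from c distinct data points chosen at random.
    centers = [list(p) for p in rng.sample(points, c)]
    n, dim = len(points), len(points[0])
    exponent = 2.0 / (m - 1.0)
    u = [[0.0] * c for _ in range(n)]
    for _ in range(max_iter):
        # Membership update: closer centers receive larger membership.
        for i, x in enumerate(points):
            d = [math.dist(x, v) for v in centers]
            if min(d) == 0.0:
                # The point sits exactly on a center: give it crisp membership.
                hits = [k for k in range(c) if d[k] == 0.0]
                u[i] = [1.0 / len(hits) if k in hits else 0.0 for k in range(c)]
            else:
                u[i] = [1.0 / sum((d[k] / d[j]) ** exponent for j in range(c))
                        for k in range(c)]
        # Center update: mean of the points weighted by membership ** m.
        new_centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            new_centers.append(
                [sum(w[i] * points[i][a] for i in range(n)) / total
                 for a in range(dim)])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u

# Usage: two well-separated groups of four points each.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
centers, memberships = fuzzy_c_means(data, c=2)
```

Every iteration touches every point and every center, which is one reason the plain algorithm scales poorly to very large data sets.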


Kybernetes ◽  
2016 ◽  
Vol 45 (8) ◽  
pp. 1273-1291 ◽  
Author(s):  
Runhai Jiao ◽  
Shaolong Liu ◽  
Wu Wen ◽  
Biying Lin

Purpose: The large volume of big data makes traditional clustering algorithms, which are usually designed to process an entire data set, impractical. This paper focuses on incremental clustering, which divides the data into a series of chunks so that only a small amount of data needs to be clustered at a time. Few studies of incremental clustering address the problems of optimizing cluster center initialization for each data chunk and selecting multiple passing points for each cluster. Design/methodology/approach: Optimizing the initial cluster centers improves the quality of the clustering results for each data chunk, which in turn enhances the quality of the final clustering results. Moreover, selecting multiple passing points passes more accurate information down to later chunks, further improving the final results. A method is proposed to solve these two problems and is applied in the proposed algorithm, which is based on the streaming kernel fuzzy c-means (stKFCM) algorithm. Findings: Experimental results show that the proposed algorithm achieves higher accuracy and better performance than the stKFCM algorithm. Originality/value: This paper addresses the problem of improving the performance of incremental clustering through optimizing cluster center initialization and selecting multiple passing points. The paper analyzes the performance of the proposed scheme and demonstrates its effectiveness.
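The chunk-by-chunk idea described in this abstract can be illustrated with a much simpler scheme than the paper's. The sketch below is not stKFCM and not the proposed algorithm: it uses no kernel, passes only one point per cluster (the center, weighted by its accumulated membership mass), and reuses the previous chunk's centers as the initialization for the next chunk. The names `weighted_fcm` and `incremental_fcm` are hypothetical.

```python
import math

def weighted_fcm(points, weights, centers, m=2.0, max_iter=100, tol=1e-6):
    """Fuzzy c-means in which each point carries a weight.

    `centers` is the starting guess. Returns (centers, mass), where mass[k]
    is the weighted membership accumulated by cluster k.
    """
    c, dim = len(centers), len(points[0])
    exponent = 2.0 / (m - 1.0)
    for _ in range(max_iter):
        memberships = []
        for x in points:
            # Clamp distances so a point lying on a center does not divide by zero.
            d = [max(math.dist(x, v), 1e-12) for v in centers]
            memberships.append(
                [1.0 / sum((d[k] / d[j]) ** exponent for j in range(c))
                 for k in range(c)])
        new_centers = []
        for k in range(c):
            w = [wt * u[k] ** m for wt, u in zip(weights, memberships)]
            total = sum(w)
            new_centers.append(
                [sum(wi * x[a] for wi, x in zip(w, points)) / total
                 for a in range(dim)])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    mass = [sum(wt * u[k] for wt, u in zip(weights, memberships))
            for k in range(c)]
    return centers, mass

def incremental_fcm(chunks, initial_centers, m=2.0):
    """Cluster the data one chunk at a time, carrying a summary forward."""
    centers = [list(v) for v in initial_centers]
    carried_points, carried_weights = [], []
    for chunk in chunks:
        # Each chunk is clustered together with the weighted summary points
        # passed down from earlier chunks; new points have weight 1.
        points = carried_points + [list(p) for p in chunk]
        weights = carried_weights + [1.0] * len(chunk)
        centers, mass = weighted_fcm(points, weights, centers, m)
        # Assumption: one passing point per cluster, namely its center.
        carried_points, carried_weights = centers, mass
    return centers

# Usage: the same eight points as two chunks of four.
chunks = [[(0, 0), (0, 1), (10, 10), (10, 11)],
          [(1, 0), (1, 1), (11, 10), (11, 11)]]
final_centers = incremental_fcm(chunks, initial_centers=[(0, 0), (10, 10)])
```

Only one chunk plus a handful of summary points is held in memory at a time. A single weighted center per cluster is a coarse summary, which is the gap that passing several points per cluster, as the paper proposes, aims to close.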


2017 ◽  
Vol 5 (12) ◽  
pp. 323-325
Author(s):  
E. Mahima Jane ◽  
E. George Dharma Prakash Raj

2011 ◽  
Vol 211-212 ◽  
pp. 793-797
Author(s):  
Chin Chun Chen ◽  
Yuan Horng Lin ◽  
Jeng Ming Yih ◽  
Sue Fen Huang

Interpretive structural modeling is applied to construct the knowledge structure of linear algebra. A new fuzzy clustering algorithm, an improved fuzzy c-means algorithm based on the Mahalanobis distance, performs better than the standard fuzzy c-means algorithm. Each cluster of data readily describes the features of an individual knowledge structure. The results show that there are six clusters, each with its own cognitive characteristics. The methodology can make knowledge management in the classroom more feasible.


2013 ◽  
Vol 284-287 ◽  
pp. 3537-3542
Author(s):  
Chin Chun Chen ◽  
Yuan Horng Lin ◽  
Jeng Ming Yih

Knowledge management of mathematics concepts is essential in educational environments. The purpose of this study is to provide an integrated, fuzzy-theory-based method for individualized concept structure analysis. The method integrates the Fuzzy Logic Model of Perception (FLMP) and Interpretive Structural Modeling (ISM). The combined algorithm can analyze an individual's concept structure by comparing it with the concept structure of an expert. Conventional fuzzy clustering algorithms are based on the Euclidean distance function and can therefore only detect spherical clusters. A Fuzzy C-Means algorithm based on the Mahalanobis distance (FCM-M) was proposed to address the limitations of the Gath-Geva (GG) and Gustafson-Kessel (GK) algorithms, but it is not stable enough when some of its covariance matrices are not equal. A new, improved Fuzzy C-Means algorithm based on a normalized Mahalanobis distance (FCM-NM) is proposed. FCM-NM, the best-performing clustering algorithm, is used for data analysis and interpretation. Each cluster of data readily describes features of the knowledge structures. The knowledge structures of mathematics concepts are managed to construct a model of features for pattern recognition. This procedure will also be useful for cognitive diagnosis. In sum, the integrated algorithm could improve the assessment methodology of cognitive diagnosis and make the knowledge structures of mathematics concepts easier to manage.
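The contrast between Euclidean and Mahalanobis distance drawn in this abstract can be seen on a small elongated cluster. The sketch below only compares the two distance functions in two dimensions; it does not implement FCM-M or FCM-NM, and the normalization used by FCM-NM is not reproduced here. The helper names `covariance_2d` and `mahalanobis_2d` and the sample points are assumptions made for illustration.

```python
import math

def covariance_2d(points):
    """Return the mean and the 2x2 sample covariance matrix of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))

def mahalanobis_2d(x, mean, cov):
    """Mahalanobis distance of x from mean under a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Quadratic form with the closed-form inverse (1/det) * [[d, -b], [-c, a]].
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

# An elongated cluster: wide along x, narrow along y (made-up sample).
cluster = [(-4, 0), (-2, 0.5), (0, 0), (2, -0.5), (4, 0), (0, 1), (0, -1)]
mean, cov = covariance_2d(cluster)

along = (3, 0)    # lies along the cluster's long axis
across = (0, 3)   # lies across the cluster's short axis

# Both points are equally far from the mean in the Euclidean sense...
euclid_along = math.dist(along, mean)
euclid_across = math.dist(across, mean)
# ...but the Mahalanobis distance rates the point along the long axis as nearer.
maha_along = mahalanobis_2d(along, mean, cov)
maha_across = mahalanobis_2d(across, mean, cov)
```

Because the Mahalanobis distance rescales each direction by the cluster's own spread, a clustering algorithm built on it can follow ellipsoidal clusters, whereas a Euclidean-distance algorithm treats every direction alike.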


Author(s):  
P. Tamijiselvy ◽  
N. Kavitha ◽  
K. M. Keerthana ◽  
D. Menakha

The degree of aortic calcification has been shown to be a risk indicator for vascular events, including cardiovascular events. The method developed here is a fully automated data mining algorithm that segments and measures calcification in low-dose chest CT scans of smokers aged 50 to 70. Subjects with increased cardiovascular risk can be identified using data mining algorithms. This paper presents a method for automatic detection of coronary artery calcifications in low-dose chest CT scans using clustering algorithms in three phases: pre-processing, segmentation, and clustering. The Fuzzy C-Means algorithm achieves an accuracy of 80.23%, indicating that it can detect cardiovascular disease at an early stage.

