Tune Up Fuzzy C-Means for Big Data: Some Novel Hybrid Clustering Algorithms Based on Initial Selection and Incremental Clustering

Data clustering plays a very important role in data mining, machine learning, and image processing. Because modern databases have inherent uncertainties, many uncertainty-based data clustering algorithms have been developed. These include fuzzy c-means, rough c-means, and intuitionistic fuzzy c-means, as well as algorithms based on hybrid models, such as rough fuzzy c-means and rough intuitionistic fuzzy c-means. Many variants improve these algorithms in different directions, such as their kernelised, possibilistic, and possibilistic kernelised versions. However, none of these algorithms is effective on big data, for various reasons. Researchers have therefore been trying for the past few years to improve these algorithms so that they can be applied to cluster big data. Such algorithms are relatively few in comparison to those for datasets of reasonable size. Our aim in this chapter is to present the uncertainty-based clustering algorithms developed so far and to propose a few new algorithms which can be developed further.

Incremental kernel fuzzy c-means with optimizing cluster center initialization and delivery

Kybernetes ◽

10.1108/k-08-2015-0209 ◽

2016 ◽

Vol 45 (8) ◽

pp. 1273-1291 ◽

Cited By ~ 1

Author(s):

Runhai Jiao ◽

Shaolong Liu ◽

Wu Wen ◽

Biying Lin

Keyword(s):

Clustering Algorithm ◽

Clustering Algorithms ◽

Cluster Center ◽

Accurate Information ◽

Incremental Clustering ◽

Data Set ◽

Content Type ◽

Fuzzy C Means ◽

Initial Cluster

Purpose: The large volume of big data makes traditional clustering algorithms, which are usually designed to process an entire data set at once, impractical. This paper focuses on incremental clustering, which divides the data into a series of data chunks so that only a small amount of data needs to be clustered at a time. Little research on incremental clustering addresses the problems of optimizing cluster center initialization for each data chunk and of selecting multiple passing points for each cluster. Design/methodology/approach: Optimizing the initial cluster centers improves the quality of the clustering results for each data chunk and, in turn, the quality of the final clustering results. Selecting multiple passing points means that more accurate information is passed down, which further improves the final clustering results. A method is proposed to solve these two problems and is applied in an algorithm based on the streaming kernel fuzzy c-means (stKFCM) algorithm. Findings: Experimental results show that the proposed algorithm is more accurate and performs better than the stKFCM algorithm. Originality/value: This paper addresses the problem of improving the performance of incremental clustering by optimizing cluster center initialization and selecting multiple passing points. The paper analyzes the performance of the proposed scheme and demonstrates its effectiveness.
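The chunk-by-chunk idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it uses plain (non-kernel) fuzzy c-means, passes down only one weighted center per cluster rather than multiple passing points, and expects the caller to supply the initial centers. The function names `weighted_fcm` and `incremental_fcm` are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_fcm(points, weights, centers, m=2.0, iters=100, tol=1e-9):
    """Weighted fuzzy c-means. Returns (centers, memberships), where the
    memberships are those computed in the last iteration."""
    dim = len(points[0])
    u = []
    for _ in range(iters):
        # Membership update: u[i][k] is proportional to D_ik^(-1/(m-1)),
        # where D_ik is the squared distance from point i to center k.
        u = []
        for x in points:
            d = [max(sq_dist(x, v), 1e-12) for v in centers]
            inv = [dk ** (-1.0 / (m - 1.0)) for dk in d]
            total = sum(inv)
            u.append([val / total for val in inv])
        # Center update: mean of the points weighted by w_i * u_ik^m.
        new_centers = []
        for k in range(len(centers)):
            coef = [w * (row[k] ** m) for w, row in zip(weights, u)]
            denom = sum(coef)
            new_centers.append(tuple(
                sum(c * x[j] for c, x in zip(coef, points)) / denom
                for j in range(dim)))
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u


def incremental_fcm(chunks, init_centers, m=2.0):
    """Cluster the chunks in sequence, carrying weighted centers forward."""
    centers = list(init_centers)
    carried_pts, carried_w = [], []
    for chunk in chunks:
        # Each new point has weight 1; carried centers keep their mass.
        points = carried_pts + list(chunk)
        weights = carried_w + [1.0] * len(chunk)
        centers, u = weighted_fcm(points, weights, centers, m)
        # Assumption: pass one weighted point per cluster to the next chunk.
        carried_pts = list(centers)
        carried_w = [sum(w * row[k] for w, row in zip(weights, u))
                     for k in range(len(centers))]
    return centers
```

Each chunk is clustered together with the weighted centers carried over from earlier chunks, so no chunk needs to be revisited. The previous centers also serve as the starting point for the next chunk, which is the simplest possible form of the initialization step the abstract sets out to optimize.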

Survey on Partition based Clustering Algorithms in Big Data

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v5i12.323325 ◽

2017 ◽

Vol 5 (12) ◽

pp. 323-325

Author(s):

E. Mahima Jane ◽

E. George Dharma Prakash Raj

Keyword(s):

Big Data ◽

Clustering Algorithms

Comparing SOM neural network with Fuzzy c-means, K-means and traditional hierarchical clustering algorithms

European Journal of Operational Research ◽

10.1016/j.ejor.2005.03.039 ◽

2006 ◽

Vol 174 (3) ◽

pp. 1742-1759 ◽

Cited By ~ 150

Author(s):

Sueli A. Mingoti ◽

Joab O. Lima

Keyword(s):

Neural Network ◽

Hierarchical Clustering ◽

Clustering Algorithms ◽

Fuzzy C Means ◽

SOM Neural Network

Efficient Implementation of the Fuzzy c-Means Clustering Algorithms

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.1986.4767778 ◽

1986 ◽

Vol PAMI-8 (2) ◽

pp. 248-255 ◽

Cited By ~ 372

Author(s):

Robert L. Cannon ◽

Jitendra V. Dave ◽

James C. Bezdek

Keyword(s):

Clustering Algorithms ◽

Efficient Implementation ◽

Fuzzy C Means ◽

Fuzzy C Means Clustering

Construct Knowledge Structure of Linear Algebra

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.211-212.793 ◽

2011 ◽

Vol 211-212 ◽

pp. 793-797

Author(s):

Chin Chun Chen ◽

Yuan Horng Lin ◽

Jeng Ming Yih ◽

Sue Fen Huang

Keyword(s):

Knowledge Management ◽

Linear Algebra ◽

Fuzzy Clustering ◽

Mahalanobis Distance ◽

Clustering Algorithms ◽

Knowledge Structure ◽

Interpretive Structural Modeling ◽

Cognitive Characteristics ◽

Fuzzy C Means ◽

Fuzzy C Means Algorithm

Interpretive structural modeling is applied to construct the knowledge structure of linear algebra. A new fuzzy clustering algorithm, an improved fuzzy c-means algorithm based on Mahalanobis distance, performs better than the standard fuzzy c-means algorithm. Each cluster of data can easily describe the features of the knowledge structures individually. The results show that there are six clusters and that each cluster has its own cognitive characteristics. The methodology can make knowledge management in the classroom more feasible.

A review on density-based clustering algorithms for big data analysis

2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) ◽

10.1109/i-smac.2017.8058322 ◽

2017 ◽

Cited By ~ 4

Author(s):

K. Shyam Sunder Reddy ◽

C. Shoba Bindu

Keyword(s):

Big Data ◽

Data Analysis ◽

Clustering Algorithms ◽

Big Data Analysis ◽

Density Based Clustering

Management of Abstract Algebra Concepts Based on Knowledge Structure

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.284-287.3537 ◽

2013 ◽

Vol 284-287 ◽

pp. 3537-3542

Author(s):

Chin Chun Chen ◽

Yuan Horng Lin ◽

Jeng Ming Yih

Keyword(s):

Mahalanobis Distance ◽

Clustering Algorithm ◽

Fuzzy Theory ◽

Clustering Algorithms ◽

Knowledge Structures ◽

Interpretive Structural Modeling ◽

Fuzzy C Means ◽

Integrated Method ◽

Combined Algorithm ◽

Fuzzy C Means Algorithm

Knowledge management of mathematics concepts is essential in educational environments. The purpose of this study is to provide an integrated method, based on fuzzy theory, for individualized concept structure analysis. This method integrates the Fuzzy Logic Model of Perception (FLMP) and Interpretive Structural Modeling (ISM). The combined algorithm can analyze an individual's concept structure by comparing it with the concept structure of an expert. Conventional fuzzy clustering algorithms are based on the Euclidean distance function, which can only be used to detect spherical clusters. A fuzzy c-means algorithm based on Mahalanobis distance (FCM-M) was proposed to overcome the limitations of the Gath-Geva (GG) and Gustafson-Kessel (GK) algorithms, but it is not stable enough when some of its covariance matrices are not equal. A new, improved fuzzy c-means algorithm based on a normalized Mahalanobis distance (FCM-NM) is proposed. FCM-NM, the best-performing clustering algorithm, is used for data analysis and interpretation. Each cluster of data can easily describe features of the knowledge structures. Managing the knowledge structures of mathematics concepts makes it possible to construct the model of features for pattern recognition. This procedure will also be useful for cognition diagnosis. To sum up, this integrated algorithm could improve the assessment methodology of cognition diagnosis and make the knowledge structures of mathematics concepts easy to manage.
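The remark about Euclidean distance and spherical clusters can be made concrete with a small sketch. It compares the squared Euclidean distance with the ordinary squared Mahalanobis distance for an elongated 2-D cluster. It does not implement FCM-M or the normalized variant FCM-NM, whose exact form the abstract does not give, and the helper names and sample points are illustrative assumptions.

```python
def mean2(points):
    """Mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def cov2(points):
    """Sample covariance of 2-D points, returned as (sxx, sxy, syy)."""
    mx, my = mean2(points)
    n = len(points) - 1
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    return sxx, sxy, syy


def euclid_sq(x, center):
    """Squared Euclidean distance from x to center."""
    return (x[0] - center[0]) ** 2 + (x[1] - center[1]) ** 2


def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance, with the 2x2 inverse written out."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - center[0], x[1] - center[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# Illustrative cluster stretched along the x-axis.
cluster = [(-4.0, 0.2), (-2.0, -0.2), (0.0, 0.0), (2.0, 0.2), (4.0, -0.2)]
center = mean2(cluster)
cov = cov2(cluster)

along = (3.0, 0.0)   # lies along the cluster's long axis
across = (0.0, 3.0)  # lies across the cluster's short axis

# Both points are equally far from the center in Euclidean terms, but the
# Mahalanobis distance is much smaller for the point along the long axis.
print(euclid_sq(along, center), euclid_sq(across, center))
print(mahalanobis_sq(along, center, cov), mahalanobis_sq(across, center, cov))
```

Because the Mahalanobis distance rescales each direction by the cluster's covariance, a clustering algorithm built on it can fit elongated clusters that a Euclidean-distance algorithm would treat as if they were spherical.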

ACO-FFDP in incremental clustering for big data analysis

Proceedings of the 3rd International Conference on Smart City Applications - SCA '18 ◽

10.1145/3286606.3286782 ◽

2018 ◽

Author(s):

Fadwa Bouhafer ◽

Mohammed Heyouni ◽

Anass El Haddadi ◽

Zakaria Boulouard

Keyword(s):

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Incremental Clustering

An Efficient Clustering Approach for Automatic Detection of Calcification in Low Dose Chest CT

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195231 ◽

2019 ◽

pp. 163-168

Author(s):

P. Tamijiselvy ◽

N. Kavitha ◽

K. M. Keerthana ◽

D. Menakha

Keyword(s):

Data Mining ◽

Low Dose ◽

Early Stage ◽

Clustering Algorithms ◽

Automatic Detection ◽

Chest CT ◽

Data Mining Algorithm ◽

Fuzzy C Means ◽

Data Mining Algorithms ◽

Using Data

The degree of aortic calcification has been shown to be a risk indicator for vascular events, including cardiovascular events. The method developed here is a fully automated data mining algorithm that segments and measures calcification using low-dose chest CT in smokers aged 50 to 70. Subjects with increased cardiovascular risk can be identified using data mining algorithms. This paper presents a method for automatic detection of coronary artery calcifications in low-dose chest CT scans using effective clustering algorithms in three phases: pre-processing, segmentation, and clustering. The fuzzy c-means algorithm achieves an accuracy of 80.23%, which indicates that fuzzy c-means can detect cardiovascular disease at an early stage.
