Analysis of Fuzzy Clustering for the Adoption in Data Mining

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.2047 ◽

2014 ◽

Vol 989-994 ◽

pp. 2047-2050

Author(s):

Ying Jie Wang

Keyword(s):

Machine Learning ◽

Data Mining ◽

Mathematical Method ◽

Big Data ◽

Fuzzy Clustering ◽

Clustering Analysis ◽

Data Clustering ◽

Clustering Algorithm ◽

Clustering Methods ◽

Fuzzy Clustering Methods

Data mining is the general methodology for retrieving useful information from big data. Clustering analysis is a mathematical method of classification for unsupervised machine learning, and it can be adopted for data classification in data mining. This paper combines the clustering process with a fuzzy approach and then derives a special clustering algorithm based on the fast fuzzy c-means (FFCM) method. In summary, the paper illustrates the adoption of a series of fuzzy clustering methods in data mining. These methods improve computational efficiency because their convergence is fast. The methodology presented in this paper is meaningful for information retrieval from big data.
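
The abstract does not spell out the FFCM update rules, so the following is only a minimal sketch of the standard fuzzy c-means iteration that such methods build on, not the paper's algorithm. The function name `fuzzy_c_means`, the caller-supplied initial centers, and the default fuzzifier `m=2.0` are assumptions made for illustration.

```python
def fuzzy_c_means(points, init_centers, m=2.0, iters=100, tol=1e-9):
    """Standard fuzzy c-means (illustrative sketch, not the paper's FFCM).

    points       -- list of equal-length numeric tuples
    init_centers -- list of initial cluster centers (hypothetical input)
    m            -- fuzzifier, must be > 1
    Returns (memberships, centers); memberships[i][j] is the degree to
    which point i belongs to cluster j, and each row sums to 1.
    """
    centers = [tuple(c) for c in init_centers]
    c = len(centers)
    dim = len(points[0])
    memberships = []
    for _ in range(iters):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        memberships = []
        for p in points:
            d2 = [sum((p[d] - ctr[d]) ** 2 for d in range(dim)) for ctr in centers]
            zeros = [j for j, v in enumerate(d2) if v == 0.0]
            if zeros:
                # Point coincides with one or more centers: share full membership.
                row = [1.0 / len(zeros) if j in zeros else 0.0 for j in range(c)]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                row = [v / total for v in inv]
            memberships.append(row)
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in memberships]
            total = sum(w)
            if total == 0.0:
                new_centers.append(centers[j])  # keep an empty cluster's center
                continue
            new_centers.append(tuple(
                sum(w[i] * points[i][d] for i in range(len(points))) / total
                for d in range(dim)))
        # Stop once no center coordinate moves by more than tol.
        shift = max(abs(a - b) for old, new in zip(centers, new_centers)
                    for a, b in zip(old, new))
        centers = new_centers
        if shift < tol:
            break
    return memberships, centers
```

Calling `fuzzy_c_means(points, [(0, 0), (10, 10)])` on two well-separated groups of 2-D points returns one membership row per point; unlike hard clustering, every point keeps a (possibly tiny) degree of membership in each cluster.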

ONLINE PROBABILISTIC FUZZY CLUSTERING METHOD BASED ON EVOLUTIONARY OPTIMIZATION OF CAT SWARM

Radio Electronics Computer Science Control ◽

10.15588/1607-3274-2021-2-7 ◽

2021 ◽

pp. 65-70

Author(s):

Ye. V. Bodyanskiy ◽

A. Yu. Shafronenko ◽

I. N. Klymova

Keyword(s):

Big Data ◽

Fuzzy Clustering ◽

Data Clustering ◽

Clustering Algorithm ◽

Evolutionary Optimization ◽

Clustering Methods ◽

Classification Problems ◽

Probabilistic Data ◽

Fuzzy Clustering Method ◽

Clustering And Classification

Context. The problem of big data clustering is today a very relevant area of artificial intelligence. This task arises in many applications related to data mining, deep learning, etc. To solve these problems, traditional approaches and methods require that the entire data sample be submitted in batch form. Objective. The aim of the work is to propose a method of fuzzy probabilistic data clustering using evolutionary optimization of cat swarm that is free of the drawbacks of traditional data clustering approaches. Method. The procedure performs fuzzy probabilistic data clustering using evolutionary algorithms for faster determination of sample extrema, cluster centroids and adaptive functions. It does not spend machine resources on storing intermediate calculations and does not require additional time to solve the data clustering problem, regardless of the dimension of the data and the way they are presented for processing. Results. The proposed data clustering algorithm based on evolutionary optimization is simple to implement numerically, is free of the drawbacks inherent in traditional fuzzy clustering methods, and can work with large volumes of input information processed online in real time. Conclusions. The experimental results allow the developed method to be recommended for automatic clustering and classification of big data, finding the extrema of the sample as quickly as possible regardless of how the data are submitted for processing. The proposed method of online probabilistic fuzzy data clustering based on evolutionary optimization of cat swarm is intended for use in hybrid computational intelligence systems, in neuro-fuzzy systems, in training artificial neural networks, and in clustering and classification problems.

Big Data Clustering Analysis Algorithm for Internet of Things Based on K-Means

International Journal of Distributed Systems and Technologies ◽

10.4018/ijdst.2019010101 ◽

2019 ◽

Vol 10 (1) ◽

pp. 1-12 ◽

Cited By ~ 2

Author(s):

Zhanqiu Yu

Keyword(s):

Big Data ◽

Internet Of Things ◽

Clustering Analysis ◽

Data Clustering ◽

Clustering Algorithm ◽

Prototype System ◽

Point Selection ◽

Logistics System ◽

Relational Schema ◽

Analysis Algorithm

To explore the application of Internet of Things logistics systems, a big data clustering analysis algorithm for the Internet of Things based on K-means was discussed. First, using complex event relations and processing technology, big data processing for the Internet of Things was transformed into the extraction and analysis of complex relational schemas, so as to reduce the processing complexity of big data in the Internet of Things (IoT). The traditional K-means algorithm was optimized and improved to fit the demands of big data from RFID data networks. A K-means cluster analysis was implemented on a Hadoop cloud cluster platform. In addition, based on the traditional clustering algorithm, a center point selection technique suitable for RFID IoT data clustering was selected. The results showed that the clustering efficiency was improved to some extent. As a result, an RFID Internet of Things clustering analysis prototype system was designed and implemented, which further tested the feasibility of the approach.
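
The abstract does not say which center point selection technique was used, nor how the algorithm was adapted to Hadoop. As a hedged illustration of why initial center selection matters for K-means, the sketch below pairs plain single-machine Lloyd iterations with a farthest-point heuristic. The heuristic and the names `farthest_point_centers` and `k_means` are assumptions for illustration, not the paper's method.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_point_centers(points, k):
    """Pick k initial centers: start from the first point, then repeatedly
    add the point farthest from all centers chosen so far (assumed heuristic)."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(tuple(nxt))
    return centers


def k_means(points, k, iters=100):
    """Plain Lloyd iterations; returns (labels, centers)."""
    centers = farthest_point_centers(points, k)
    labels = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stable, so the centers will not change either
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centers
```

Spreading the initial centers apart tends to reduce the number of iterations compared with an unlucky random start, which is one plausible route to the efficiency gain the abstract reports.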

An Ordinal Data Clustering Algorithm with Automated Distance Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6168 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6869-6876

Author(s):

Yiqun Zhang ◽

Yiu-ming Cheung

Keyword(s):

Machine Learning ◽

Data Mining ◽

Categorical Data ◽

Data Clustering ◽

Time Complexity ◽

Ordinal Data ◽

Clustering Algorithm ◽

Major Type ◽

Common Task ◽

Consecutive Integers

Clustering ordinal data is a common task in the data mining and machine learning fields. As a major type of categorical data, ordinal data is composed of attributes with naturally ordered possible values (also called categories interchangeably in this paper). However, due to the lack of a dedicated distance metric, ordinal categories are usually treated as nominal ones, or coded as consecutive integers and treated as numerical ones. Both of these common approaches define the distances between ordinal categories only roughly, because the former ignores the order relationship and the latter simply assigns identical distances to different pairs of adjacent categories that may have intrinsically unequal distances. As a result, they may produce unsatisfactory ordinal data clustering results. This paper, therefore, proposes a novel ordinal data clustering algorithm, which iteratively learns: 1) the partition of the ordinal dataset, and 2) the inter-category distances. To the best of our knowledge, this is the first attempt to dynamically adjust inter-category distances during the clustering process to search for a better partition of ordinal data. The proposed algorithm features superior clustering accuracy, low time complexity, and fast convergence, and is parameter-free. Extensive experiments show its efficacy.
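
The two conventional treatments that the abstract criticizes can be made concrete with a small sketch. The category list `LEVELS` and both function names are hypothetical, and the code shows only the baselines, not the learned inter-category distances the paper proposes.

```python
# Hypothetical ordered categories of one ordinal attribute.
LEVELS = ["low", "medium", "high", "very high"]


def nominal_distance(a, b):
    """Treat categories as nominal: distance is 0 if equal, else 1.
    The order of the categories is ignored entirely."""
    return 0 if a == b else 1


def integer_distance(a, b, levels=LEVELS):
    """Code categories as consecutive integers and take the absolute
    difference. Every pair of adjacent categories gets distance 1."""
    return abs(levels.index(a) - levels.index(b))
```

Under `nominal_distance`, "low" is as far from "medium" as it is from "very high"; under `integer_distance`, the step from "high" to "very high" is forced to equal the step from "low" to "medium", even if the underlying gaps differ.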

Fuzzy Clustering Methods in Data Mining: A Comparative Case Analysis

2008 International Conference on Advanced Computer Theory and Engineering ◽

10.1109/icacte.2008.199 ◽

2008 ◽

Cited By ~ 5

Author(s):

G. Raju ◽

Binu Thomas ◽

Sonam Tobgay ◽

Shanta Kumar

Keyword(s):

Data Mining ◽

Fuzzy Clustering ◽

Case Analysis ◽

Clustering Methods ◽

Fuzzy Clustering Methods

The Modeling and Simulation of Data Clustering Algorithms in Data Mining with Big Data

Journal of Industrial Integration and Management ◽

10.1142/s2424862218500173 ◽

2019 ◽

Vol 04 (01) ◽

pp. 1850017 ◽

Cited By ~ 3

Author(s):

Weiru Chen ◽

Jared Oliverio ◽

Jin Ho Kim ◽

Jiayue Shen

Keyword(s):

Data Mining ◽

Big Data ◽

Data Reduction ◽

Data Clustering ◽

Clustering Algorithms ◽

High Volume ◽

Clustering Methods ◽

Data Set ◽

Processing Methods ◽

Integration Data

Big Data is a popular cutting-edge technology nowadays. Techniques and algorithms are expanding in different areas including engineering, biomedicine, and business. Due to the high volume and complexity of Big Data, it is necessary to apply data pre-processing methods when mining data. The pre-processing methods include data cleaning, data integration, data reduction, and data transformation. Data clustering is the most important step of data reduction. With data clustering, mining on the reduced data set should be more efficient yet produce quality analytical results. This paper presents the different data clustering methods and related algorithms for data mining with Big Data. Data clustering can increase the efficiency and accuracy of data mining.

Analysis of Web Log Data Mining Based on Improved Fuzzy Clustering Algorithm

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.760-762.1896 ◽

2013 ◽

Vol 760-762 ◽

pp. 1896-1901 ◽

Cited By ~ 1

Author(s):

Chuan Qi Chen

Keyword(s):

Data Mining ◽

Best Practices ◽

Fuzzy Clustering ◽

Clustering Analysis ◽

Clustering Algorithm ◽

Pattern Mining ◽

Log Data ◽

Web Log ◽

Fuzzy Clustering Analysis ◽

Fuzzy Clustering Algorithm

Fuzzy clustering analysis is a clustering approach based on finding an optimal cost function using calculus. In fuzzy clustering, each sample no longer belongs to a single class, but belongs to every class with a certain degree of membership. In this paper, Web log sequential pattern mining knowledge gained, and visitors have the same browsing mode access to cutting the interaction of users with the Web information space. The paper presents an analysis of Web log data mining based on an improved fuzzy clustering algorithm. The experiment demonstrates that the improved algorithm has better scalability.

Bezdek-Type Fuzzified Co-Clustering Algorithm

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2015.p0852 ◽

2015 ◽

Vol 19 (6) ◽

pp. 852-860 ◽

Cited By ~ 10

Author(s):

Yuchi Kanzawa

Keyword(s):

Fuzzy Clustering ◽

Spectral Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Clustering Methods ◽

Suitable Parameter ◽

Fuzzy Clustering Methods ◽

Clustering Approach ◽

Parameter Values ◽

Vectorial Data

In this study, two co-clustering algorithms based on Bezdek-type fuzzification of fuzzy clustering are proposed for categorical multivariate data. The two proposed algorithms are motivated by the fact that there are only two fuzzy co-clustering methods currently available – entropy regularization and quadratic regularization – whereas there are three fuzzy clustering methods for vectorial data: entropy regularization, quadratic regularization, and Bezdek-type fuzzification. The first proposed algorithm forms the basis of the second algorithm. The first algorithm is a variant of a spherical clustering method, with the kernelization of a maximizing model of Bezdek-type fuzzy clustering with multi-medoids. By interpreting the first algorithm in this way, the second algorithm, a spectral clustering approach, is obtained. Numerical examples demonstrate that the proposed algorithms can produce satisfactory results when suitable parameter values are selected.

Clustering Methods Using Distance-Based Similarity Measures of Single-Valued Neutrosophic Sets

Journal of Intelligent Systems ◽

10.1515/jisys-2013-0091 ◽

2014 ◽

Vol 23 (4) ◽

pp. 379-389 ◽

Cited By ~ 45

Author(s):

Jun Ye

Keyword(s):

Machine Learning ◽

Data Mining ◽

Fuzzy Sets ◽

Clustering Algorithm ◽

Distance Measure ◽

Similarity Measures ◽

Intuitionistic Fuzzy Sets ◽

Clustering Methods ◽

Neutrosophic Sets ◽

Generalized Distance

Clustering plays an important role in data mining, pattern recognition, and machine learning. Single-valued neutrosophic sets (SVNSs) are useful means to describe and handle indeterminate and inconsistent information that fuzzy sets and intuitionistic fuzzy sets cannot describe and deal with. To cluster the data represented by single-valued neutrosophic information, this article proposes single-valued neutrosophic clustering methods based on similarity measures between SVNSs. First, we define a generalized distance measure between SVNSs and propose two distance-based similarity measures of SVNSs. Then, we present a clustering algorithm based on the similarity measures of SVNSs to cluster single-valued neutrosophic data. Finally, an illustrative example is given to demonstrate the application and effectiveness of the developed clustering methods.
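
The abstract does not give the formulas, so the sketch below only assumes one plausible shape for them: each element of an SVNS is a (truth, indeterminacy, falsity) triple in [0, 1], the distance is a normalized weighted Minkowski-type distance over those triples, and the similarity is one minus the distance. The function names, the `lam` exponent, and the uniform default weights are all assumptions for illustration and may differ from the paper's definitions.

```python
def svns_distance(a, b, lam=2.0, weights=None):
    """Assumed generalized distance between two SVNSs.

    a, b    -- lists of (truth, indeterminacy, falsity) triples in [0, 1]
    lam     -- exponent > 0 (1 gives a Hamming-like, 2 a Euclidean-like form)
    weights -- per-element weights summing to 1 (uniform if omitted)
    The result lies in [0, 1] because each triple contributes at most 3.
    """
    n = len(a)
    if weights is None:
        weights = [1.0 / n] * n
    total = sum(
        w * (abs(ta - tb) ** lam + abs(ia - ib) ** lam + abs(fa - fb) ** lam)
        for w, (ta, ia, fa), (tb, ib, fb) in zip(weights, a, b))
    return (total / 3.0) ** (1.0 / lam)


def svns_similarity(a, b, lam=2.0, weights=None):
    """Assumed distance-based similarity: 1 minus the distance above."""
    return 1.0 - svns_distance(a, b, lam, weights)
```

A clustering procedure can then group SVNS-described objects whose pairwise similarity exceeds a chosen threshold, which is one common way such similarity measures are used.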

New algorithm for clustering unlabeled big data

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v24.i2.pp1054-1062 ◽

2021 ◽

Vol 24 (2) ◽

pp. 1054

Author(s):

Marwan B. Mohammed ◽

Wafaa AL-Hameed

Keyword(s):

Data Mining ◽

Big Data ◽

Hierarchical Clustering ◽

Clustering Analysis ◽

Clustering Algorithm ◽

Unlabeled Data ◽

New Techniques ◽

Clustering Techniques ◽

Analysis Techniques ◽

Hierarchical Clustering Algorithm

Clustering analysis techniques play an important role in the area of data mining. Although several clustering techniques exist, efforts continue to improve the efficiency of the clustering process or to propose new techniques that allocate objects into clusters so that two objects in the same cluster are more similar than two objects in different clusters, taking care not to duplicate the same objects in different groups while covering as much of the data as possible. This paper presents two directions. The first is to propose a new algorithm, named the MB algorithm, that collects unlabeled data and puts them into appropriate groups. The second is the creation of a lexical sequence sentence (LCS) based on semantically similar sentences, which differs from the traditional lexical word chain (LCW) based on words. The results showed that the MB algorithm generally outperformed both the hierarchical clustering algorithm and the K-means algorithm.
