Chapter 12: Grid-Based Clustering Algorithms

Abstract: Data clustering is an important method for discovering naturally occurring structures in datasets. One of the most popular approaches is grid-based clustering. Methods of this kind are characterized by fast processing times and can discover clusters of arbitrary shape, which allows them to be used in many different applications. Researchers have created many variants of grid-based clustering. However, the key issue is the right choice of the number of grid cells. This paper proposes a novel grid-based algorithm that automatically determines the number of grid cells. The method is based on the kdist function, which computes the distance between each element of a dataset and its kth nearest neighbor. Experimental results obtained for several different datasets confirm the very good performance of the newly proposed method.
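As an illustration of the kdist function described in the abstract, the following is a minimal Python sketch, assuming Euclidean distance and a brute-force neighbor search. The function name `kdist`, the sample points, and the choice of k are hypothetical; the abstract does not say how the kdist values are turned into a number of grid cells, so that step is not shown.

```python
import math


def kdist(points, k):
    """Return, for each point, the distance to its k-th nearest neighbor.

    Assumes Euclidean distance; brute force, O(n^2 log n) in the number of points.
    """
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, in ascending order.
        others = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return result


# Hypothetical sample data: three nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
kd = kdist(points, k=1)
```

Sorting the returned values gives the familiar sorted k-dist curve, in which isolated points appear as a sharp rise at the high end.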

Comparative Analysis of Clustering Techniques in Cloud For Effective Load Balancing

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i3.4.14674 ◽

2018 ◽

Vol 7 (3.4) ◽

pp. 47 ◽

Cited By ~ 2

Author(s):

Akankshya Aparajita ◽

Shrabanee Swagatika ◽

Debabrata Singh

Keyword(s):

Time Complexity ◽

Clustering Algorithms ◽

Large Datasets ◽

Data Abstraction ◽

Computing Environment ◽

Validation Data ◽

Cloud Computing Environment ◽

Clustering Techniques ◽

Detailed Review ◽

Grid Based

Clustering is an important procedure in data mining, in which the information in large datasets is transformed into meaningful and concise data. It involves activities such as pattern representation, the application of clustering algorithms and their validation, data abstraction, and finally result generation. Clustering algorithms fall into many categories, such as partition-based, hierarchical-based, density-based, and grid-based. Partition-based clustering is centroid-based. Hierarchical-based clustering is link-based. Density-based clustering focuses on areas of higher density in the dataset. Grid-based clustering relies on the size of the grid. In this paper, we discuss different clustering techniques and give a detailed review of the partition-based and hierarchical-based algorithms. Finally, we compare clustering algorithms on the basis of attributes such as time complexity, capacity for handling large datasets, scalability, and sensitivity to outliers and noise, and we discuss the results obtained for a particular dataset in a cloud computing environment.

On Density-Based Clustering Algorithms over Evolving Data Streams: A Summarization Paradigm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.263-266.2234 ◽

2012 ◽

Vol 263-266 ◽

pp. 2234-2237 ◽

Cited By ~ 1

Author(s):

Amineh Amini ◽

Teh Ying Wah

Keyword(s):

Data Streams ◽

Arbitrary Shape ◽

Clustering Algorithms ◽

Quality Metrics ◽

Algorithm Performance ◽

Density Based Clustering ◽

Clustering Quality ◽

Clustering Data ◽

Mining Data Streams ◽

Grid Based

Clustering is one of the prominent tasks in mining data streams. Among the various clustering algorithms that have been developed, density-based methods can discover clusters of arbitrary shape and detect outliers. Recently, various algorithms have adopted density-based methods for clustering data streams. In this paper, we look into three notable algorithms in two groups, micro-clustering and grid-based: DenStream, D-Stream, and MR-Stream. We compare the algorithms in terms of algorithm performance and clustering quality metrics.

Data Clustering

Handbook of Research on Innovations in Database Technologies and Applications ◽

10.4018/978-1-60566-242-8.ch060 ◽

2009 ◽

pp. 562-572

Author(s):

Yanchang Zhao ◽

Longbing Cao ◽

Huaifeng Zhang ◽

Chengqi Zhang

Keyword(s):

Hierarchical Clustering ◽

Data Clustering ◽

Clustering Algorithms ◽

Future Trends ◽

Clustering Techniques ◽

Density Based Clustering ◽

Data Stream Clustering ◽

Semisupervised Clustering ◽

Definition Of ◽

Grid Based

Clustering is one of the most important techniques in data mining. This chapter presents a survey of popular approaches for data clustering, including well-known clustering techniques, such as partitioning clustering, hierarchical clustering, density-based clustering and grid-based clustering, and recent advances in clustering, such as subspace clustering, text clustering and data stream clustering. The major challenges and future trends of data clustering will also be introduced in this chapter. The remainder of this chapter is organized as follows. The background of data clustering will be introduced in Section 2, including the definition of clustering, categories of clustering techniques, features of good clustering algorithms, and the validation of clustering. Section 3 will present main approaches for clustering, which range from the classic partitioning and hierarchical clustering to recent approaches of bi-clustering and semisupervised clustering. Challenges and future trends will be discussed in Section 4, followed by the conclusions in the last section.

12. Grid-Based Clustering Algorithms

Data Clustering: Theory, Algorithms, and Applications ◽

10.1137/1.9780898718348.ch12 ◽

2007 ◽

pp. 209-217

Keyword(s):

Clustering Algorithms ◽

Grid Based

Fast Grid-Based Scan Statistic for Detection of Significant Spatial Disease Clusters

PsycEXTRA Dataset ◽

10.1037/e307182005-065 ◽

2004 ◽

Cited By ~ 7

Author(s):

Daniel B. Neill ◽

A. Moore

Keyword(s):

Scan Statistic ◽

Disease Clusters ◽

Grid Based

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger and the Porter stemmer are used for tagging. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, using a mean threshold. This paper incorporates many parameters for finding the similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and a comparison is carried out. The SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work shows a considerable improvement, achieving 91.2%.
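The cosine similarity measure and mean threshold mentioned in the abstract can be sketched as follows. This is only an illustrative Python sketch under stated assumptions: sentences are compared as plain bag-of-words count vectors, the example sentences are invented, and the WordNet lookup, Jiang-Conrath measure, POS tagging, and stemming described in the abstract are left out. The names `cosine_similarity` and `similar_pairs` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the word-count vectors of two sentences."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the words the two sentences share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Invented example sentences (not taken from the paper's datasets).
sentences = [
    "the bank approved the loan",
    "the bank rejected the loan",
    "we walked along the river bank",
]

# Similarity of every sentence pair, keyed by the pair of indices.
sims = {
    (i, j): cosine_similarity(sentences[i], sentences[j])
    for i, j in combinations(range(len(sentences)), 2)
}

# Assumed reading of "mean threshold": keep pairs at or above the mean similarity.
mean_sim = sum(sims.values()) / len(sims)
similar_pairs = [pair for pair, s in sims.items() if s >= mean_sim]
```

With these sentences, only the two loan-related sentences exceed the mean, which mirrors the idea of separating the financial and river senses of "bank".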

Variability Analysis of Climate Extreme Index using Downscaled Multi-models and Grid-based CMIP5 Climate Change Scenario Data

Journal of Climate Change Research ◽

10.15531/ksccr.2020.11.2.123 ◽

2020 ◽

Vol 11 (2) ◽

pp. 123-132

Author(s):

Jae-Pil Cho ◽

Jae-Uk Kim ◽

Soon-Kun Choi ◽

Sye-Woon Hwang ◽

Hui-Cheul Jung

Keyword(s):

Climate Change ◽

Climate Change Scenario ◽

Extreme Index ◽

Change Scenario ◽

Climate Extreme ◽

Variability Analysis ◽

Grid Based

Reciprocal Polarizable Embedding with a Transferable H2O Potential Function I: Formulation & Tests on Dimer

10.26434/chemrxiv.9206417 ◽

2019 ◽

Author(s):

Elvar Jónsson ◽

Asmus Ougaard Dohn ◽

Hannes Jonsson

Keyword(s):

Potential Function ◽

Dft Method ◽

Multipole Expansion ◽

Energy Functional ◽

Real Space ◽

Functional Formulation ◽

General Energy ◽

Projector Augmented Wave ◽

Grid Based ◽

Polarizable Embedding

This work describes a general energy functional formulation of a polarizable embedding QM/MM scheme, as well as an implementation where a real-space Grid-based Projector Augmented Wave (GPAW) DFT method is coupled with a potential function for H2O based on a Single Center Multipole Expansion (SCME) of the electrostatics, including anisotropic dipole and quadrupole polarizability.
