NS-DBSCAN: A Density-Based Clustering Algorithm in Network Space

Spatial clustering analysis is an important spatial data mining technique. It divides objects into clusters according to their similarities in both location and attributes. It plays an essential role in density distribution identification, hot-spot detection, and trend discovery. Spatial clustering algorithms in Euclidean space are relatively mature, while those in network space are less well researched. This study extended a well-known clustering algorithm, density-based spatial clustering of applications with noise (DBSCAN), to network space and proposed a new clustering algorithm named network space DBSCAN (NS-DBSCAN). The NS-DBSCAN algorithm used a strategy similar to that of the DBSCAN algorithm. Furthermore, it provided a new technique for visualizing the density distribution and indicating the intrinsic clustering structure. Tested on points of interest (POI) in Hanyang district, Wuhan, China, the NS-DBSCAN algorithm accurately detected the high-density regions. The NS-DBSCAN algorithm was compared with the classical hierarchical clustering algorithm and the recently proposed density-based clustering algorithm with network-constraint Delaunay triangulation (NC_DT) in terms of effectiveness. The hierarchical clustering algorithm was effective only when the cluster number was well specified; otherwise, it might separate a natural cluster into several parts. The NC_DT method excessively gathered most objects into a huge cluster. Quantitative evaluation using four indicators (the silhouette, the R-squared index, the Davies–Bouldin index, and the clustering scheme quality index) indicated that the NS-DBSCAN algorithm was superior to the hierarchical clustering and NC_DT algorithms.
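
The idea of running DBSCAN over network distance rather than Euclidean distance can be illustrated with a minimal sketch. This is not the NS-DBSCAN implementation from the paper: it assumes every event point is itself a node of a weighted undirected graph, stored as a dict of dicts, and it omits the density ordering chart. The names `network_neighbors`, `network_dbscan`, `eps`, and `min_pts` are chosen here for illustration only.

```python
import heapq
from collections import deque


def network_neighbors(graph, source, eps):
    """Nodes whose shortest-path distance from source is <= eps (bounded Dijkstra)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            # Never expand beyond eps: farther nodes cannot be neighbors.
            if nd <= eps and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return set(dist)


def network_dbscan(graph, eps, min_pts):
    """Label each node with a cluster id (0, 1, ...) or -1 for noise."""
    labels = {}
    cluster_id = 0
    for p in graph:
        if p in labels:
            continue
        neighbors = network_neighbors(graph, p, eps)
        if len(neighbors) < min_pts:
            labels[p] = -1  # provisional noise; may become a border point
            continue
        labels[p] = cluster_id
        queue = deque(neighbors - {p})
        while queue:
            q = queue.popleft()
            if labels.get(q) == -1:
                labels[q] = cluster_id  # noise reclaimed as a border point
            if q in labels:
                continue
            labels[q] = cluster_id
            q_neighbors = network_neighbors(graph, q, eps)
            if len(q_neighbors) >= min_pts:  # q is a core point: keep expanding
                queue.extend(q_neighbors)
        cluster_id += 1
    return labels


# Hypothetical toy network: two tight groups joined by one long edge, plus an isolated node.
toy = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 10.0},
    "d": {"c": 10.0, "e": 1.0},
    "e": {"d": 1.0, "f": 1.0},
    "f": {"e": 1.0},
    "g": {},
}
labels = network_dbscan(toy, eps=1.5, min_pts=2)
```

On this toy network, `a`, `b`, `c` fall into one cluster and `d`, `e`, `f` into another, because the long edge between `c` and `d` exceeds `eps`; the isolated node `g` is labeled noise.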

A method for efficient clustering of spatial data in network space

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202806 ◽

2021 ◽

pp. 1-18

Author(s):

Trang T.D. Nguyen ◽

Loan T.T. Nguyen ◽

Anh Nguyen ◽

Unil Yun ◽

Bay Vo

Keyword(s):

Spatial Data ◽

Euclidean Distance ◽

Clustering Algorithm ◽

Spatial Clustering ◽

Spatial Data Mining ◽

Spatial Data Analysis ◽

Clustering Methods ◽

Dbscan Algorithm ◽

Density Based Clustering ◽

Network Space

Spatial clustering is one of the main techniques for spatial data mining and spatial data analysis. However, existing spatial clustering methods primarily focus on points distributed in planar space and measured by Euclidean distance. Recently, NS-DBSCAN was developed to cluster spatial point events in network space, based on the well-known Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. Compared with the NC_DT (Network-Constraint Delaunay Triangulation) clustering algorithm, NS-DBSCAN efficiently solves the problem of clustering network-constrained spatial points, and it visualizes the intrinsic clustering structure of spatial data by constructing density ordering charts. However, the main drawback of this algorithm is that objects that do not belong to any cluster cannot be removed while the data are processed, which wastes time, particularly when the dataset is large. To make the algorithm more efficient, we recommend removing edges that are longer than the threshold and eliminating low-density points from the density ordering table when forming clusters, along with other effective techniques. In this paper, we develop a theorem to determine the maximum length of an edge in a road segment. Based on this theorem, an algorithm is proposed to greatly improve the performance of the density-based clustering algorithm in network space (NS-DBSCAN). Experiments with our proposed algorithm, carried out in collaboration with Ho Chi Minh City, Vietnam, yield the same clustering results as NS-DBSCAN but with a shorter execution time.
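
The two pruning steps mentioned in the abstract can be sketched roughly as follows. This is an illustrative assumption, not the authors' algorithm: the abstract does not state the theorem or its threshold, so `max_len` and `min_density` are placeholder parameters, and the density ordering table is assumed here to be a plain list of `(point_id, density)` pairs.

```python
def prune_long_edges(graph, max_len):
    """Return a copy of a dict-of-dicts graph without edges longer than max_len.

    max_len is a placeholder for the threshold the paper derives;
    its actual value is not given in the abstract.
    """
    return {
        u: {v: w for v, w in nbrs.items() if w <= max_len}
        for u, nbrs in graph.items()
    }


def drop_low_density(density_table, min_density):
    """Keep only rows whose density reaches min_density, preserving order.

    density_table is assumed to be a list of (point_id, density) pairs.
    """
    return [(pid, dens) for pid, dens in density_table if dens >= min_density]


# Hypothetical inputs for illustration.
roads = {"a": {"b": 1.0, "c": 5.0}, "b": {"a": 1.0}, "c": {"a": 5.0}}
pruned = prune_long_edges(roads, max_len=2.0)

table = [("p1", 4), ("p2", 1), ("p3", 3)]
dense_only = drop_low_density(table, min_density=3)
```

Both helpers return new objects rather than mutating their inputs, so the original network and table stay available for comparison with the unpruned run.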

Redrawing hot spots of crime in Dallas, Texas

10.31235/osf.io/nmq8r ◽

2020 ◽

Author(s):

Andrew Palmer Wheeler ◽

Sydney Reuter

Keyword(s):

Hierarchical Clustering ◽

Hot Spots ◽

Clustering Algorithm ◽

Predictive Accuracy ◽

Hot Spot ◽

Police Department ◽

Enforcement Cost ◽

Cost Of Crime ◽

Hierarchical Clustering Algorithm ◽

Street Segments

In this work we evaluate the predictive capability of identifying long-term, micro-place hot spots in Dallas, Texas. We create hot spots using a hierarchical clustering algorithm, with law enforcement cost of crime estimates as weights. Relative to the much larger current hot spot areas defined by the Dallas Police Department, our identified hot spots are much smaller (under 3 square miles) and capture crime harm at a higher density according to the Predictive Accuracy Index statistic. We also show that the hierarchical clustering algorithm captures a wide array of hot spot types: some are one or two addresses, some are street segments, and others are agglomerations of larger areas. This suggests that identifying hot spots based on a specific unit of aggregation (e.g. addresses, street segments) may be less efficient in practice than using a hierarchical clustering technique. Code and data to reproduce the analysis can be downloaded from https://www.dropbox.com/sh/kcask6pinaaaz4v/AAC4CXk6NzUweyld2n4OznzWa?dl=0
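
For readers unfamiliar with the Predictive Accuracy Index (PAI), the statistic is commonly defined as the share of crime captured by the hot spots divided by the share of the study area they cover. The sketch below uses that standard definition; the function name and the numbers in the usage lines are hypothetical and are not taken from the paper, which weights crimes by cost rather than counting them.

```python
def predictive_accuracy_index(crime_in_hot_spots, total_crime, hot_spot_area, total_area):
    """PAI = (share of crime captured) / (share of area covered).

    The crime arguments may be counts or cost-weighted harm totals,
    as long as both use the same unit.
    """
    hit_rate = crime_in_hot_spots / total_crime
    area_share = hot_spot_area / total_area
    return hit_rate / area_share


# Hypothetical example: hot spots covering 1% of the area capture 25% of the harm.
pai = predictive_accuracy_index(
    crime_in_hot_spots=250, total_crime=1000, hot_spot_area=3, total_area=300
)
```

A PAI of 1 means the hot spots do no better than their share of the area would suggest; larger values mean harm is concentrated more densely inside them.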

Handling WSD using Hierarchical Clustering Algorithm with sentences

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset1841120 ◽

2018 ◽

pp. 83-88

Author(s):

Mohana Priya K ◽

Pooja Ragavi S ◽

Krishna Priya G

Keyword(s):

Hierarchical Clustering ◽

Similarity Measure ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Cosine Similarity Measure ◽

Hierarchical Clustering Algorithm ◽

Multiple Levels ◽

Pos Tagger ◽

Sentence Clustering ◽

The Right

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed and are used in different applications. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels for accuracy. A POS tagger and the Porter stemmer are used for tagging and stemming. The WordNet dictionary is used to determine similarity through the Jiang–Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity measure value, using a mean threshold. This paper incorporates many parameters for finding the similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and a comparison is made. SemCor and machine learning datasets are employed. Compared with previous results for WSD, our work improves considerably, reaching 91.2%.
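
As a rough illustration of the cosine similarity measure named in the abstract, the sketch below compares two sentences as bag-of-words count vectors. It is a simplification under stated assumptions: tokenization is plain whitespace splitting, and the POS tagging, stemming, WordNet lookup, and Jiang–Conrath measure used in the paper are omitted. The function name is chosen here for illustration.

```python
import math
from collections import Counter


def cosine_similarity(sentence_a, sentence_b):
    """Cosine of the angle between two bag-of-words count vectors."""
    a = Counter(sentence_a.lower().split())
    b = Counter(sentence_b.lower().split())
    # Only words shared by both sentences contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty sentence is treated as similar to nothing
    return dot / (norm_a * norm_b)


score = cosine_similarity("the bank of the river", "the river bank")
unrelated = cosine_similarity("deposit money", "muddy shore")
```

Sentences with no words in common score 0, and identical sentences score 1; a grouping step could then merge the pair with the highest score above a chosen threshold.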

Based on a hierarchical clustering algorithm: detecting the community structure of a resting state brain network

Future Computer and Information Technology ◽

10.2495/icfcit130781 ◽

2013 ◽

Author(s):

Wenzhao Liu ◽

Limin Niu ◽

Junjie Chen

Keyword(s):

Community Structure ◽

Hierarchical Clustering ◽

Resting State ◽

Clustering Algorithm ◽

Brain Network ◽

Hierarchical Clustering Algorithm

Evaluation of students programming skills on a computer programming course with a hierarchical clustering algorithm

2020 IEEE Frontiers in Education Conference (FIE) ◽

10.1109/fie44824.2020.9274130 ◽

2020 ◽

Author(s):

Davi Bernardo Silva ◽

Carlos N. Silla

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Computer Programming ◽

Programming Skills ◽

Hierarchical Clustering Algorithm

Hesitant Fuzzy Linguistic Agglomerative Hierarchical Clustering Algorithm and Its Application in Judicial Practice

Mathematics ◽

10.3390/math9040370 ◽

2021 ◽

Vol 9 (4) ◽

pp. 370

Author(s):

Shuangsheng Wu ◽

Jie Lin ◽

Zhenyu Zhang ◽

Yushu Yang

Keyword(s):

Hierarchical Clustering ◽

Clustering Algorithm ◽

Clustering Algorithms ◽

Agglomerative Hierarchical Clustering ◽

Research Gaps ◽

Judicial Practice ◽

Linguistic Term ◽

Clustering Effect ◽

Hierarchical Clustering Algorithm ◽

Fuzzy Linguistic

Fuzzy clustering algorithms have become a research hotspot in many fields because of their better clustering effect and data expression ability. However, little research focuses on the clustering of hesitant fuzzy linguistic term sets (HFLTSs). To fill this research gap, we extend the data type of clustering to hesitant fuzzy linguistic information. We propose a hesitant fuzzy linguistic agglomerative hierarchical clustering algorithm. Furthermore, we propose a hesitant fuzzy linguistic Boole matrix clustering algorithm and compare the two clustering algorithms. The proposed clustering algorithms are applied in the field of judicial execution, where they provide decision support for the executive judge in determining the focus of investigation and control. A clustering example verifies the algorithms' effectiveness in the context of hesitant fuzzy linguistic decision information.
