Design of Library User Profile System Based on Dynamic Density Clustering Algorithm and Stream Computing

This paper proposes a novel multi-radius density clustering algorithm based on an outlier factor. The algorithm first calculates the density-similar-neighbor-based outlier factor (DSNOF) of each point in the dataset from the relationship between the density of the point and that of its neighbors, and treats every point whose DSNOF is smaller than 1 as a core point. Second, the core points are clustered by a process similar to density-based spatial clustering of applications with noise (DBSCAN), which yields a set of sub-clusters. Third, the algorithm merges these sub-clusters into clusters. Finally, the points whose DSNOF is larger than 1 are assigned to these clusters. Experiments on real datasets from the UCI Machine Learning Repository indicate that the proposed algorithm is more effective than the DBSCAN and k-means algorithms and is not strongly affected by its parameter.
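
The steps in this abstract can be illustrated with a minimal sketch. The abstract does not define DSNOF, so the code below substitutes a simple stand-in factor (a point's mean k-nearest-neighbor distance divided by the average of that quantity over its neighbors); this is an assumption, not the paper's formula. The function names, `k`, and `eps` are hypothetical, and the sub-cluster merging step is omitted because the abstract gives no merging rule.

```python
import math
from collections import deque


def knn(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def outlier_factors(points, k):
    """Stand-in outlier factor (NOT the paper's DSNOF, which is not given here).

    Values below 1 mean the point sits in a denser spot than its neighbours.
    """
    n = len(points)
    neigh = [knn(points, i, k) for i in range(n)]
    mean_d = [sum(math.dist(points[i], points[j]) for j in neigh[i]) / k
              for i in range(n)]
    factors = []
    for i in range(n):
        ref = sum(mean_d[j] for j in neigh[i]) / k
        factors.append(mean_d[i] / ref if ref > 0 else 1.0)
    return factors


def cluster(points, k=3, eps=1.5):
    """Cluster core points first, then attach the remaining points."""
    factors = outlier_factors(points, k)
    # Step 1: points with factor < 1 are treated as core points.
    core = [i for i, f in enumerate(factors) if f < 1]
    labels = [-1] * len(points)
    next_label = 0
    # Step 2: DBSCAN-like expansion restricted to core points;
    # two core points are linked when they lie within eps of each other.
    for seed in core:
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        queue = deque([seed])
        while queue:
            p = queue.popleft()
            for q in core:
                if labels[q] == -1 and math.dist(points[p], points[q]) <= eps:
                    labels[q] = next_label
                    queue.append(q)
        next_label += 1
    # Final step: every non-core point joins the cluster of its nearest core point.
    for i in range(len(points)):
        if labels[i] == -1 and core:
            nearest = min(core, key=lambda c: math.dist(points[i], points[c]))
            labels[i] = labels[nearest]
    return labels
```

Deciding core status from the factor alone, rather than from a minimum point count inside a fixed radius, is what lets regions of different density each contribute core points in this sketch.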

Multi-density Clustering Algorithm for Anomaly Detection Using KDD’99 Dataset

Advances in Computing and Communications - Communications in Computer and Information Science ◽ 10.1007/978-3-642-22709-7_60 ◽ 2011 ◽ pp. 619-630

Author(s): Santosh Kumar ◽ Sumit Kumar ◽ Sukumar Nandi

Keyword(s): Anomaly Detection ◽ Clustering Algorithm ◽ Density Clustering

A novel image segmentation method based on fast density clustering algorithm

Engineering Applications of Artificial Intelligence ◽ 10.1016/j.engappai.2018.04.023 ◽ 2018 ◽ Vol 73 ◽ pp. 92-110 ◽ Cited By ~ 16

Author(s): Jinyin Chen ◽ Haibin Zheng ◽ Xiang Lin ◽ Yangyang Wu ◽ Mengmeng Su

Keyword(s): Image Segmentation ◽ Clustering Algorithm ◽ Segmentation Method ◽ Density Clustering

Density Clustering Algorithm Based on the Dynamic Selection of Cluster Center

2019 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery (CyberC) ◽ 10.1109/cyberc.2019.00050 ◽ 2019

Author(s): Lulu Sun ◽ Ruilin Zhang

Keyword(s): Clustering Algorithm ◽ Cluster Center ◽ Dynamic Selection ◽ Density Clustering ◽ Selection Of

Feature weighted clustering for user profiling

International Journal of Modeling Simulation and Scientific Computing ◽ 10.1142/s1793962317500568 ◽ 2017 ◽ Vol 08 (04) ◽ pp. 1750056 ◽ Cited By ~ 1

Author(s): Ayse Cufoglu ◽ Mahi Lohi ◽ Colin Everiss

Keyword(s): Clustering Algorithm ◽ Personal Information ◽ Clustering Algorithms ◽ User Profile ◽ User Profiling ◽ User Profiles ◽ Related Information ◽ Weighted Clustering ◽ Feature Weight ◽ Feature Values

Personalization is the adaptation of services to fit the user's interests, characteristics and needs. The key to effective personalization is user profiling. Apart from traditional collaborative and content-based approaches, a number of classification and clustering algorithms have been used to classify user-related information in order to create user profiles. However, they are not able to produce accurate user profiles. In this paper, we present a new clustering algorithm, namely Multi-Dimensional Clustering (MDC), for user profiling. The MDC is a version of the Instance-Based Learner (IBL) algorithm that assigns weights to feature values and takes these weights into account during clustering. Three feature weighting methods are proposed for the MDC, and all three have been tested and evaluated. Simulations were conducted using two user profile datasets: a training set (10,000 instances) and a test set (1,000 instances). These datasets reflect each user's personal information, preferences and interests. Additional simulations and comparisons with existing weighted and non-weighted instance-based algorithms were carried out to demonstrate the performance of the proposed algorithm. Experimental results on the user profile datasets demonstrate that the proposed algorithm achieves better clustering accuracy than the other algorithms. This work is based on the doctoral thesis of the corresponding author.
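
The idea of an instance-based learner that weights feature values can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the MDC weighting methods, so the `value_weights` table, the default weight of 1.0, and both function names are hypothetical.

```python
def weighted_similarity(a, b, value_weights, default=1.0):
    """Sum the weights of the feature values on which two profiles agree.

    value_weights maps (feature_index, value) to a weight; values without an
    entry fall back to `default`. The weights themselves are assumed inputs.
    """
    score = 0.0
    for idx, (x, y) in enumerate(zip(a, b)):
        if x == y:
            score += value_weights.get((idx, x), default)
    return score


def assign_profile(instance, training, value_weights):
    """Return the label of the most similar stored instance (1-nearest neighbour)."""
    best_label, best_score = None, float("-inf")
    for features, label in training:
        score = weighted_similarity(instance, features, value_weights)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: (occupation, interest, device) -> profile label.
training = [
    (("student", "sports", "mobile"), "A"),
    (("retired", "news", "desktop"), "B"),
]
value_weights = {(1, "news"): 3.0}  # assumed: the value "news" is highly informative
profile = assign_profile(("student", "news", "desktop"), training, value_weights)
```

With equal weights the query would tie on the number of matching features less often than real data suggests; weighting individual values is one way to let a single informative match outweigh several uninformative ones.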

Multi-Density Clustering Algorithm Based on Grid Adjacency Relation

2010 Chinese Conference on Pattern Recognition (CCPR) ◽ 10.1109/ccpr.2010.5659206 ◽ 2010

Author(s): Guang-xing Li ◽ Yan Yang

Keyword(s): Clustering Algorithm ◽ Adjacency Relation ◽ Density Clustering

Multi-Density Clustering Algorithm Based on Grid and Boundary

2010 International Conference on Computational Intelligence and Software Engineering ◽ 10.1109/cise.2010.5676950 ◽ 2010

Author(s): Yazhou Wang ◽ Wei Wang

Keyword(s): Clustering Algorithm ◽ Density Clustering

Scalable Varied Density Clustering Algorithm for Large Datasets

Journal of Software Engineering and Applications ◽ 10.4236/jsea.2010.36069 ◽ 2010 ◽ Vol 03 (06) ◽ pp. 593-602 ◽ Cited By ~ 3

Author(s): Ahmed Fahim ◽ Abd-Elbadeeh Salem ◽ Fawzy Torkey ◽ Mohamed Ramadan ◽ Gunter Saake

Keyword(s): Clustering Algorithm ◽ Large Datasets ◽ Density Clustering

Scalable Clustering with Supervised Linkage Methods

10.1101/2021.08.01.454697 ◽ 2021

Author(s): James Anibal ◽ Alexandre Day ◽ Erol Bahadiroglu ◽ Liam O'Neill ◽ Long Phan ◽ ...

Keyword(s): Single Cell ◽ Clustering Algorithm ◽ Clustering Algorithms ◽ Biomedical Sciences ◽ New Approach ◽ Scalable Clustering ◽ Linkage Methods ◽ Density Clustering ◽ Cell Data ◽ Different Levels

Data clustering plays a significant role in biomedical sciences, particularly in single-cell data analysis. Researchers use clustering algorithms to group individual cells into populations that can be evaluated across different levels of disease progression, drug response, and other clinical statuses. In many cases, multiple sets of clusters must be generated to assess varying levels of cluster specificity. For example, there are many subtypes of leukocytes (e.g., T cells), whose individual preponderance and phenotype must be assessed for statistical or functional significance. In this report, we introduce a novel hierarchical density clustering algorithm (HAL-x) that uses supervised linkage methods to build a cluster hierarchy on raw single-cell data. With this new approach, HAL-x can quickly predict multiple sets of labels for immense datasets, achieving a considerable improvement in computational efficiency on large datasets compared to existing methods. We also show that cell clusters generated by HAL-x yield near-perfect F1-scores when classifying different clinical statuses based on single-cell profiles. Our hierarchical density clustering algorithm achieves high accuracy in single-cell classification in a scalable, tunable and rapid manner. We make HAL-x publicly available at: https://pypi.org/project/hal-x/
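
For readers unfamiliar with the F1-score mentioned above, it is the harmonic mean of precision and recall for a chosen positive class. The sketch below computes it from scratch; it does not use or reproduce the HAL-x API, and the function name and the example labels are hypothetical.

```python
def f1_score(y_true, y_pred, positive):
    """F1 = 2 * precision * recall / (precision + recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so report 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical clinical-status labels for four samples.
truth = ["sick", "sick", "healthy", "healthy"]
predicted = ["sick", "healthy", "healthy", "healthy"]
score = f1_score(truth, predicted, positive="sick")
```

A score near 1 requires both few false positives and few false negatives, which is why it is a stricter summary than accuracy when clinical classes are imbalanced.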
