Privacy-preserving constrained spectral clustering algorithm for large-scale data sets

2020 ◽  
Vol 14 (3) ◽  
pp. 321-331 ◽  
Author(s):  
Ji Li ◽  
Jianghong Wei ◽  
Mao Ye ◽  
Wenfen Liu ◽  
Xuexian Hu
2017 ◽  
Vol 135 ◽  
pp. 77-88 ◽  
Author(s):  
Wenfen Liu ◽  
Mao Ye ◽  
Jianghong Wei ◽  
Xuexian Hu

Author(s):  
Hao Liu ◽  
Satoshi Oyama ◽  
Masahito Kurihara ◽  
Haruhiko Sato

Clustering is an important tool for data analysis, and many clustering techniques have been proposed over the years. Among them are density-based clustering methods, which offer several benefits: the number of clusters need not be specified in advance, the detected clusters can have arbitrary shapes, and outliers can be detected and removed. Recently, density-based algorithms were extended with fuzzy set theory, which has made these algorithms more robust. However, density-based clustering algorithms usually have a time complexity of O(n²), where n is the number of data points in the data set, which makes them unsuitable for large-scale data sets. In this paper, a novel clustering algorithm called landmark fuzzy neighborhood DBSCAN (landmark FN-DBSCAN) is proposed. Each landmark represents a subset of the input data set, which makes the algorithm efficient on large-scale data sets. We give a theoretical analysis of time complexity and space complexity, which shows that both are linear in the size of the data set. The experiments show that landmark FN-DBSCAN is much faster than FN-DBSCAN and provides very good clustering quality.
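The landmark idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' landmark FN-DBSCAN: the fuzzy neighborhood membership function is replaced by plain point counts, and the function names and the parameters `radius`, `eps`, and `min_weight` are hypothetical choices made for illustration. Each point is attached to a nearby landmark in a single pass, and a DBSCAN-style expansion then runs over the landmarks only, so the quadratic step depends on the number of landmarks rather than on n.

```python
import math


def select_landmarks(points, radius):
    """Single pass: a point joins the first landmark within `radius`,
    otherwise it becomes a new landmark (assumed selection rule)."""
    landmarks = []  # coordinates of the landmarks
    owner = []      # owner[i] = index of the landmark representing points[i]
    for p in points:
        for j, lm in enumerate(landmarks):
            if math.dist(p, lm) <= radius:
                owner.append(j)
                break
        else:
            landmarks.append(p)
            owner.append(len(landmarks) - 1)
    return landmarks, owner


def cluster_landmarks(landmarks, weights, eps, min_weight):
    """DBSCAN-style expansion over landmarks. The density of a landmark is
    the total weight (number of represented points) within `eps` of it."""
    m = len(landmarks)
    neighbors = [
        [j for j in range(m) if math.dist(landmarks[i], landmarks[j]) <= eps]
        for i in range(m)
    ]

    def is_core(i):
        return sum(weights[j] for j in neighbors[i]) >= min_weight

    labels = [-1] * m  # -1 marks noise / unassigned
    cluster = 0
    for i in range(m):
        if labels[i] != -1 or not is_core(i):
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            cur = stack.pop()
            if not is_core(cur):
                continue  # border landmark: labeled but not expanded
            for j in neighbors[cur]:
                if labels[j] == -1:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels


def landmark_dbscan(points, radius, eps, min_weight):
    """Cluster the landmarks, then give each point its landmark's label."""
    landmarks, owner = select_landmarks(points, radius)
    weights = [0] * len(landmarks)
    for j in owner:
        weights[j] += 1
    landmark_labels = cluster_landmarks(landmarks, weights, eps, min_weight)
    return [landmark_labels[j] for j in owner]
```

In this sketch the selection pass costs on the order of n times the number of landmarks, and the expansion is quadratic only in the number of landmarks, so the total is linear in n only under the assumption that the landmark count stays bounded as the data set grows. For example, calling `landmark_dbscan(points, radius=0.15, eps=0.5, min_weight=3)` on a list of 2-D tuples returns one cluster label per point, with -1 for points treated as outliers.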


Author(s):  
Vo Ngoc Phu ◽  
Vo Thi Ngoc Tran

Artificial intelligence (ARTINT) and information have been prominent fields for many years. One reason is that many other areas have advanced quickly on the basis of ARTINT and information, creating significant value over that time. This value has increasingly been used by national economies around the world, other sciences, companies, organizations, etc. Many massive corporations and large organizations have been established rapidly as these economies have grown strongly. Unsurprisingly, these corporations and organizations have generated large amounts of information and large-scale data sets. Processing and storing them successfully has been a major challenge for many commercial applications, studies, etc. To handle this problem, many algorithms have been proposed for processing these big data sets.


2017 ◽  
Author(s):  
Shirley M. Matteson ◽  
Sonya E. Sherrod ◽  
Sevket Ceyhun Cetin
