A Fast Multiscale Clustering Approach Based on DBSCAN

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Runzi Chen ◽  
Shuliang Zhao ◽  
Meishe Liang

Multiscale analysis lets people observe objects or problems from different perspectives, so clustering multiscale data has practical significance. At present, little research addresses the clustering of large-scale data when clustering results for the small-scale datasets have already been obtained. Clustering large-scale datasets with traditional methods has two disadvantages: (1) the clustering results of the small-scale datasets are not utilized, and (2) the traditional method incurs more running overhead. To address these shortcomings, this paper proposes a multiscale clustering framework based on DBSCAN. The framework uses DBSCAN to cluster small-scale datasets, then introduces the Scaling-Up Cluster Centers (SUCC) algorithm, which generates cluster centers of large-scale datasets by merging the clustering results of small-scale datasets rather than mining the raw large-scale datasets. We show experimentally that, compared to the traditional algorithm DBSCAN and the leading algorithms DBSCAN++ and HDBSCAN, SUCC provides not only competitive performance but also reduced computational cost. In addition, under the guidance of experts, SUCC is more competitive in accuracy.
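The abstract does not spell out SUCC itself, so the following is only a rough sketch of the general idea it describes: cluster each small-scale dataset with DBSCAN, then build large-scale centers from those results without revisiting the raw points. The greedy distance-threshold merge, the `merge_radius` parameter, the toy data, and all function names are assumptions made for illustration, not the authors' algorithm.

```python
import math

def dbscan(points, eps, min_pts):
    """Plain DBSCAN. Returns one label per point: cluster id >= 0, or -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels

def cluster_centers(points, labels):
    """Return (center, size) for every non-noise cluster."""
    groups = {}
    for p, lab in zip(points, labels):
        if lab >= 0:
            groups.setdefault(lab, []).append(p)
    return [(tuple(sum(c) / len(g) for c in zip(*g)), len(g)) for g in groups.values()]

def merge_centers(weighted_centers, merge_radius):
    """Assumed merge rule: fold each center into the first merged center within merge_radius."""
    merged = []
    for center, weight in weighted_centers:
        for k, (m, w) in enumerate(merged):
            if math.dist(center, m) <= merge_radius:
                total = w + weight
                merged[k] = (tuple((a * w + b * weight) / total for a, b in zip(m, center)), total)
                break
        else:
            merged.append((center, weight))
    return merged

# Two hypothetical small-scale datasets covering the same two dense regions.
small_a = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
small_b = [(0.5, 0.5), (1, 1), (0, 0.5), (10.5, 10.5), (11, 11), (10, 10.5)]

small_scale_results = []
for dataset in (small_a, small_b):
    labels = dbscan(dataset, eps=1.5, min_pts=3)
    small_scale_results.extend(cluster_centers(dataset, labels))

# Large-scale centers are derived from the small-scale results only.
large_scale_centers = merge_centers(small_scale_results, merge_radius=2.0)
```

The merge step touches only one weighted center per small-scale cluster, which is where a saving over re-clustering the raw large-scale data would come from in a scheme of this kind.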

Author(s):  
De-Ming Liang ◽  
Yu-Feng Li

Label propagation spreads soft labels from a few labeled data points to a large amount of unlabeled data according to the intrinsic graph structure. Nonetheless, most label propagation solutions work on relatively small-scale data and fail to cope with many real applications, such as social network analysis, where graphs usually have millions of nodes. In this paper, we propose a novel algorithm named \algo to deal with large-scale data. A lightweight iterative process derived from the well-known stochastic gradient descent strategy is used to reduce memory overhead and accelerate the solving process. We also give a theoretical analysis of the necessity of the warm-start technique for label propagation. Experiments show that our algorithm can handle million-scale graphs in a few seconds while achieving performance highly competitive with existing algorithms.
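As background for the first sentence, here is a minimal sketch of classic iterative label propagation on a tiny graph. It is not the paper's algorithm: the stochastic-gradient variant and the warm start are not described in enough detail here to reproduce. The function name, the neighbor-averaging update, and the toy path graph are all assumptions for illustration.

```python
def propagate_labels(adjacency, seed_labels, num_classes, iterations=100):
    """Spread soft labels from seed nodes by repeatedly averaging neighbor scores.

    adjacency: list where adjacency[i] is the list of neighbors of node i.
    seed_labels: dict mapping a labeled node to its class index (kept fixed).
    """
    n = len(adjacency)
    # Unlabeled nodes start uniform; labeled nodes are one-hot.
    scores = [[1.0 / num_classes] * num_classes for _ in range(n)]
    for node, cls in seed_labels.items():
        scores[node] = [1.0 if k == cls else 0.0 for k in range(num_classes)]

    for _ in range(iterations):
        updated = []
        for node in range(n):
            nbrs = adjacency[node]
            if node in seed_labels or not nbrs:
                updated.append(scores[node])  # seeds are clamped; isolated nodes unchanged
            else:
                updated.append(
                    [sum(scores[j][k] for j in nbrs) / len(nbrs) for k in range(num_classes)]
                )
        scores = updated
    return scores

# Hypothetical path graph 0-1-2-3-4-5 with one labeled node at each end.
adjacency = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4]]
soft = propagate_labels(adjacency, seed_labels={0: 0, 5: 1}, num_classes=2)
predicted = [max(range(2), key=lambda k: row[k]) for row in soft]
```

Every sweep here visits all nodes and keeps a full score table in memory, which is the kind of cost that becomes a problem on graphs with millions of nodes.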


2020 ◽  
pp. 1-11
Author(s):  
Jingwen Hou

At present, online education evaluation models are insufficient when dealing with small-scale evaluation data sets. To discriminate the learner’s learning state, this paper further studies machine learning methods for online teaching and introduces an adaptive learning rate and momentum terms into the gradient descent method of the BP neural network to improve the model's convergence rate. Moreover, this study proposes a deep neural network model to deal with complex, high-dimensional, large-scale data sets. For supervised prediction, the study uses support vector regression as the predictor and maps complex non-linear relationships into a high-dimensional space to obtain a linear relationship similar to that in a low-dimensional space. In addition, small-scale teaching quality evaluation data sets and large-scale data sets are input into the model for experiments. Finally, the proposed model is compared with other shallow models. The results show that the proposed model is effective and advantageous in evaluating teaching quality in universities and in processing large-scale data sets.
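The abstract does not give the exact update rule, so the sketch below shows one common way to combine a momentum term with an adaptive learning rate: grow the rate after a step that lowers the loss, and shrink it, discarding the step, after one that raises it. The toy quadratic loss, the constants (`grow`, `shrink`, `momentum`), and the function names are all assumptions, and no neural network is involved.

```python
def loss(w):
    # Hypothetical ill-conditioned quadratic with minimum at (3, -1).
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2

def grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]

def momentum_gd_adaptive(w, lr=0.01, momentum=0.9, grow=1.05, shrink=0.7, steps=300):
    """Gradient descent with momentum and a loss-driven adaptive learning rate."""
    velocity = [0.0] * len(w)
    history = [loss(w)]
    for _ in range(steps):
        g = grad(w)
        # Momentum: blend the previous update direction with the new gradient step.
        velocity = [momentum * v - lr * gi for v, gi in zip(velocity, g)]
        candidate = [wi + vi for wi, vi in zip(w, velocity)]
        candidate_loss = loss(candidate)
        if candidate_loss <= history[-1]:
            w = candidate           # accept the step and speed up
            lr *= grow
            history.append(candidate_loss)
        else:
            velocity = [0.0] * len(w)  # reject the step, reset momentum, slow down
            lr *= shrink
            history.append(history[-1])
    return w, history

weights, loss_history = momentum_gd_adaptive([0.0, 0.0])
```

Because rejected steps are discarded, the recorded loss never increases; how quickly it falls depends on the assumed constants.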


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Yue Hu ◽  
Ge Peng ◽  
Zehua Wang ◽  
Yanrong Cui ◽  
Hang Qin

For processing the growing avalanche of data in large datasets, the k nearest neighbors (KNN) algorithm is a particularly expensive operation for both classification and regression problems. To predict the values of new data points, it calculates the feature similarity between each object in the test dataset and each object in the training dataset. However, because of this computational cost, a single computer cannot cope with large-scale datasets. In this paper, we propose an adaptive vKNN algorithm, which builds on the Voronoi diagram under the MapReduce parallel framework and makes full use of the advantages of parallel computing in processing large-scale data. In the partition selection process, we design a new predictive strategy for sample points to find the optimal relevant partition. We can then effectively identify irrelevant data, reduce KNN join computation, and improve operating efficiency. Finally, we conduct extensive experiments on the cluster using a large number of 54-dimensional datasets. The experimental results show that our proposed method is effective and scalable while maintaining accuracy.
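To illustrate how a Voronoi-style partition can cut KNN work, the sketch below assigns each training point to its nearest pivot and skips whole cells that provably cannot hold a closer neighbor, using the triangle inequality. It runs in a single process, omits MapReduce, and does not reproduce the paper's predictive partition-selection strategy; the pivots, data, and function names are assumptions.

```python
import heapq
import math

def build_cells(train, pivots):
    """Assign each training point to its nearest pivot (its Voronoi cell)."""
    cells = [[] for _ in pivots]
    for p in train:
        nearest = min(range(len(pivots)), key=lambda i: math.dist(p, pivots[i]))
        cells[nearest].append(p)
    # Radius of a cell: distance from its pivot to its farthest member.
    radii = [
        max((math.dist(p, pivots[i]) for p in cell), default=0.0)
        for i, cell in enumerate(cells)
    ]
    return cells, radii

def knn_query(query, k, pivots, cells, radii):
    """Exact k nearest neighbors, visiting cells from nearest pivot to farthest."""
    order = sorted(range(len(pivots)), key=lambda i: math.dist(query, pivots[i]))
    best = []      # max-heap of the k best so far, stored as (-distance, point)
    scanned = 0    # number of cells actually examined
    for i in order:
        # Triangle inequality: no point in cell i is closer than this bound.
        lower_bound = math.dist(query, pivots[i]) - radii[i]
        if len(best) == k and lower_bound > -best[0][0]:
            continue  # the whole cell is irrelevant for this query
        scanned += 1
        for p in cells[i]:
            d = math.dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in best), scanned

# Hypothetical 2-D data with two well-separated groups and one pivot per group.
train = [(0, 0), (1, 0), (0, 1), (2, 2), (10, 10), (9, 10), (10, 9), (8, 8)]
pivots = [(0, 0), (10, 10)]
cells, radii = build_cells(train, pivots)
neighbours, scanned_cells = knn_query((1, 1), 2, pivots, cells, radii)
```

In a distributed setting each cell could live on a different worker, so pruning a cell would also avoid shipping its data; that part is left out here.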


2020 ◽  
Vol 10 (1) ◽  
pp. 23-27
Author(s):  
Fajar Awang Irawan ◽  
Oktavia Pratiwi Diah Ayu Pangesti

Elementary school students have difficulty playing petanque because the iron balls feel heavy in play. The aim of this study was to provide an alternative game medium through a modified petanque ball that is cheaper, lighter, more practical, and more attractive while still meeting the standard. The study involved 10 participants in a small-scale trial and 20 participants in a large-scale trial, aged 8 to 10 years. In the small-scale trial, product specification scored 84% and product relevance 90%, both in the good category. In the large-scale trial, product specification scored 88% and product relevance 92%, again both in the good category. Expert review rated the quality of the Bokavia product at 94.15% and its product specification at 96.65%, both in the very good category. The study concludes that Bokavia is suitable for playing petanque and as an alternative choice in learning.


Oecologia ◽  
2005 ◽  
Vol 145 (2) ◽  
pp. 176-177 ◽  
Author(s):  
N. Underwood ◽  
P. Hambäck ◽  
B. D. Inouye

2009 ◽  
Vol 28 (11) ◽  
pp. 2737-2740
Author(s):  
Xiao ZHANG ◽  
Shan WANG ◽  
Na LIAN
