D3CAS: Distributed Clustering Algorithm Applied to Short-Text Stream Processing

Author(s):  
Roberto Molina ◽  
Waldo Hasperué ◽  
Augusto Villa Monte
IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 45439-45447 ◽  
Author(s):  
Jun-Taek Kong ◽  
Do-Chang Ahn ◽  
Seong-Eun Kim ◽  
Woo-Jin Song

2014 ◽  
Vol 687-691 ◽  
pp. 1496-1499
Author(s):  
Yong Lin Leng

Partially missing or blurred attribute values leave data incomplete during collection. Incomplete data is generally handled by imputation or discarding before clustering. In this paper we propose a new similarity metric algorithm based on an incomplete information system. The algorithm first divides the data set into a complete data set and an incomplete data set. The complete data set is then clustered using the affinity propagation clustering algorithm, and each incomplete record is assigned to the corresponding cluster according to the designed similarity metric. To improve the efficiency of the algorithm, a distributed version of the clustering algorithm is designed based on cloud computing technology. Experiments demonstrate that the proposed algorithm can cluster incomplete big data directly and effectively improve accuracy.
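The split-then-assign flow described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity metric, so the partial-distance measure below (Euclidean distance over observed attributes only, rescaled to the full dimension) is an assumption, as are all function names. The clustering of the complete records is not reproduced here: the labels are supplied by hand where the paper would use affinity propagation.

```python
from math import sqrt

def split_records(records):
    """Separate fully observed records from those with missing (None) values."""
    complete = [r for r in records if None not in r]
    incomplete = [r for r in records if None in r]
    return complete, incomplete

def centroids(complete, labels):
    """Mean vector of each cluster of complete records."""
    groups = {}
    for record, label in zip(complete, labels):
        groups.setdefault(label, []).append(record)
    return {
        label: [sum(col) / len(col) for col in zip(*rows)]
        for label, rows in groups.items()
    }

def partial_distance(record, centre):
    """Assumed metric: distance over observed attributes, rescaled to full dimension."""
    observed = [(x, c) for x, c in zip(record, centre) if x is not None]
    if not observed:
        return float("inf")  # nothing observed, so no basis for comparison
    squared = sum((x - c) ** 2 for x, c in observed)
    return sqrt(squared * len(record) / len(observed))

def assign_incomplete(incomplete, centres):
    """Assign each incomplete record to the cluster with the nearest centre."""
    return [
        min(centres, key=lambda label: partial_distance(r, centres[label]))
        for r in incomplete
    ]

# Toy data: four complete records and two with a missing attribute.
records = [
    (1.0, 1.0), (1.2, 0.8), (8.0, 8.0), (7.8, 8.4),
    (None, 0.9), (8.1, None),
]
complete, incomplete = split_records(records)
labels = [0, 0, 1, 1]  # hand-supplied stand-in for the affinity propagation output
centres = centroids(complete, labels)
assigned = assign_incomplete(incomplete, centres)
print(assigned)
```

Because the incomplete records are never imputed, each one is compared with a cluster centre only on the attributes it actually has, which is one plausible reading of clustering incomplete data "directly".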


2012 ◽  
Vol 532-533 ◽  
pp. 1716-1720 ◽  
Author(s):  
Chun Xia Jin ◽  
Hai Yan Zhou ◽  
Qiu Chan Bai

To solve the problems of sparse keywords and similarity drift in short text segments, this paper proposes a short text clustering algorithm with feature keyword expansion (STCAFKE). The method clusters short texts by expanding feature keywords based on HowNet and combining the K-means algorithm with a density algorithm. Feature keyword expansion increases the number of keywords in each text and enriches its semantic features. Experimental results show that the algorithm improves short text clustering quality in terms of precision and recall.
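The effect of feature keyword expansion on sparse short texts can be shown with a small sketch. It does not use HowNet: the `RELATED` dictionary is a hypothetical stand-in for a semantic lookup, and Jaccard similarity is an assumed measure chosen for brevity. The K-means and density steps of the abstract are not reproduced.

```python
# Hypothetical stand-in for a HowNet lookup: keyword -> related concepts.
RELATED = {
    "car": {"vehicle"},
    "automobile": {"vehicle"},
    "price": {"cost"},
    "cheap": {"cost"},
}

def expand(keywords):
    """Add the related concepts of every keyword to the keyword set."""
    expanded = set(keywords)
    for word in keywords:
        expanded |= RELATED.get(word, set())
    return expanded

def jaccard(a, b):
    """Jaccard similarity of two keyword collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

text_1 = ["car", "price"]
text_2 = ["automobile", "cheap"]

before = jaccard(text_1, text_2)                  # no shared raw keywords
after = jaccard(expand(text_1), expand(text_2))   # shared concepts after expansion
print(before, after)
```

The two texts share no raw keyword, so their similarity before expansion is zero. After expansion they overlap on the added concepts, which is the kind of signal a downstream clustering step could use.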

