A novel text clustering algorithm treated attributes differently

2015 ◽  
pp. 731-734
2010 ◽  
Vol 30 (7) ◽  
pp. 1933-1935 ◽  
Author(s):  
Wen-ming ZHANG ◽  
Jiang WU ◽  
Xiao-jiao YUAN

2014 ◽  
Vol 678 ◽  
pp. 19-22
Author(s):  
Hong Xin Wan ◽  
Yun Peng

Web text contains uncertain and unstructured content, which makes it difficult to cluster with conventional classification methods. We propose a web text clustering algorithm based on fuzzy sets to improve the accuracy of computation on web text. After extracting the keywords of a text, we treat them as attributes and design a fuzzy algorithm to determine the membership of the words. The algorithm improves time and space complexity and is more robust than the conventional algorithm. To test its accuracy and efficiency, we run a comparative experiment between pattern clustering and our algorithm. The experiment shows that our method gives better results.
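
The abstract does not specify its membership function, so the following is only a minimal sketch of the idea of graded membership. It uses the standard fuzzy c-means membership formula as a stand-in, not the authors' algorithm. The keyword-weight vectors, the cluster centers, and the fuzzifier `m` are all hypothetical values chosen for illustration.

```python
import math

def fuzzy_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point in each cluster.

    Assumption: this is the textbook formula, used here as a stand-in;
    the abstract does not state which membership function it uses.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
        for d_i in dists
    ]

# Hypothetical keyword-weight vectors (two keywords, two clusters).
centers = [(1.0, 0.0), (0.0, 1.0)]
doc = (0.75, 0.25)
memberships = fuzzy_memberships(doc, centers)
```

Unlike a hard assignment, the memberships sum to 1 across clusters, so a text that sits between two topics keeps a nonzero degree of belonging to both.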


2012 ◽  
Vol 532-533 ◽  
pp. 1716-1720 ◽  
Author(s):  
Chun Xia Jin ◽  
Hai Yan Zhou ◽  
Qiu Chan Bai

To solve the problems of sparse keywords and similarity drift in short text segments, this paper proposes a short text clustering algorithm with feature keyword expansion (STCAFKE). The method clusters short texts by expanding feature keywords based on HowNet and combining the K-means algorithm with a density algorithm. Feature keyword expansion increases the number of keywords per text and enriches its semantic features, which supports short text clustering. Experimental results show that the algorithm improves short text clustering quality in terms of precision and recall.
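
To illustrate why keyword expansion helps with sparse short texts, here is a minimal sketch of the expansion step only. The `RELATED` dictionary is a hypothetical stand-in for a HowNet lookup (no real HowNet API is used), and Jaccard similarity is an assumed choice of measure; the K-means and density stages are not shown.

```python
# Hypothetical stand-in for a HowNet-style lookup of related terms.
RELATED = {
    "car": {"vehicle", "automobile"},
    "truck": {"vehicle", "lorry"},
}

def expand(keywords, related=RELATED):
    """Return the keywords plus any related terms found for them."""
    expanded = set(keywords)
    for word in keywords:
        expanded |= related.get(word, set())
    return expanded

def jaccard(a, b):
    """Jaccard similarity of two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Two short texts with no keyword in common.
text_a = ["car", "repair"]
text_b = ["truck", "engine"]

before = jaccard(text_a, text_b)
after = jaccard(expand(text_a), expand(text_b))
```

Before expansion the two texts share no keyword, so their similarity is zero; after expansion they overlap on the shared related term, which gives a clustering algorithm something to work with.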


2011 ◽  
Vol 135-136 ◽  
pp. 1155-1158
Author(s):  
Wei Li ◽  
Mei An Li

A clustering algorithm based on a probability model constructs a model for each cluster and calculates the probability that each text falls under the different models to decide which cluster the text belongs to; this conveniently represents the abstract structure of the clusters from a global perspective. This paper combines the hidden Markov model and the k-means clustering algorithm to realize text clustering. The k-means algorithm first produces an initial clustering result, which serves as the initial probability model of a hidden Markov model. A probability transition matrix is then constructed to predict every step of the clustering iteration, and clustering ends when the difference between two probability transition matrices is 0. The algorithm can view every cluster of the document clustering process from a global perspective, which avoids repeating the clustering process and effectively improves the clustering algorithm.
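
The stopping rule described above can be sketched as follows. This is one possible reading of the abstract, not the authors' method and not a hidden Markov model implementation: it runs plain k-means on hypothetical 1-D points, builds a row-normalised matrix of how points move between clusters at each iteration, and stops when two consecutive matrices are identical. All function names and data are assumptions made for illustration.

```python
def assign(points, centers):
    """Index of the nearest center for each 1-D point."""
    return [
        min(range(len(centers)), key=lambda k: abs(p - centers[k]))
        for p in points
    ]

def update(points, labels, centers):
    """Mean of each cluster; an empty cluster keeps its old center."""
    new_centers = []
    for k, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == k]
        new_centers.append(sum(members) / len(members) if members else old)
    return new_centers

def transition_matrix(prev, curr, k):
    """Row-normalised counts of moves from cluster i to cluster j."""
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(prev, curr):
        counts[a][b] += 1
    return [
        [c / sum(row) if sum(row) else 0.0 for c in row]
        for row in counts
    ]

def cluster(points, centers, max_iter=100):
    """k-means that stops when two successive transition matrices match."""
    k = len(centers)
    labels = assign(points, centers)
    prev_matrix = None
    for _ in range(max_iter):
        centers = update(points, labels, centers)
        new_labels = assign(points, centers)
        matrix = transition_matrix(labels, new_labels, k)
        labels = new_labels
        if matrix == prev_matrix:  # difference between the matrices is 0
            break
        prev_matrix = matrix
    return labels, centers

# Hypothetical 1-D data with two obvious groups.
points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centers = cluster(points, centers=[0.0, 2.0])
```

Once no point changes cluster, the transition matrix becomes the identity on consecutive iterations, so the comparison acts as a convergence test.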

