DGFCM: A New Dynamic Clustering Algorithm

A Sequence Clustering Algorithm for Detecting Software Vulnerabilities Based on Vector Space Model

International Journal on Advances in Information Sciences and Service Sciences ◽ 10.4156/aiss.vol4.issue16.30 ◽ 2012 ◽ Vol 4 (16) ◽ pp. 258-264

Author(s): Yanyan WANG ◽ Yanning WANG ◽ Jiadong REN

Keyword(s): Vector Space ◽ Clustering Algorithm ◽ Vector Space Model ◽ Space Model ◽ Software Vulnerabilities ◽ Sequence Clustering

A Text Categorization Method Based on SVM and Improved K-Means

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.427-429.2449 ◽ 2013 ◽ Vol 427-429 ◽ pp. 2449-2453

Author(s): Rong Ze Xia ◽ Yan Jia ◽ Hu Li

Keyword(s): Support Vector Machine ◽ Vector Space ◽ High Performance ◽ Supervised Classification ◽ Text Categorization ◽ Clustering Algorithm ◽ Vector Space Model ◽ Classification Method ◽ Support Vector ◽ Space Model

Traditional supervised classification methods such as the support vector machine (SVM) can achieve high performance in text categorization. However, the samples must first be hand-labeled before classification, which is a time-consuming task. Unsupervised methods such as k-means can also be used for the text categorization problem, but traditional k-means is easily affected by isolated observations. In this paper, we propose a new text categorization method. First, we improve the traditional k-means clustering algorithm. The improved k-means is used to cluster the vectors in our vector space model. We then use the SVM to categorize the vectors preprocessed by the improved k-means. The experiments show that our algorithm can outperform the traditional SVM text categorization method.
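The abstract does not say how k-means was improved, so the sketch below does not attempt to reproduce it. It only illustrates the sensitivity the abstract mentions: a centroid is a plain mean, so a single isolated observation can pull it far from the rest of the cluster. The points and the `centroid` helper are made up for illustration.

```python
from typing import List, Sequence


def centroid(points: Sequence[Sequence[float]]) -> List[float]:
    """Return the coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    dims = len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dims)]


# A tight cluster of four points (illustrative data, not from the paper).
cluster = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [2.0, 2.0]]

# One isolated observation far from the cluster.
outlier = [[21.5, 21.5]]

clean = centroid(cluster)              # mean of the four cluster points
shifted = centroid(cluster + outlier)  # mean after adding the outlier

print("without outlier:", clean)
print("with outlier:   ", shifted)
```

With these numbers the centroid moves from (1.5, 1.5) to (5.5, 5.5), outside the bounding box of the original four points, which is why outlier handling matters before k-means output is passed to a classifier.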

Real-Time Topic Detection with Dynamic Windows

The Computer Journal ◽ 10.1093/comjnl/bxz042 ◽ 2019 ◽ Vol 63 (3) ◽ pp. 469-478

Author(s): Na Su ◽ Shujuan Ji ◽ Jimin Liu

Keyword(s): Data Analysis ◽ Real Time ◽ Data Stream ◽ Clustering Algorithm ◽ Vector Space Model ◽ Topic Detection ◽ Dynamic Clustering ◽ Improve Performance ◽ Space Model ◽ Frequent Items

Microblogging is a popular social network service in which hot topics propagate rapidly online. Real-time topic detection not only helps in understanding public opinion but also brings high commercial value. We design a method for real-time microblog data analysis that detects popular long-lasting events as well as emerging events. First, a frequent-item mining algorithm for microblog data streams is proposed to count approximate word frequencies. This algorithm can find the words that are frequent over a period of time. Second, the window size of the monitored words is adjusted dynamically according to the duration and evolution of events. Lastly, new topics and trends of existing topics are detected using a dynamic clustering algorithm based on the vector space model. Experimental results show that the proposed algorithms improve performance in terms of running time and accuracy.
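The abstract does not name the frequent-item algorithm it proposes. As a rough illustration of approximate word counting over a stream, the sketch below uses the well-known Misra-Gries summary, which is an assumption swapped in here and not the authors' method. The function name and the parameter `k` are illustrative.

```python
from typing import Dict, Iterable


def misra_gries(stream: Iterable[str], k: int) -> Dict[str, int]:
    """Approximate frequent words using at most k - 1 counters.

    Every word that occurs more than n / k times in a stream of n words
    is guaranteed to remain in the summary. Stored counts never exceed
    the true counts (they are underestimates).
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    counters: Dict[str, int] = {}
    for word in stream:
        if word in counters:
            counters[word] += 1
        elif len(counters) < k - 1:
            counters[word] = 1
        else:
            # No free counter: decrement every counter and drop zeros.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


if __name__ == "__main__":
    words = "a b a c a d a e a".split()
    print(misra_gries(words, k=3))
```

Memory stays bounded by `k - 1` counters regardless of vocabulary size, which is the property that makes this family of summaries suitable for streams. A dynamic window, as described in the abstract, would additionally need to expire old words; that part is not sketched here.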

The use of fuzzy clustering algorithm and self-organizing neural networks for identifying potentially failing banks: an experimental study

Expert Systems with Applications ◽ 10.1016/s0957-4174(99)00061-5 ◽ 2000 ◽ Vol 18 (3) ◽ pp. 185-199 ◽ Cited By ~ 69

Author(s): P. Alam ◽ D. Booth ◽ K. Lee ◽ T. Thordarson

Keyword(s): Neural Networks ◽ Experimental Study ◽ Fuzzy Clustering ◽ Clustering Algorithm ◽ Fuzzy Clustering Algorithm ◽ Self Organizing

An Ontology Based Model for Document Clustering

Organizational Efficiency through Intelligent Information Technologies ◽ 10.4018/978-1-4666-2047-6.ch013 ◽ 2012 ◽ pp. 199-215

Author(s): Sridevi U. K. ◽ Nagaveni N.

Keyword(s): Vector Space ◽ Domain Knowledge ◽ Clustering Algorithm ◽ Document Clustering ◽ Vector Space Model ◽ Search Space ◽ Space Model ◽ Before And After ◽ Document Collection ◽ Improved Performance

Clustering is an important technique for finding relevant content in a document collection, and it also reduces the search space. Current clustering research emphasizes developing more efficient clustering methods without considering domain knowledge or the user's needs. In recent years, the semantics of documents have been utilized in document clustering. The work discussed here focuses on a clustering model in which an ontology approach is applied. The major challenge is to use background knowledge in the similarity measure. This paper presents an ontology-based document annotation and clustering system. A semi-automatic document annotation and concept weighting scheme is used to create an ontology-based knowledge base. The Particle Swarm Optimization (PSO) clustering algorithm can be applied to obtain the clustering solution. Clustering accuracy is computed before and after combining the ontology with the Vector Space Model (VSM). The proposed ontology-based framework gives improved performance and better clustering than the traditional vector space model. The results obtained using the ontology were significant and promising.
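The abstract does not give the concept weighting scheme or the similarity measure. The sketch below is only one plausible reading of "combining ontology with VSM": term-frequency vectors are extended with concept dimensions before cosine similarity is computed. The term-to-concept map, the `C:` prefix, and the `weight` parameter are all hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_vector(tokens: List[str]) -> Dict[str, float]:
    """Build a raw term-frequency vector from a token list."""
    return {term: float(count) for term, count in Counter(tokens).items()}


def add_concepts(vec: Dict[str, float], concept_of: Dict[str, str],
                 weight: float = 1.0) -> Dict[str, float]:
    """Add one dimension per ontology concept (hypothetical scheme)."""
    enriched = dict(vec)
    for term, tf in vec.items():
        concept = concept_of.get(term)
        if concept is not None:
            enriched[concept] = enriched.get(concept, 0.0) + weight * tf
    return enriched


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(term, 0.0) for term, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy documents and a toy ontology mapping (illustrative only).
doc1 = tf_vector(["car", "engine"])
doc2 = tf_vector(["automobile", "engine"])
concept_of = {"car": "C:vehicle", "automobile": "C:vehicle"}

plain = cosine(doc1, doc2)
enriched = cosine(add_concepts(doc1, concept_of),
                  add_concepts(doc2, concept_of))
print(plain, enriched)
```

In this toy case the two documents share only "engine" in the plain vector space, while the concept dimension lets "car" and "automobile" contribute to the similarity as well. Any clustering algorithm, including PSO-based clustering, could then operate on the enriched vectors.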