Semantic Similarity-Based Clustering of Web Documents Using Fuzzy C-Means
With the massive growth and sheer volume of the web, it is very difficult to retrieve results that match user preferences. The next-generation web architecture, the semantic web, reduces the burden on the user by performing search based on semantics instead of keywords. Even in the context of semantic technologies, optimization problems occur but are rarely considered. In this paper, document clustering is applied to retrieve relevant documents. We propose an ontology-based clustering algorithm that uses a semantic similarity measure and Fuzzy C-Means, applied to annotated documents to optimize the results. The proposed method uses the Jena API and the GATE API, and documents can be retrieved based on their annotation features and relations. A preliminary experiment comparing the proposed method with K-Means, PSO, and the hybrid PSOK-Means approach shows that the proposed method is feasible and performs better than the other clustering methods.
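To make the clustering step concrete, the following is a minimal sketch of standard Fuzzy C-Means in plain Python. It is not the proposed ontology-based algorithm: the abstract does not specify the semantic similarity measure, so Euclidean distance over hypothetical concept-weight vectors stands in for it. The function name `fuzzy_c_means`, its parameters, and the sample vectors are assumptions made for illustration only.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard Fuzzy C-Means on a list of equal-length numeric vectors.

    Returns (centers, u), where u[i][j] is the degree to which
    point i belongs to cluster j; each row of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    exponent = 2.0 / (m - 1.0)
    for _ in range(iters):
        # Centre update: mean of the points weighted by membership^m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum
                 for d in range(dim)]
            )

        # Membership update from the distances to the new centres.
        # Euclidean distance is an assumption standing in for a
        # semantic (ontology-based) dissimilarity.
        new_u = []
        for p in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5, 1e-12)
                for ctr in centers
            ]
            new_u.append(
                [1.0 / sum((dist[j] / dist[k]) ** exponent for k in range(c))
                 for j in range(c)]
            )

        # Stop once no membership changes by more than tol.
        shift = max(
            abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c)
        )
        u = new_u
        if shift < tol:
            break

    return centers, u


# Hypothetical documents described by weights on two ontology concepts.
docs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
centers, memberships = fuzzy_c_means(docs, c=2)
for doc, row in zip(docs, memberships):
    print(doc, [round(v, 3) for v in row])
```

Unlike K-Means, each document receives a graded membership in every cluster instead of a single hard label, which suits documents whose annotations relate to more than one topic. The fuzzifier `m` controls how soft the assignment is; values close to 1 approach hard clustering.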