Document Clustering Using an Ontology-Based Vector Space Model
This paper introduces a conceptual framework for building knowledge representations based on enriched Semantic Vectors, extending the classical vector space model with ontological support. A primary research challenge addressed here is the formalization and representation of document contents: most existing approaches are limited in that they take into account only the explicit, word-based information in a document. This research explores how traditional knowledge representations can be enriched by incorporating implicit information derived from the complex relationships (semantic associations) modelled by domain ontologies, in combination with the information presented in the documents themselves. The main contributions of this work are: (i) the conceptualization of a model that enables the semantic enrichment of knowledge sources with the support of domain experts; and (ii) the implementation of a proof-of-concept, named SENSE (Semantic Enrichment knowledge Sources).