EXPANDING APPROACH TO INFORMATION RETRIEVAL USING SEMANTIC SIMILARITY ANALYSIS BASED ON WORDNET AND WIKIPEDIA
The performance of information retrieval (IR) systems depends heavily on the textual keywords of the query and on the documents retrieved. Inaccurate and incomplete retrieval results are often caused by query drift and by ignoring the semantic relationships among terms. Expansion-based retrieval approaches attempt to incorporate expansion terms into the original query, such as unexplored words drawn from pseudo-relevance feedback (PRF) or relevance feedback documents, or semantic words extracted from an external corpus. In this paper, a semantic analysis-based query expansion method for information retrieval that uses WordNet and Wikipedia as corpora is proposed. We derive semantically related words from human knowledge repositories such as WordNet and Wikipedia and combine them with words filtered by semantic mining from PRF documents. Our approach automatically generates a new semantics-based query from the original IR query. Experimental results on TREC datasets and the Google search engine show that the proposed method significantly improves retrieval performance over previous results.
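As a rough illustration of the general idea of combining knowledge-based terms with PRF terms, the following minimal Python sketch expands a query from two sources. It is not the method proposed in this paper: the dictionary `KNOWLEDGE_BASE` is a hypothetical stand-in for lookups in WordNet or Wikipedia, simple term frequency replaces the semantic mining used to filter PRF words, and all function names and the `top_n` parameter are assumptions made for the example.

```python
from collections import Counter

# Hypothetical stand-in for related-word lookups in WordNet or Wikipedia.
KNOWLEDGE_BASE = {
    "car": ["automobile", "vehicle"],
    "engine": ["motor"],
}

STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for"}


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic non-stopwords."""
    return [t for t in text.lower().split() if t.isalpha() and t not in STOPWORDS]


def knowledge_terms(query_terms, kb):
    """Collect related words for each query term from the knowledge base."""
    related = []
    for term in query_terms:
        for candidate in kb.get(term, []):
            if candidate not in query_terms and candidate not in related:
                related.append(candidate)
    return related


def prf_terms(query_terms, feedback_docs, top_n):
    """Pick the top_n most frequent non-query words in the feedback documents.

    Frequency is an assumed placeholder for a semantic filtering step.
    """
    counts = Counter()
    for doc in feedback_docs:
        counts.update(t for t in tokenize(doc) if t not in query_terms)
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_n]]


def expand_query(query, feedback_docs, kb=KNOWLEDGE_BASE, top_n=2):
    """Return the original query terms followed by the expansion terms."""
    original = tokenize(query)
    expanded = list(original)
    candidates = knowledge_terms(original, kb) + prf_terms(original, feedback_docs, top_n)
    for term in candidates:
        if term not in expanded:
            expanded.append(term)
    return expanded


feedback = [
    "the car engine needs fuel",
    "fuel pump for the engine",
    "diesel fuel",
]
new_query = expand_query("car engine", feedback)
```

Here `feedback` plays the role of the top-ranked documents returned for the original query; in a real system those would come from a first retrieval pass, and the expanded term list would be resubmitted to the retrieval engine.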