Analysis and development of latent semantic indexing techniques for information retrieval

Author(s): M. Bottello
2012, Vol 12 (1), pp. 34-48
Author(s): Ch. Aswani Kumar, M. Radvansky, J. Annapurna

Abstract: Latent Semantic Indexing (LSI), a variant of the classical Vector Space Model (VSM), is an Information Retrieval (IR) model that attempts to capture the latent semantic relationships between data items. Mathematical lattices, under the framework of Formal Concept Analysis (FCA), represent conceptual hierarchies in data and can be used to retrieve information. However, both LSI and FCA use data represented in the form of matrices. The objective of this paper is to systematically analyze VSM, LSI and FCA for the task of IR using standard and real-life datasets.


Author(s): Anne Kao

Latent Semantic Analysis (LSA), known as Latent Semantic Indexing (LSI) when applied to information retrieval, has been a major analysis approach in text mining. It is an extension of the vector space method in information retrieval, representing documents as numerical vectors but using a more sophisticated mathematical approach to characterize the essential features of the documents and reduce the number of features in the search space. This chapter summarizes several major approaches to this dimensionality reduction, each of which has strengths and weaknesses, and it describes recent breakthroughs and advances. It shows how the constructs and products of LSA applications can be made user-interpretable and reviews applications of LSA beyond information retrieval, in particular to text information visualization.


2006, Vol 05 (02), pp. 97-105
Author(s): S. Srinivas, Ch. AswaniKumar

Latent Semantic Indexing (LSI) is a well-known Information Retrieval (IR) technique that tries to overcome the problems of lexical matching by using conceptual indexing. LSI is a variant of the vector space model and has proved to be 30% more effective. Many studies have reported that good retrieval performance is related to the use of various retrieval heuristics. In this paper, we focus on optimising two LSI retrieval heuristics: term weighting and rank approximation. The results obtained demonstrate that LSI performance improves significantly with the combination of optimised term weighting and rank approximation.
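
The two heuristics named in this abstract, term weighting and rank approximation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the paper's method or data: the toy term-document matrix, the log-tf times idf weighting, the helper names (`tfidf`, `lsi_fit`, `fold_in_query`, `cosine_scores`, `search`), and the choice of ranks `k`. It relies on NumPy's `numpy.linalg.svd` for the truncated SVD.

```python
import numpy as np

# Toy term-document count matrix (rows = terms, columns = documents).
# Vocabulary and counts are invented for illustration only.
terms = ["car", "automobile", "engine", "flower", "petal"]
counts = np.array([
    [2, 0, 1, 0],   # car
    [0, 2, 1, 0],   # automobile
    [1, 1, 2, 0],   # engine
    [0, 0, 0, 2],   # flower
    [0, 0, 0, 1],   # petal
], dtype=float)

def tfidf(counts):
    """Term weighting: log-scaled tf times idf (one common scheme among many)."""
    n_docs = counts.shape[1]
    df = np.count_nonzero(counts, axis=1)      # document frequency per term
    idf = np.log(n_docs / df) + 1.0            # +1 keeps ubiquitous terms nonzero
    return np.log1p(counts) * idf[:, None]

def lsi_fit(weighted, k):
    """Rank approximation: keep the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(weighted, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_query(q, U_k, s_k):
    """Project a term-space query vector into the k-dimensional latent space."""
    return (q @ U_k) / s_k

def cosine_scores(q_k, Vt_k):
    """Cosine similarity between the folded-in query and each document."""
    docs = Vt_k.T                              # one row per document
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q_k)
    return (docs @ q_k) / np.where(norms == 0, 1.0, norms)

def search(query_terms, k):
    """Score every document against a set of query terms at rank k."""
    U_k, s_k, Vt_k = lsi_fit(tfidf(counts), k)
    q = np.array([1.0 if t in query_terms else 0.0 for t in terms])
    return cosine_scores(fold_in_query(q, U_k, s_k), Vt_k)

scores_k2 = search({"car"}, k=2)
scores_k3 = search({"car"}, k=3)
print(np.round(scores_k2, 3))
print(np.round(scores_k3, 3))
```

On this toy matrix, the rank-2 space scores the second document (which contains "automobile" and "engine" but not "car") as highly as the documents that do contain "car", whereas rank 3 separates them again. This is only meant to show why the choice of rank is a tunable heuristic; it says nothing about the optimised settings the paper reports.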


2005, Vol 04 (04), pp. 279-285
Author(s): Ch. AswaniKumar, Ankush Gupta, Mahmooda Batool, Shagun Trehan

The primary goal of an information retrieval system is to retrieve all the documents that are relevant to the user query. Disparities between the vocabulary of the system's authors and that of its users pose difficulties when information is processed without human intervention. Preprocessing the documents and user queries with intelligent techniques to remove ambiguities in representation and indexing is a current area of research. In this paper, we present a novel intelligent method that has been appended to existing stemming and stopword removal processes. We designed an information retrieval system based on the proposed method using latent semantic indexing. The experimental results of the system using the proposed method exhibit its superiority over other systems based on traditional preprocessing methods.
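
For context, the traditional preprocessing steps this abstract builds on (stopword removal and stemming) can be sketched as follows. This is not the authors' proposed method, which the abstract does not describe. The stopword list, the suffix list, and the helper names `naive_stem` and `preprocess` are all hypothetical, and the suffix stripper is a crude stand-in for a real stemmer such as Porter's.

```python
import re

# A tiny hand-picked stopword list (assumption); real systems use longer lists.
STOPWORDS = {"a", "an", "and", "are", "for", "in", "is", "of", "the", "to"}

# Suffixes are tried in this order; a crude stand-in for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(token):
    """Strip the first matching suffix if at least three characters remain."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Lowercase, tokenize on ASCII letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(t) for t in tokens if t not in STOPWORDS]

doc_tokens = preprocess("The systems are retrieving indexed documents")
query_tokens = preprocess("retrieved document")
print(doc_tokens)
print(query_tokens)
```

Applying the same `preprocess` function to both documents and queries is what lets "retrieved" in a query match "retrieving" in a document before any term-document matrix is built.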

