STRICT: Information retrieval based search term identification for concept location

Author(s): Mohammad Masudur Rahman ◽ Chanchal K. Roy
2013 ◽ Vol 07 (04) ◽ pp. 407-426
Author(s): Tuukka Ruotsalo ◽ Matias Frosterus

Structured Web data are increasingly accessed using information retrieval methods, and information retrieval increasingly relies on structured background knowledge. Because users' searches are often directed towards finding information about entities rather than text documents, a key affordance of semantic search is the ability to retrieve relevant information about entities more precisely by utilizing rich structured descriptions and background knowledge. Entity search also poses challenges for information retrieval methods: entity descriptions are often short, and conventional search term matching alone can be insufficient. As a consequence, the search engine should be able to increase the recall of the returned results and to select a representative set of entities for the user, that is, to diversify the search results. This paper presents an approach to diversifying entity search by using semantics present in, and inferred from, the initial entity search results. Our approach utilizes ontologies as a source of background knowledge to improve the recall of entity retrieval, and independent component analysis to detect independent latent components shared by the entities. The search results are then diversified by selecting a representative set of entities based on their membership in the independent components. We demonstrate the performance of our approach through retrieval experiments conducted on a real-world dataset composed from four entity databases. The results suggest that our approach can significantly improve the effectiveness and diversity of entity search.
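
The final selection step described above, picking a representative set of entities from their component memberships, might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `diversify`, the round-robin strategy, and the input dictionaries are assumptions, and the membership scores are assumed to have been computed beforehand (for example, by independent component analysis over the entity descriptions).

```python
from typing import Dict, List


def diversify(relevance: Dict[str, float],
              membership: Dict[str, List[float]],
              k: int) -> List[str]:
    """Pick up to k entities, cycling over components so each one is represented.

    relevance  -- hypothetical relevance score per entity (used to break ties)
    membership -- hypothetical per-entity loadings on each latent component,
                  assumed to be precomputed (e.g. by ICA)
    """
    if not membership or k <= 0:
        return []

    n_components = len(next(iter(membership.values())))
    remaining = set(membership)
    selected: List[str] = []
    component = 0

    while remaining and len(selected) < k:
        # Choose the remaining entity with the strongest loading on the
        # current component; ties fall back to relevance, then to the name.
        best = max(
            remaining,
            key=lambda e: (abs(membership[e][component]),
                           relevance.get(e, 0.0),
                           e),
        )
        selected.append(best)
        remaining.remove(best)
        # Move on to the next component (round-robin).
        component = (component + 1) % n_components

    return selected
```

With two components, the first two picks come from different components, so a result list dominated by one latent topic is broken up before lower-ranked entities of the same topic are added.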


2019 ◽ Vol 8 (3) ◽ pp. 6371-6375

The growth of the web has produced a huge amount of information by empowering Internet users to post their assessments, remarks, and reviews online. Preprocessing helps an Information Retrieval (IR) system understand a user query. IR covers the representation of, search for, and access to information that relates to a user's search string. This information is expressed in natural language; it is not in a structured format, and its words are often ambiguous. One of the major challenges in current web search is the vocabulary mismatch problem that arises during preprocessing. A further drawback of IR systems in web search is that the relationships between the query expressions and the expanded terms are limited. The query expressions are the search terms used to fetch information from the IR system. The expanded terms are obtained by adding the terms that are most similar to the words of the search string. In this manuscript, we mainly focus on what lies behind the user's search string on the web. We identify the best features in this context for term selection in a supervised learning based model. The proposed system focuses on preprocessing techniques such as tokenization, stemming, spell checking, finding dissimilar words, and discovering keywords from the user query, because these provide better results for the user.
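
The preprocessing steps named above (tokenization, spell checking, stemming, and keyword discovery) could be chained roughly as in the sketch below. It is a minimal illustration using only the Python standard library, not the system the abstract describes: the stop-word list, the vocabulary, the suffix list, and the 0.85 similarity cutoff are all assumptions chosen for the example, and the suffix stripping is a naive stand-in for a real stemmer.

```python
import re
from difflib import get_close_matches
from typing import List

# Hypothetical resources, chosen only for this example.
STOPWORDS = {"the", "a", "an", "of", "for", "in", "on", "to", "and", "is"}
VOCABULARY = ["information", "retrieval", "search", "query", "engine", "ranking"]
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(query: str) -> List[str]:
    """Lowercase the query and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def correct(token: str) -> str:
    """Replace a token with its closest vocabulary word, if one is close enough."""
    if token in VOCABULARY or token in STOPWORDS:
        return token
    matches = get_close_matches(token, VOCABULARY, n=1, cutoff=0.85)
    return matches[0] if matches else token


def stem(token: str) -> str:
    """Naive suffix stripping (an assumption; not a full stemming algorithm)."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def keywords(query: str) -> List[str]:
    """Tokenize, spell-check, drop stop words, stem, and de-duplicate in order."""
    result: List[str] = []
    for token in tokenize(query):
        token = correct(token)
        if token in STOPWORDS:
            continue
        stemmed = stem(token)
        if stemmed not in result:
            result.append(stemmed)
    return result
```

For instance, `keywords("Searching for informaton retrieval")` yields `["search", "information", "retrieval"]` under these assumptions: the misspelled token is matched to the vocabulary, the stop word is dropped, and the inflected form is reduced to its stem.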

