Clustering Chinese Web Search Results Based on Association Calculation

2011 ◽  
Vol 55-57 ◽  
pp. 1418-1423
Author(s):  
Ying Zhao ◽  
Ya Jun Du ◽  
Qiang Qiang Peng

Clustering web search results helps users find topics of interest by grouping the results. This paper presents an improved method for clustering search results, focused on Chinese web pages. The main contributions are the following. First, a method is presented that identifies phrases carrying complete semantic information by comparing the attributes of base clusters in the suffix tree document model and the overlap of their document sets. Second, by analyzing the content and structure of the titles and snippets of Chinese web search results, a sentence segmentation scheme is designed and implemented for constructing the suffix tree. Third, to better reflect the degree of association between terms, a novel method is proposed that computes the distance between co-occurring terms at sentence granularity. Finally, experiments illustrate that the new clustering method provides an efficient and effective way for users to browse and locate the information they seek.
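The abstract does not give the exact comparison rule, so the following is only a minimal sketch of how the overlap of base-cluster document sets might be tested, in the spirit of classic suffix tree clustering. The function names, the sample phrases, and the 0.5 threshold are assumptions made for illustration and are not taken from the paper.

```python
from itertools import combinations


def overlap_ratio(a, b):
    """Fraction of the documents in set `a` that also appear in set `b`."""
    return len(a & b) / len(a) if a else 0.0


def merge_base_clusters(base_clusters, threshold=0.5):
    """Group phrases whose document sets overlap strongly in both directions.

    `base_clusters` maps a phrase to the set of document ids containing it.
    The threshold is a hypothetical parameter, not a value from the paper.
    """
    phrases = list(base_clusters)
    parent = {p: p for p in phrases}  # union-find forest over phrases

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for p, q in combinations(phrases, 2):
        docs_p, docs_q = base_clusters[p], base_clusters[q]
        if (overlap_ratio(docs_p, docs_q) > threshold
                and overlap_ratio(docs_q, docs_p) > threshold):
            parent[find(p)] = find(q)

    groups = {}
    for p in phrases:
        groups.setdefault(find(p), []).append(p)
    return [sorted(g) for g in groups.values()]


# Hypothetical base clusters: phrase -> ids of the results containing it.
base = {
    "suffix tree": {1, 2, 3},
    "tree clustering": {2, 3, 4},
    "web search": {7, 8},
}
merged = merge_base_clusters(base)
```

Here "suffix tree" and "tree clustering" share two of their three documents each, so they fall into one group, while "web search" stays on its own. A real system would also compare other base-cluster attributes, which the abstract mentions but does not specify.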

Author(s):  
Constanta-Nicoleta Bodea ◽  
Maria-Iuliana Dascalu ◽  
Adina Lipai

This chapter presents a meta-search approach that delivers bibliography from the internet based on trainees’ results in an e-assessment task. The bibliography consists of web pages related to the trainees’ knowledge gaps. The meta-search engine is part of an educational recommender system attached to an e-assessment application for project management knowledge. Meta-search means that, for a specific query (or a mistake made by the trainee), several search mechanisms can be applied to find suitable bibliography (further reading). The result lists delivered by the standard search mechanisms are used to build thematically homogeneous groups with an ontology-based clustering algorithm. The clustering process uses an educational ontology and the WordNet lexical database to create its categories. The research is presented in the context of recommender systems and their various applications in the education domain.


Author(s):  
Yan Chen ◽  
Yan-Qing Zhang

In most Web search applications, queries are commonly ambiguous because words or phrases have different linguistic meanings for different Web users. Conventional keyword-based search engines cannot disambiguate queries to provide relevant results matching Web users’ intents. Traditional Word Sense Disambiguation (WSD) methods use statistical models or ontology-based knowledge systems to measure associations among words, and they rely on the contexts of queries for disambiguation. However, because numerous combinations of words may appear in queries and documents, it is difficult to extract concept relations for all possible combinations. Moreover, queries are usually short, so their contexts do not always provide enough information for disambiguation. Therefore, traditional WSD methods are not sufficient to provide accurate search results for ambiguous queries. In this chapter, a new model, the Granular Semantic Tree (GST), is introduced to represent associations among concepts more conveniently than traditional WSD methods do. Additionally, users’ preferences are used to provide personalized search results that better adapt to users’ unique intents. Fuzzy logic is used to determine the most appropriate concepts related to queries based on contexts and users’ preferences. Finally, Web pages are analyzed by the GST model: the concepts of pages for the queries are evaluated, and the pages are re-ranked according to the similarity of concepts between pages and queries.
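The final re-ranking step can be pictured with a small sketch. It does not reproduce the GST model or the fuzzy-logic stage; it assumes that concept weights for the query and for each page are already available, and it uses cosine similarity as a stand-in similarity measure. All names and weights below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse concept-weight dictionaries."""
    dot = sum(w * v.get(concept, 0.0) for concept, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rerank(query_concepts, pages):
    """Return page names ordered by decreasing concept similarity to the query."""
    return sorted(
        pages,
        key=lambda name: cosine(query_concepts, pages[name]),
        reverse=True,
    )


# Hypothetical concept weights for the ambiguous query "jaguar",
# after the user's preferences have favoured the animal sense.
query = {"animal": 1.0}
pages = {
    "car-review": {"vehicle": 1.0},
    "zoo": {"animal": 0.9, "habitat": 0.1},
    "mixed": {"animal": 0.5, "vehicle": 0.5},
}
ranked = rerank(query, pages)
```

With these weights the page dominated by the animal concept is ranked first and the purely vehicle-related page last, which is the behaviour the abstract describes for personalized re-ranking.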


2018 ◽  
Vol 11 (2) ◽  
pp. 110-127 ◽  
Author(s):  
Suruchi Chawla

The main challenge to effective information retrieval is to optimize page ranking in order to retrieve relevant documents for user queries. In this article, a method is proposed that uses a hybrid of genetic algorithms (GA) and trust to generate the optimal ranking of trusted clicked URLs for web page recommendations. The trusted web pages are selected based on clustered query sessions for GA-based optimal ranking, so that more relevant documents move up the ranking and the precision of search results improves. Thus, the optimal ranking of trusted clicked URLs recommends relevant documents to web users for their search goal and satisfies the user's information need effectively. The experiment was conducted on a data set captured in three domains (academics, entertainment, and sports) to evaluate the performance of GA-based optimal ranking with and without trust, and the results confirm the improvement in the precision of search results.



