Web Page Recommender System using hybrid of Genetic Algorithm and Trust for Personalized Web Search

2018 ◽  
Vol 11 (2) ◽  
pp. 110-127 ◽  
Author(s):  
Suruchi Chawla

The main challenge in effective information retrieval is optimizing page ranking so that relevant documents are retrieved for user queries. This article proposes a method that uses a hybrid of genetic algorithms (GA) and trust to generate an optimal ranking of trusted clicked URLs for web page recommendations. Trusted web pages are selected from clustered query sessions, and GA-based optimal ranking moves more relevant documents up the ranking, which improves the precision of search results. The optimal ranking of trusted clicked URLs thus recommends documents relevant to the user’s search goal and satisfies the user’s information need effectively. An experiment was conducted on a data set captured in three domains (academics, entertainment, and sports) to evaluate the performance of GA-based optimal ranking with and without trust, and the results confirm an improvement in the precision of search results.
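The abstract reports its results as precision of search results. As a minimal sketch of how such a figure is commonly computed, the function below measures precision over the top k ranked URLs. The function name, the URL labels, and the relevance judgments are all hypothetical and are not taken from the article.

```python
def precision_at_k(ranked_urls, relevant_urls, k):
    """Fraction of the top-k ranked URLs that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_urls[:k]
    hits = sum(1 for url in top_k if url in relevant_urls)
    return hits / k


# Hypothetical ranking and relevance judgments, for illustration only.
ranked = ["u1", "u2", "u3", "u4"]
relevant = {"u1", "u3"}
p_at_2 = precision_at_k(ranked, relevant, 2)

# A re-ranking that moves the relevant URL "u3" up raises precision at k=2.
reranked = ["u1", "u3", "u2", "u4"]
p_at_2_reranked = precision_at_k(reranked, relevant, 2)
```

Under this measure, a ranking method helps when it moves relevant clicked URLs into the top positions, which is the effect the abstract attributes to the GA-based ranking.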


2017 ◽  
Vol 10 (2) ◽  
pp. 311-325
Author(s):  
Suruchi Chawla

The main challenge for effective web information retrieval (IR) is to infer the information need from the user’s query and retrieve relevant documents. The precision of search results is low because vague and imprecise user queries fail to retrieve enough relevant documents. Fuzzy set based query expansion handles imprecise and vague queries when inferring the user’s information need, and trust-based web page recommendations retrieve search results according to that need. This paper designs an algorithm for intelligent information retrieval that uses a hybrid of fuzzy sets and trust in web query session mining: fuzzy query expansion infers the user’s information need, and trust is used to recommend web pages that match it. An experiment was performed on a data set collected in the domains of academics, entertainment, and sports, and the search results confirm an improvement in precision.


2016 ◽  
Vol 7 (1) ◽  
pp. 33-49 ◽  
Author(s):  
Suruchi Chawla

This paper proposes a novel method that uses a hybrid of a Genetic Algorithm (GA) and a Back Propagation (BP) Artificial Neural Network (ANN) to learn the classification of user queries into clusters for effective Personalized Web Search. The GA-BP ANN is trained offline to classify input queries and user query session profiles into a specific cluster, based on clustered web query sessions. During online web search, the trained GA-BP ANN classifies new user queries into a cluster, and the selected cluster is used for web page recommendations. This process of classification and recommendation continues until the search is effectively personalized to the user’s information need. An experiment was conducted on a data set of web user query sessions to evaluate the effectiveness of Personalized Web Search using the GA-optimized BP ANN, and the results confirm an improvement in the precision of search results.


2021 ◽  
Author(s):  
Daniel Wayne Crabtree

This thesis investigates the refinement of web search results with a special focus on the use of clustering and the role of queries. It presents a collection of new methods for evaluating clustering methods, performing clustering effectively, and for performing query refinement. The thesis identifies different types of query, the situations where refinement is necessary, and the factors affecting search difficulty. It then analyses hard searches and argues that many of them fail because users and search engines have different query models. The thesis identifies best practice for evaluating web search results and search refinement methods. It finds that none of the commonly used evaluation measures for clustering meet all of the properties of good evaluation measures. It then presents new quality and coverage measures that satisfy all the desired properties and that rank clusterings correctly in all web page clustering situations. The thesis argues that current web page clustering methods work well when different interpretations of the query have distinct vocabulary, but still have several limitations and often produce incomprehensible clusters. It then presents a new clustering method that uses the query to guide the construction of semantically meaningful clusters. The new clustering method significantly improves performance. Finally, the thesis explores how searches and queries are composed of different aspects and shows how to use aspects to reduce the distance between the query models of search engines and users. It then presents fully automatic methods that identify query aspects, identify underrepresented aspects, and predict query difficulty. Used in combination, these methods have many applications; the thesis describes methods for two of them. The first method improves the search results for hard queries with underrepresented aspects by automatically expanding the query using semantically orthogonal keywords related to the underrepresented aspects. The second method helps users refine hard ambiguous queries by identifying the different query interpretations using a clustering of a diverse set of refinements. Both methods significantly outperform existing methods.


2015 ◽  
Vol 5 (1) ◽  
pp. 41-55 ◽  
Author(s):  
Sutirtha Kumar Guha ◽  
Anirban Kundu ◽  
Rana Duttagupta

In this paper the authors propose a new rank measurement technique that introduces a weightage factor based on the number of Web links available on a particular Web page. The available Web links are treated as an indicator of importance. Each Web page is assigned a distinct weightage factor calculated from its Web links. Because the weightage factor is independent and unique to each page, different Web pages are evaluated more accurately, and the resulting Web page ranking is better. Introducing the weightage factor also minimizes the impact of unwanted intruders.


10.28945/2570 ◽  
2002 ◽  
Author(s):  
Anthony Scime ◽  
Colleen Powderly

A method to create more effective Web search queries is to combine elements of a semantic approach with a template that requests specific details about the searcher’s information need. Fundamental to this process is the use of semantics. Nouns, key phrases, and verbs are scored according to their frequency of use, then ranked as keywords and used to create the query. Key phrases and words in the query accurately represent the concepts of the text, generating search results that are significantly more accurate than those available using current methods.
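The frequency-scoring step described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the abstract scores nouns, key phrases, and verbs, whereas this sketch scores single words and does no part-of-speech tagging or phrase detection. The stop-word list, the function name `build_query`, and the sample text are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def build_query(text, max_terms=3):
    """Score words by frequency and join the top-ranked ones into a query."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    # Sort by descending frequency, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(term for term, _ in ranked[:max_terms])


sample = "Solar panels convert sunlight. Solar panels need sunlight and solar cells."
query = build_query(sample)
```

A fuller version would restrict scoring to the word classes named in the abstract and combine the result with the template of details about the searcher’s information need.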


This paper addresses supplementary factors for the general PageRank calculation that Google uses to rank sites in its search results. These additional factors incorporate a few ideas intended to increase the accuracy of the estimated PageRank value. A similarity measure compares the content of the page whose rank is to be determined with text extracted from the pages returned at the top of a search that uses a few keywords of that page. The result is a value or percentage that represents the relevance, or similarity, factor. Similarly, if sentiment analysis is applied, the search results for the keywords can be analysed against the keywords of the page under consideration, which yields a sentiment analysis factor. In this way the page ranking procedure can be improved and implemented with better accuracy. The Hadoop Distributed File System is used to compute the PageRank of the input nodes, and Python is chosen for the parallel PageRank algorithm that is executed on Hadoop.
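For orientation, here is a minimal single-machine sketch of the standard PageRank power iteration, followed by one hypothetical way to apply a per-page similarity factor. The abstract does not state how its extra factors are combined with PageRank, so the multiplication in `adjusted_score`, the example graph, and the similarity values are assumptions. The sketch also does not reproduce the Hadoop-based parallel execution.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


def adjusted_score(rank, similarity):
    """Hypothetical combination: scale PageRank by a similarity factor in [0, 1]."""
    return {p: rank[p] * similarity.get(p, 0.0) for p in rank}


# Hypothetical three-page graph and similarity factors.
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(links)
scores = adjusted_score(ranks, {"a": 1.0, "b": 0.5, "c": 0.8})
```

A sentiment factor could be applied in the same way as the similarity factor, although that too would be an assumption about the paper's design.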


Author(s):  
GAURAV AGARWAL ◽  
SACHI GUPTA ◽  
SAURABH MUKHERJEE

Today, web servers are the key repositories of information, and the Internet is the means of reaching it. The Internet holds a mammoth amount of data, which makes it difficult to find the relevant data. A search engine plays a vital role in finding it, and it works in three steps: web crawling by the crawler, indexing by the indexer, and searching by the searcher. The web crawler retrieves information from web pages by following every link on a site. The search engine stores this information, and the indexer then indexes the content of each web page. The main role of the indexer is to make data quickly retrievable according to user requirements. When the client gives a query, the search engine searches for results corresponding to that query. The aim here is to develop a search engine algorithm that returns the most desirable results for the user’s requirement. The search engine uses a ranking method to rank the web pages. Various ranking approaches are discussed in the literature; this paper proposes a ranking algorithm based on the parent-child relationship. The proposed ranking algorithm is based on the priority assignment phase of the Heterogeneous Earliest Finish Time (HEFT) algorithm, which was designed for multiprocessor task scheduling. The proposed algorithm works on three variables: the density of keywords, the number of successors of a node, and the age of the web page. Density is the occurrence of the keyword on a particular web page. The number of successors represents the outgoing links of a single web page. Age is the freshness value of the web page: the most recently modified page is the freshest and has the smallest age, or largest freshness value. The proposed technique requires the priority of each page to be set with the downward rank values, and pages are arranged in ascending or descending order of their rank values. Experiments show that our algorithm is valuable. In a comparison with Google, we find that our algorithm performs better for 70% of the problems.
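To make the three variables concrete, the sketch below scores pages from keyword density, number of successors, and age, then sorts them. It is only an illustration: the abstract gives no formula, so the linear combination, the weights, the normalizations, and the example pages are all assumptions, and the HEFT-style downward rank computation is not reproduced.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    keyword_count: int  # occurrences of the query keyword on the page
    word_count: int     # total number of words on the page
    successors: int     # number of outgoing links
    age_days: float     # days since the page was last modified


def page_score(page, w_density=0.5, w_links=0.3, w_fresh=0.2):
    """Hypothetical weighted score; the weights are illustrative assumptions."""
    density = page.keyword_count / page.word_count if page.word_count else 0.0
    link_term = page.successors / (1 + page.successors)  # squashed into [0, 1)
    freshness = 1.0 / (1.0 + page.age_days)  # smaller age gives larger freshness
    return w_density * density + w_links * link_term + w_fresh * freshness


# Two hypothetical pages that differ only in age.
pages = [
    Page("example.org/old", keyword_count=5, word_count=100, successors=4, age_days=400),
    Page("example.org/new", keyword_count=5, word_count=100, successors=4, age_days=1),
]
ranked = sorted(pages, key=page_score, reverse=True)
```

With everything else equal, the more recently modified page ranks first, which matches the abstract's description of freshness.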

