CQIG: An Improved Web Search Results Clustering Algorithm

Author(s):  
Yong-gong Ren ◽  
Dan Fan

Author(s):  
Supakpong Jinarat ◽  
Choochart Haruechaiyasak ◽  
Arnon Rungsawang

A search engine usually returns a long list of web search results in response to a user query. Users must spend a lot of time browsing and navigating the list to find the relevant results. Many research works have applied text clustering techniques, known as web search results clustering, to address this problem. Unfortunately, each search result returned by a search engine is a very short text. It is difficult to cluster related documents into the same group because a short document carries little information. In this paper, we propose a method that clusters web search results with high clustering quality using graph-based clustering with concepts extracted from an external knowledge source. The main idea is to expand the original search results with related concept terms. We used Wikipedia as the external knowledge source for concept extraction. We compared the clustering results of our proposed method with two well-known search results clustering techniques, Suffix Tree Clustering and Lingo. The experimental results showed that our proposed method significantly outperforms both techniques.
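
The expansion idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: the concept map is a hypothetical toy stand-in for Wikipedia, and the Jaccard similarity, the 0.15 threshold, and the use of connected components as clusters are all assumptions chosen for brevity.

```python
from itertools import combinations

# Hypothetical toy stand-in for an external knowledge source such as Wikipedia.
CONCEPT_MAP = {
    "python": {"programming_language"},
    "java": {"programming_language"},
    "cobra": {"snake"},
    "viper": {"snake"},
}

def expand(snippet):
    """Return the snippet's terms plus any related concept terms."""
    terms = set(snippet.lower().split())
    for term in list(terms):
        terms |= CONCEPT_MAP.get(term, set())
    return terms

def jaccard(a, b):
    """Jaccard similarity of two term sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(snippets, threshold=0.15):
    """Link snippets whose expanded term sets are similar enough and
    return the connected components of that graph as clusters."""
    expanded = [expand(s) for s in snippets]
    parent = list(range(len(snippets)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(snippets)), 2):
        if jaccard(expanded[i], expanded[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(snippets)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

snippets = ["python tutorial", "java guide", "cobra habitat", "viper venom"]
print(cluster(snippets))
```

Without the expansion step these four snippets share no terms at all, which is the short-text problem the abstract describes; the added concept terms are what connect them.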


2011 ◽  
Vol 1 (1) ◽  
pp. 31-44 ◽  
Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates the user’s quick browsing of search results. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach to web search results clustering based on a phrase-based clustering algorithm. It is an alternative to the single ordered result list of search engines. This approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.
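
As a rough illustration of why phrase-based clustering yields readable cluster names, the sketch below groups snippets by shared word bigrams and uses each shared phrase as the cluster label. This is a simplified stand-in and not the algorithm proposed in the paper; the function names, the bigram length, and the two-document minimum are assumptions.

```python
from collections import defaultdict

def phrases(snippet, n=2):
    """Return the set of word n-grams in a snippet."""
    words = snippet.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def phrase_clusters(snippets, n=2, min_docs=2):
    """Map each phrase shared by at least min_docs snippets to the
    indices of the snippets that contain it."""
    index = defaultdict(set)
    for doc_id, snippet in enumerate(snippets):
        for phrase in phrases(snippet, n):
            index[phrase].add(doc_id)
    return {p: sorted(ids) for p, ids in index.items() if len(ids) >= min_docs}

snippets = [
    "apple pie recipe easy",
    "best apple pie ever",
    "pie chart tutorial",
    "excel pie chart tips",
]
print(phrase_clusters(snippets))
```

Each cluster is named by a phrase that actually occurs in its members, so the label is readable by construction, unlike a centroid of weighted keywords.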


Author(s):  
Constanta-Nicoleta Bodea ◽  
Maria-Iuliana Dascalu ◽  
Adina Lipai

This chapter presents a meta-search approach that delivers bibliography from the internet based on trainees’ results in an e-assessment task. The bibliography consists of web pages related to the knowledge gaps of the trainees. The meta-search engine is part of an education recommender system, attached to an e-assessment application for project management knowledge. Meta-search means that, for a specific query (or mistake made by the trainee), several search mechanisms for suitable bibliography (further reading) can be applied. The lists of results delivered by the standard search mechanisms are used to build thematically homogeneous groups using an ontology-based clustering algorithm. The clustering process uses an educational ontology and the WordNet lexical database to create its categories. The research is presented in the context of recommender systems and their various applications to the education domain.


2016 ◽  
Vol 9 (1) ◽  
pp. 152
Author(s):  
Burak Omer Saracoglu

Purpose: The electricity demand in Turkey has been increasing for a while. Hydropower is one of the major electricity generation types used to meet this demand. Private investors (domestic and foreign) in the hydropower electricity generation sector have been looking for the most appropriate and satisfactory new private hydropower investment (PHPI) options and opportunities in Turkey. This study aims to present a qualitative multi-attribute decision making (MADM) model that is easy, straightforward, and fast for selecting the most satisfactory reasonable PHPI options during the very early investment stages (when data and information on projects are scarce).
Design/methodology/approach: The data and information on the PHPI options were gathered from official records on official websites. A broad and deep literature review was conducted for the MADM models and for the hydropower industry. The attributes of the model were identified, selected, clustered and evaluated using the expert decision maker (EDM) opinion and with the help of an open source search results clustering engine (Carrot2), which also aided comprehension. The PHPI options were clustered according to their main property, installed capacity, to analyze the options in the most appropriate, decidable, informative, understandable and meaningful way. A simple clustering algorithm for the PHPI options was executed in the current study. A template model for the selection of the most satisfactory PHPI options was built in the DEXi (Decision EXpert for Education) and the DEXiTree software.
Findings: The basic attributes for the selection of the PHPI options were presented, and afterwards the aggregate attributes were defined by bottom-up structuring for the early investment stages. The attributes were also analyzed with the help of Carrot2. The most satisfactory PHPI options in Turkey in the big options data set were selected for each PHPI options cluster by the EDM evaluations in the DEXi.
Originality/value: The main contribution is the recommended DEXi PHPI selection model, supported by the search results clustering engine and demonstrated in a country-wise case, which offers the possibility of easy, meaningful and satisfying continental or worldwide applications for private investors and international financial institutions such as the African Development Bank or the World Bank.

