CQIG: An Improved Web Search Results Clustering Algorithm

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters helps the user browse the results quickly. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach to clustering web search results based on a phrase-based clustering algorithm. It is an alternative to the single ordered result list of search engines: the approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.

Interactive system based on web search results clustering for Arabic query reformulation

2014 Third IEEE International Colloquium in Information Science and Technology (CIST) ◽

10.1109/cist.2014.7016636 ◽

2014 ◽

Cited By ~ 1

Author(s):

Issam Sahmoudi ◽

Abdelmonaime Lachkar

Keyword(s):

Web Search ◽

Interactive System ◽

Query Reformulation ◽

Search Results ◽

Search Results Clustering

Graph-Based Concept Clustering for Web Search Results

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v5i6.pp1536-1544 ◽

2015 ◽

Vol 5 (6) ◽

pp. 1536-1544 ◽

Cited By ~ 1

Author(s):

Supakpong Jinarat ◽

Choochart Haruechaiyasak ◽

Arnon Rungsawang

Keyword(s):

Search Engine ◽

Web Search ◽

Knowledge Source ◽

Short Text ◽

Clustering Techniques ◽

External Knowledge ◽

Search Results ◽

Clustering Quality ◽

Informative Content ◽

Search Results Clustering

A search engine usually returns a long list of web search results for a user’s query. Users must spend a lot of time browsing and navigating the list to find the relevant results. Many research works have applied text clustering techniques, known as web search results clustering, to this problem. Unfortunately, each search result document returned by a search engine is a very short text. It is difficult to cluster related documents into the same group because a short document has low informative content. In this paper, we propose a method that clusters web search results with high clustering quality using graph-based clustering with concepts extracted from an external knowledge source. The main idea is to expand the original search results with related concept terms. We used Wikipedia as the external knowledge source for concept extraction. We compared the clustering results of our proposed method with two well-known search results clustering techniques, Suffix Tree Clustering and Lingo. The experimental results showed that our proposed method significantly outperforms both techniques.
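
The idea of expanding short results with concept terms before graph-based clustering can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `CONCEPTS` mapping is an invented stand-in for Wikipedia-derived concepts, the snippets are made up, and the choice of Jaccard similarity, the 0.1 threshold, the removal of query terms, and the use of connected components as clusters are all simplifying assumptions.

```python
from itertools import combinations

# Toy stand-in for an external knowledge source. The paper uses Wikipedia;
# this term-to-concept mapping is invented for illustration only.
CONCEPTS = {
    "habitat": {"wildlife"},
    "rainforest": {"wildlife"},
    "cat": {"wildlife"},
    "dealership": {"automobile"},
    "sedan": {"automobile"},
    "engine": {"automobile"},
}

# Hypothetical search result snippets for the query "jaguar".
SNIPPETS = [
    "jaguar habitat rainforest",
    "big cat conservation",
    "jaguar dealership prices",
    "sedan engine review",
]


def expand(snippet, query_terms, concepts):
    """Return the snippet's terms (minus query terms) plus related concepts."""
    terms = set(snippet.lower().split()) - query_terms
    related = set()
    for term in terms:
        related |= concepts.get(term, set())
    return terms | related


def jaccard(a, b):
    """Jaccard similarity of two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(snippets, query, concepts, threshold=0.1):
    """Link snippets whose expanded term sets are similar, then return the
    connected components of that graph as clusters of snippet indices."""
    query_terms = set(query.lower().split())
    bags = [expand(s, query_terms, concepts) for s in snippets]

    # Build an undirected similarity graph as an adjacency map.
    neighbours = {i: set() for i in range(len(bags))}
    for i, j in combinations(range(len(bags)), 2):
        if jaccard(bags[i], bags[j]) >= threshold:
            neighbours[i].add(j)
            neighbours[j].add(i)

    # Depth-first search to collect connected components.
    seen, clusters = set(), []
    for start in range(len(bags)):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            node = stack.pop()
            component.append(node)
            for nxt in neighbours[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        clusters.append(sorted(component))
    return clusters


with_concepts = cluster(SNIPPETS, "jaguar", CONCEPTS)
without_concepts = cluster(SNIPPETS, "jaguar", {})
print(with_concepts)
print(without_concepts)
```

In this toy data the raw snippets share no terms once the query word is removed, so every snippet stays in its own cluster; after expansion, the shared concept terms link the two wildlife snippets and the two automobile snippets.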

A Roadmap to Integrate Document Clustering in Information Retrieval

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2011010103 ◽

2011 ◽

Vol 1 (1) ◽

pp. 31-44 ◽

Cited By ~ 1

Author(s):

R. Subhashini ◽

V. Jawahar Senthil Kumar

Keyword(s):

Information Retrieval ◽

Search Engines ◽

Clustering Algorithm ◽

Web Search ◽

Full Potential ◽

Digital Information ◽

Enabling Technology ◽

Clustering Techniques ◽

Search Results ◽

The World

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters helps the user browse the results quickly. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach to clustering web search results based on a phrase-based clustering algorithm. It is an alternative to the single ordered result list of search engines: the approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.
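
As a rough illustration of why phrase-based clustering yields readable cluster names, the Python sketch below groups snippets that share a word bigram and uses that bigram as the cluster label. It is a simplified stand-in and not the algorithm proposed in the paper: the snippets, the function names, the bigram length, and the minimum cluster size are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical search result snippets, invented for illustration.
SNIPPETS = [
    "apple pie recipe with cinnamon",
    "easy apple pie for beginners",
    "apple stock price today",
    "why the stock price fell",
]


def phrases(text, n=2):
    """Return the set of word n-grams (phrases) in a snippet."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def phrase_clusters(snippets, n=2, min_docs=2):
    """Map each phrase shared by at least min_docs snippets to the sorted
    indices of the snippets containing it. The phrase is the cluster label."""
    index = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        for phrase in phrases(text, n):
            index[phrase].add(doc_id)
    return {p: sorted(docs) for p, docs in index.items() if len(docs) >= min_docs}


clusters = phrase_clusters(SNIPPETS)
for label, members in sorted(clusters.items()):
    print(label, members)
```

Because a snippet can contain several shared phrases, the clusters produced this way may overlap, and each one carries a label taken directly from the text of its members.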

Clustering of the Web Search Results in Educational Recommender Systems

Educational Recommender Systems and Technologies ◽

10.4018/978-1-61350-489-5.ch007 ◽

2012 ◽

pp. 154-181 ◽

Cited By ~ 12

Author(s):

Constanta-Nicoleta Bodea ◽

Maria-Iuliana Dascalu ◽

Adina Lipai

Keyword(s):

Recommender Systems ◽

Clustering Algorithm ◽

Web Search ◽

Web Pages ◽

Lexical Database ◽

Assessment Task ◽

Search Results ◽

Meta Search ◽

Search Approach ◽

The Web

This chapter presents a meta-search approach meant to deliver bibliography from the internet based on trainees’ results in an e-assessment task. The bibliography consists of web pages related to the knowledge gaps of the trainees. The meta-search engine is part of an educational recommender system, attached to an e-assessment application for project management knowledge. Meta-search means that, for a specific query (or mistake made by the trainee), several search mechanisms for suitable bibliography (further reading) can be applied. The lists of results delivered by the standard search mechanisms are used to build thematically homogeneous groups using an ontology-based clustering algorithm. The clustering process uses an educational ontology and the WordNet lexical database to create its categories. The research is presented in the context of recommender systems and their various applications to the education domain.

Exploring Web Search Results Clustering

Research and Development in Intelligent Systems XXIII ◽

10.1007/978-1-84628-663-6_30 ◽

2007 ◽

pp. 393-397 ◽

Cited By ~ 2

Author(s):

Xiaoxia Wang ◽

Max Bramer

Keyword(s):

Web Search ◽

Search Results ◽

Search Results Clustering

Query log driven web search results clustering

Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval - SIGIR '14 ◽

10.1145/2600428.2609583 ◽

2014 ◽

Cited By ~ 12

Author(s):

Jose G. Moreno ◽

Gaël Dias ◽

Guillaume Cleuziou

Keyword(s):

Web Search ◽

Query Log ◽

Search Results ◽

Search Results Clustering

A qualitative multi-attribute model for the selection of the private hydropower plant investments in Turkey: By foundation of the search results clustering engine (Carrot2), hydropower plant clustering, DEXi and DEXiTree

Journal of Industrial Engineering and Management ◽

10.3926/jiem.1142 ◽

2016 ◽

Vol 9 (1) ◽

pp. 152

Author(s):

Burak Omer Saracoglu

Keyword(s):

Electricity Generation ◽

Clustering Algorithm ◽

Main Property ◽

Electricity Demand ◽

Hydropower Plant ◽

Data Set ◽

Search Results ◽

Private Investors ◽

Selection Of ◽

Search Results Clustering

Purpose: The electricity demand in Turkey has been increasing for some time. Hydropower is one of the major electricity generation types used to meet this demand. Private investors (domestic and foreign) in the hydropower electricity generation sector have been looking for the most appropriate and satisfactory new private hydropower investment (PHPI) options and opportunities in Turkey. This study aims to present a qualitative multi-attribute decision making (MADM) model that is easy, straightforward, and fast for selecting the most satisfactory reasonable PHPI options during the very early investment stages, when data and information on projects are scarce.

Design/methodology/approach: The data and information on the PHPI options were gathered from official records on official websites. A wide and deep literature review was conducted for the MADM models and for the hydropower industry. The attributes of the model were identified, selected, clustered and evaluated using expert decision maker (EDM) opinion and with the help of an open source search results clustering engine (Carrot2), which also aided comprehension. The PHPI options were clustered according to their main property, installed capacity, to analyze the options in the most appropriate, decidable, informative, understandable and meaningful way. A simple clustering algorithm for the PHPI options was executed in the current study. A template model for the selection of the most satisfactory PHPI options was built in the DEXi (Decision EXpert for Education) and the DEXiTree software.

Findings: The basic attributes for the selection of the PHPI options were presented, and afterwards the aggregate attributes were defined by bottom-up structuring for the early investment stages. The attributes were also analyzed with the help of Carrot2. The most satisfactory PHPI options in Turkey in the big options data set were selected for each PHPI options cluster by the EDM evaluations in DEXi.

Originality/value: The main contribution is the recommended DEXi PHPI selection model, supported by the search results clustering engine and applied to a country-wise case, which offers the possibility of easy, meaningful and satisfying continental or worldwide applications for private investors and international financial institutions such as the African Development Bank or the World Bank.
