Online Clustering Algorithm for Restructuring User Web Search Results

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters lets the user browse them quickly. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach to clustering web search results based on a phrase-based clustering algorithm. Instead of the single ordered result list returned by search engines, the approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.
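As a rough illustration of phrase-based clustering with readable cluster names, the sketch below groups result snippets by the two-word phrases they share and uses each shared phrase as the cluster label. The function names, the use of bigrams, and the sample snippets are assumptions made for illustration; the abstract does not describe the paper's actual algorithm.

```python
from collections import defaultdict


def bigrams(text):
    """Return the set of two-word phrases in a snippet (lowercased)."""
    words = text.lower().split()
    return {" ".join(words[i:i + 2]) for i in range(len(words) - 1)}


def phrase_clusters(snippets, min_docs=2):
    """Map each phrase shared by at least min_docs snippets to their indices.

    The shared phrase doubles as a human-readable cluster label.
    """
    index = defaultdict(set)
    for i, snippet in enumerate(snippets):
        for phrase in bigrams(snippet):
            index[phrase].add(i)
    return {p: sorted(ids) for p, ids in index.items() if len(ids) >= min_docs}


# Hypothetical search-result snippets for an ambiguous query.
snippets = [
    "jaguar car dealer in london",
    "used jaguar car prices",
    "jaguar habitat in the rain forest",
    "rain forest animals",
]
clusters = phrase_clusters(snippets)
for label, members in clusters.items():
    print(label, members)
```

A snippet can fall into several clusters here, which is typical of phrase-based grouping; a fuller system would also merge overlapping clusters and rank the labels.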

A Roadmap to Integrate Document Clustering in Information Retrieval

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2011010103 ◽

2011 ◽

Vol 1 (1) ◽

pp. 31-44 ◽

Cited By ~ 1

Author(s):

R. Subhashini ◽

V. Jawahar Senthil Kumar

Keyword(s):

Information Retrieval ◽

Search Engines ◽

Clustering Algorithm ◽

Web Search ◽

Full Potential ◽

Digital Information ◽

Enabling Technology ◽

Clustering Techniques ◽

Search Results ◽

The World

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters lets the user browse them quickly. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach to clustering web search results based on a phrase-based clustering algorithm. Instead of the single ordered result list returned by search engines, the approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.

Clustering of the Web Search Results in Educational Recommender Systems

Educational Recommender Systems and Technologies ◽

10.4018/978-1-61350-489-5.ch007 ◽

2012 ◽

pp. 154-181 ◽

Cited By ~ 12

Author(s):

Constanta-Nicoleta Bodea ◽

Maria-Iuliana Dascalu ◽

Adina Lipai

Keyword(s):

Recommender Systems ◽

Clustering Algorithm ◽

Web Search ◽

Web Pages ◽

Lexical Database ◽

Assessment Task ◽

Search Results ◽

Meta Search ◽

Search Approach ◽

The Web

This chapter presents a meta-search approach that delivers bibliography from the internet based on trainees’ results in an e-assessment task. The bibliography consists of web pages related to the trainees’ knowledge gaps. The meta-search engine is part of an educational recommender system attached to an e-assessment application for project management knowledge. Meta-search means that, for a specific query (or mistake made by the trainee), several search mechanisms can be applied to find suitable bibliography (further reading). The result lists delivered by the standard search mechanisms are used to build thematically homogeneous groups using an ontology-based clustering algorithm. The clustering process uses an educational ontology and the WordNet lexical database to create its categories. The research is presented in the context of recommender systems and their various applications to the education domain.

CQIG: An Improved Web Search Results Clustering Algorithm

2010 Seventh Web Information Systems and Applications Conference ◽

10.1109/wisa.2010.36 ◽

2010 ◽

Author(s):

Yong-gong Ren ◽

Dan Fan

Keyword(s):

Clustering Algorithm ◽

Web Search ◽

Search Results ◽

Search Results Clustering

Improving the Retrieval of Arabic Web Search Results Using Enhanced k-Means Clustering Algorithm

Entropy ◽

10.3390/e23040449 ◽

2021 ◽

Vol 23 (4) ◽

pp. 449

Author(s):

Amjad F. Alsuhaim ◽

Aqil M. Azmi ◽

Muhammad Hussain

Keyword(s):

Information Retrieval ◽

Execution Time ◽

Clustering Algorithm ◽

Web Search ◽

Writing Style ◽

Search Query ◽

Search Results ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Ranked List

Traditional information retrieval systems return a ranked list of results to a user’s query. This list is often long, and the user cannot explore all the results retrieved. It is also ineffective for a highly ambiguous language such as Arabic. The modern writing style of Arabic omits diacritical marks, without which Arabic words become ambiguous. For a search query, the user has to skim each document to infer whether the word has the meaning they are after, which is a time-consuming task. It is hoped that clustering the retrieved documents will collate them into clear and meaningful groups. In this paper, we use an enhanced k-means clustering algorithm, which yields a faster clustering time than the regular k-means. The algorithm uses the distances calculated in previous iterations to minimize the number of distance calculations. We propose a system to cluster Arabic search results using the enhanced k-means algorithm, labeling each cluster with the most frequent word in the cluster. This system will help Arabic web users identify each cluster’s topic and go directly to the required cluster. Experimentally, the enhanced k-means algorithm reduced the execution time by 60% for the stemmed dataset and 47% for the non-stemmed dataset when compared to the regular k-means, while slightly improving the purity.
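The idea of reusing distances from previous iterations can be sketched as follows. This is a minimal illustration under one common reading of "enhanced k-means", not necessarily the paper's exact method: each point remembers the distance to its assigned centroid, and if the moved centroid is no farther away than before, the point stays put and the other distances are skipped. The function names and the toy data are assumptions; the shortcut is a heuristic and may occasionally assign a point differently from regular k-means.

```python
import math
from collections import Counter


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Return (index, distance) of the closest centroid."""
    ds = [dist(point, c) for c in centroids]
    j = min(range(len(ds)), key=ds.__getitem__)
    return j, ds[j]


def enhanced_kmeans(points, centroids, max_iter=100):
    """k-means that reuses each point's previous distance to skip comparisons."""
    centroids = [tuple(c) for c in centroids]
    # First pass: every point is compared with every centroid.
    labels, prev = map(list, zip(*(nearest(p, centroids) for p in points)))
    for _ in range(max_iter):
        # Update step: move each centroid to the mean of its members.
        moved = []
        for j, c in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            moved.append(tuple(sum(v) / len(members) for v in zip(*members))
                         if members else c)
        if moved == centroids:
            break  # centroids stopped moving
        centroids = moved
        for i, p in enumerate(points):
            d = dist(p, centroids[labels[i]])
            if d <= prev[i]:
                prev[i] = d  # no farther than before: keep the cluster, skip the rest
            else:
                labels[i], prev[i] = nearest(p, centroids)
    return labels, centroids


def label_cluster(docs):
    """Label a cluster with its most frequent word, as the abstract describes."""
    counts = Counter(word for doc in docs for word in doc.lower().split())
    return counts.most_common(1)[0][0]


# Toy 2-D vectors standing in for document vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = enhanced_kmeans(points, [(0, 0), (10, 10)])
print(labels)
print(label_cluster(["bank river water", "river bank flood", "river delta"]))
```

The saving comes from the `d <= prev[i]` branch: a point that passes this check costs one distance computation per iteration instead of one per centroid.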

A study on clustering algorithm of Web search results based on rough set

2013 IEEE 4th International Conference on Software Engineering and Service Science ◽

10.1109/icsess.2013.6615308 ◽

2013 ◽

Cited By ~ 1

Author(s):

Jin Zhang ◽

Shuxuan Chen

Keyword(s):

Rough Set ◽

Clustering Algorithm ◽

Web Search ◽

Search Results

Clustering web search results using Wikipedia resource

Computer Science and Mathematical Modelling ◽

10.5604/01.3001.0014.4437 ◽

2020 ◽

Vol 0 (10/2019) ◽

pp. 25-29

Author(s):

Chung Tran ◽

Andrzej Ameljańczyk

Keyword(s):

Clustering Algorithm ◽

Web Search ◽

New Method ◽

Affinity Propagation ◽

Search Results ◽

Knowledge Resource ◽

Affinity Propagation Clustering ◽

Popular Knowledge ◽

Global Performance ◽

Clustering Search

The paper proposes a new method for clustering search results. The method uses an external knowledge resource, which can be, for example, Wikipedia. Wikipedia, the largest encyclopedia, is a free and popular knowledge resource that is used to extract topics from short texts. Similarities between documents are calculated based on the similarities between these topics. An affinity propagation clustering algorithm is then employed to cluster web search results. The proposed method is tested on the AMBIENT dataset and evaluated within the experimental framework provided by a SemEval-2013 task. The paper also suggests a new method to compare the global performance of algorithms using multi-criteria analysis.
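The step from extracted topics to document similarities might look like the sketch below, which builds the pairwise similarity matrix that a clustering algorithm such as affinity propagation takes as input. Jaccard overlap between topic sets is an assumed measure chosen for simplicity; the abstract does not say how topic similarity is computed, and the topic names and function names are hypothetical.

```python
def topic_similarity(a, b):
    """Jaccard overlap between two topic sets (an assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_matrix(doc_topics):
    """Pairwise document similarities derived from their topic sets."""
    n = len(doc_topics)
    return [[topic_similarity(doc_topics[i], doc_topics[j]) for j in range(n)]
            for i in range(n)]


# Hypothetical topics extracted for three search-result snippets.
doc_topics = [
    {"Jaguar (animal)", "Big cat"},
    {"Jaguar (animal)", "Rainforest"},
    {"Jaguar Cars", "Automobile"},
]
S = similarity_matrix(doc_topics)
for row in S:
    print(row)
```

A matrix like `S` can be passed to an affinity propagation implementation that accepts precomputed similarities; the algorithm then selects exemplar documents without the number of clusters being fixed in advance.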

How the interface design influences users' spontaneous trustworthiness evaluations of web search results

Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications - ETRA '10 ◽

10.1145/1743666.1743736 ◽

2010 ◽

Cited By ~ 16

Author(s):

Yvonne Kammerer ◽

Peter Gerjets

Keyword(s):

Interface Design ◽

Web Search ◽

Search Results

The Matter of Chance: Auditing Web Search Results Related to the 2020 U.S. Presidential Primary Elections Across Six Search Engines

Social Science Computer Review ◽

10.1177/08944393211006863 ◽

2021 ◽

pp. 089443932110068

Author(s):

Aleksandra Urman ◽

Mykola Makhortykh ◽

Roberto Ulloa

Keyword(s):

Search Engine ◽

Search Engines ◽

Large Scale ◽

Web Search ◽

Primary Elections ◽

Virtual Agents ◽

Search Results ◽

Presidential Primary ◽

Large Scale Analysis ◽

Algorithmic Information

We examine how six search engines filter and rank information in relation to queries on the 2020 U.S. presidential primary elections under default (that is, nonpersonalized) conditions. For that, we utilize an algorithmic auditing methodology that uses virtual agents to conduct large-scale analysis of algorithmic information curation in a controlled environment. Specifically, we look at the text search results for the queries “us elections,” “donald trump,” “joe biden,” and “bernie sanders” on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex during the 2020 primaries. Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents using the same search engine. This highlights that whether users see certain information is decided by chance due to the inherent randomization of search results. We also find that some search engines prioritize different categories of information sources with respect to specific candidates. These observations demonstrate that algorithmic curation of political information can create information inequalities between search engine users even under nonpersonalized conditions. Such inequalities are particularly troubling considering that search results are highly trusted by the public and can shift the opinions of undecided voters, as demonstrated by previous research.
