Clustering of web search results using Suffix tree algorithm and avoidance of repetition of same images in search results using L-Point Comparison algorithm

Clustering web search results helps users find topics of interest by grouping related results. This paper presents an improved method for clustering search results, focused on Chinese web pages. Its main contributions are as follows. First, it presents a method that identifies phrases carrying complete semantic information by comparing the attributes of base clusters in the suffix tree document model and the overlap of their document sets. Second, by analyzing the content and structure of the titles and snippets of Chinese web search results, it designs and implements a sentence segmentation scheme for constructing the suffix tree. Third, to better reflect the degree of association between terms, it proposes a novel method that computes the distance between term co-occurrences at sentence granularity. Finally, experiments show that the new clustering method gives users an efficient and effective way to browse and locate the information they seek.
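To make the idea of comparing base clusters by the overlap of their document sets concrete, here is a minimal Python sketch. It is not the method of the paper above: it assumes whitespace-tokenized snippets (Chinese text would first need the segmentation step the abstract mentions), it enumerates word n-grams instead of building a real suffix tree, and it uses the merge rule from classic Suffix Tree Clustering, where two base clusters are linked when each shares more than half of its documents with the other. The function names and the `max_len` and `threshold` parameters are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def base_clusters(snippets, max_len=3):
    """Map each phrase (word n-gram) to the set of snippet indices containing it.

    A real suffix tree yields the same phrase -> documents mapping more
    efficiently; plain n-gram enumeration is used here only for brevity.
    """
    phrase_docs = defaultdict(set)
    for doc_id, text in enumerate(snippets):
        words = text.lower().split()
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                phrase_docs[" ".join(words[i:i + n])].add(doc_id)
    # A base cluster is a phrase shared by at least two snippets.
    return {p: d for p, d in phrase_docs.items() if len(d) >= 2}


def similar(docs_a, docs_b, threshold=0.5):
    """Classic STC test: the overlap must exceed the threshold on both sides."""
    common = len(docs_a & docs_b)
    return common / len(docs_a) > threshold and common / len(docs_b) > threshold


def merge_clusters(clusters, threshold=0.5):
    """Merge base clusters that are connected through the overlap test."""
    phrases = list(clusters)
    parent = {p: p for p in phrases}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for a, b in combinations(phrases, 2):
        if similar(clusters[a], clusters[b], threshold):
            parent[find(a)] = find(b)

    # Each merged cluster is a pair: (label phrases, document indices).
    merged = defaultdict(lambda: (set(), set()))
    for p in phrases:
        labels, docs = merged[find(p)]
        labels.add(p)
        docs.update(clusters[p])
    return list(merged.values())
```

Calling `merge_clusters(base_clusters(snippets))` on a list of result snippets returns one `(labels, docs)` pair per merged cluster; the label phrases are candidates for the cluster's display name.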

A New Suffix Tree Similarity Measure and Labeling for Web Search Results Clustering

2009 Second International Conference on Emerging Trends in Engineering & Technology ◽ 10.1109/icetet.2009.13 ◽ 2009 ◽ Cited By ~ 2

Author(s): Archana Kale ◽ Ujwala Bharambe ◽ M. SashiKumar

Keyword(s): Similarity Measure ◽ Suffix Tree ◽ Web Search ◽ Search Results ◽ Tree Similarity ◽ Search Results Clustering

How the interface design influences users' spontaneous trustworthiness evaluations of web search results

Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications - ETRA '10 ◽ 10.1145/1743666.1743736 ◽ 2010 ◽ Cited By ~ 16

Author(s): Yvonne Kammerer ◽ Peter Gerjets

Keyword(s): Interface Design ◽ Web Search ◽ Search Results

The Matter of Chance: Auditing Web Search Results Related to the 2020 U.S. Presidential Primary Elections Across Six Search Engines

Social Science Computer Review ◽ 10.1177/08944393211006863 ◽ 2021 ◽ pp. 089443932110068

Author(s): Aleksandra Urman ◽ Mykola Makhortykh ◽ Roberto Ulloa

Keyword(s): Search Engine ◽ Search Engines ◽ Large Scale ◽ Web Search ◽ Primary Elections ◽ Virtual Agents ◽ Search Results ◽ Presidential Primary ◽ Large Scale Analysis ◽ Algorithmic Information

We examine how six search engines filter and rank information in relation to queries on the 2020 U.S. presidential primary elections under default (that is, nonpersonalized) conditions. To do so, we use an algorithmic auditing methodology in which virtual agents conduct large-scale analysis of algorithmic information curation in a controlled environment. Specifically, we look at the text search results for the queries “us elections,” “donald trump,” “joe biden,” and “bernie sanders” on Google, Baidu, Bing, DuckDuckGo, Yahoo, and Yandex during the 2020 primaries. Our findings indicate substantial differences in the search results between search engines and multiple discrepancies within the results generated for different agents using the same search engine. This highlights that whether users see certain information is decided by chance, due to the inherent randomization of search results. We also find that some search engines prioritize different categories of information sources with respect to specific candidates. These observations demonstrate that algorithmic curation of political information can create information inequalities between search engine users even under nonpersonalized conditions. Such inequalities are particularly troubling considering that search results are highly trusted by the public and can shift the opinions of undecided voters, as demonstrated by previous research.
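One simple way to quantify discrepancies between the result lists that different agents receive for the same query is the Jaccard index of the returned URLs. The abstract does not state which metric the authors used, so the sketch below is only an illustration under that assumption; the function names are hypothetical, and the measure ignores ranking order.

```python
from itertools import combinations


def jaccard(results_a, results_b):
    """Share of distinct URLs that two result lists have in common (0.0 to 1.0)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # two empty lists are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def mean_pairwise_overlap(result_lists):
    """Average Jaccard index over every pair of agents' result lists."""
    pairs = list(combinations(result_lists, 2))
    if not pairs:
        raise ValueError("need at least two result lists")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

A value of `mean_pairwise_overlap` close to 1.0 would mean the agents saw nearly the same set of results, while lower values indicate that what a user sees varies from one agent to the next.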
