Clustering of the Web Search Results in Educational Recommender Systems

Author(s):  
Constanta-Nicoleta Bodea ◽  
Maria-Iuliana Dascalu ◽  
Adina Lipai

This chapter presents a meta-search approach meant to deliver bibliography from the Internet according to trainees’ results in an e-assessment task. The bibliography consists of web pages related to the trainees’ knowledge gaps. The meta-search engine is part of an educational recommender system attached to an e-assessment application for project management knowledge. Meta-search means that, for a specific query (or mistake made by the trainee), several search mechanisms for suitable bibliography (further reading) can be applied. The lists of results delivered by the standard search mechanisms are used to build thematically homogeneous groups with an ontology-based clustering algorithm. The clustering process uses an educational ontology and the WordNet lexical database to create its categories. The research is presented in the context of recommender systems and their various applications in the education domain.
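
A minimal sketch of how such a recommender might turn failed assessment items into bibliography queries across several engines; the names (failed_items, search_engines) and the merging policy are illustrative assumptions, not the authors’ implementation:

```python
# Illustrative sketch only: maps a trainee's failed assessment items to search
# queries and merges results from several engines, as a meta-search step.
# All names and the de-duplication policy are assumptions, not the chapter's code.
from typing import Callable, Dict, List


def recommend_bibliography(
    failed_items: Dict[str, str],                       # item id -> related concept
    search_engines: List[Callable[[str], List[str]]],   # each returns a list of URLs
    max_per_concept: int = 5,
) -> Dict[str, List[str]]:
    """Build a reading list per knowledge gap by querying several engines."""
    reading_list: Dict[str, List[str]] = {}
    for item_id, concept in failed_items.items():
        merged: List[str] = []
        for engine in search_engines:
            for url in engine(concept):
                if url not in merged:        # simple de-duplication across engines
                    merged.append(url)
        reading_list[item_id] = merged[:max_per_concept]
    return reading_list
```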

Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates users’ quick browsing of the results. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach for clustering web search results based on a phrase-based clustering algorithm, as an alternative to the single ordered result list of search engines. The approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.
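
The phrase-based idea can be sketched roughly as follows: results sharing a frequent multi-word phrase fall into the same cluster, and the phrase serves as a readable cluster label. This is only an assumed illustration in the spirit of phrase-based (e.g. suffix-tree) clustering, not the paper’s exact algorithm:

```python
# Rough phrase-based grouping of search-result snippets: snippets that share a
# frequent multi-word phrase form a cluster, and the phrase becomes its label.
# This sketch is illustrative only, not the algorithm evaluated in the paper.
from collections import defaultdict
from typing import Dict, List


def phrases(text: str, n: int = 2) -> List[str]:
    """Extract the n-word phrases (word n-grams) of a snippet."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def phrase_clusters(snippets: List[str], min_size: int = 2) -> Dict[str, List[int]]:
    """Map a candidate phrase (cluster label) to the indices of snippets containing it."""
    by_phrase: Dict[str, set] = defaultdict(set)
    for idx, snippet in enumerate(snippets):
        for p in phrases(snippet):
            by_phrase[p].add(idx)
    # keep only phrases shared by enough snippets to form a cluster
    return {p: sorted(ids) for p, ids in by_phrase.items() if len(ids) >= min_size}
```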


Author(s):  
Constanta-Nicoleta Bodea ◽  
Adina Lipai ◽  
Maria-Iuliana Dascalu

The chapter presents a meta-search tool developed to deliver search results structured according to users’ specific interests. Meta-search means that, for a specific query, several search mechanisms can be applied simultaneously. Through the clustering process, thematically homogeneous groups are built from the initial list provided by the standard search mechanisms. The results are more user-oriented, thanks to the ontological approach of the clustering process. After the initial search made on multiple search engines, the results are pre-processed and transformed into vectors of words. These vectors are mapped into vectors of concepts, by calling an educational ontology and using the WordNet lexical database. The vectors of concepts are refined through concept space graphs and projection mechanisms before the clustering procedure is applied. The chapter positions the proposed solution among other existing clustering search solutions. Implementation details and early experimentation results are also provided.
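
The word-vector to concept-vector step can be illustrated with a minimal sketch; the tiny concept lexicon below stands in for the educational ontology and WordNet, and the trivial grouping stands in for the actual clustering procedure, so all names and values are assumptions:

```python
# Sketch of mapping bags of words onto ontology concepts, then grouping results
# by their dominant concept. The lexicon and grouping are illustrative stand-ins
# for the educational ontology, WordNet lookup, and clustering in the chapter.
from collections import Counter
from typing import Dict, List

# hypothetical fragment of an educational ontology: concept -> lexicalizations
CONCEPT_LEXICON: Dict[str, List[str]] = {
    "risk_management": ["risk", "uncertainty", "mitigation"],
    "scheduling": ["schedule", "gantt", "milestone", "deadline"],
}


def to_concept_vector(words: List[str]) -> Counter:
    """Map a vector of words to a vector of ontology concepts."""
    vec: Counter = Counter()
    for concept, lexicon in CONCEPT_LEXICON.items():
        vec[concept] = sum(1 for w in words if w.lower() in lexicon)
    return vec


def cluster_by_dominant_concept(docs: List[List[str]]) -> Dict[str, List[int]]:
    """Group documents by their strongest concept (a stand-in for the clustering step)."""
    groups: Dict[str, List[int]] = {}
    for idx, words in enumerate(docs):
        vec = to_concept_vector(words)
        label = vec.most_common(1)[0][0] if any(vec.values()) else "unclassified"
        groups.setdefault(label, []).append(idx)
    return groups
```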


Author(s):  
Li Weigang ◽  
Wu Man Qi

This chapter presents a study applying Ant Colony Optimization (ACO) to the Interlegis Web portal, a Brazilian legislation Website. The AntWeb approach is inspired by the foraging behavior of ant colonies, adaptively marking the most significant link, by means of the shortest route, to reach the target pages. The system treats the users of the Web portal as artificial ants and the links among the Web pages as the search network. To identify groups of visitors, Web mining is applied to extract knowledge from preprocessed Web log files. The chapter describes the theory, model, main utilities, and implementation of the AntWeb prototype in the Interlegis Web portal. The case study covers off-line Web mining, simulations with and without AntWeb, and testing under modified parameters. The results demonstrate the sensitivity and accessibility of AntWeb and its benefits for Interlegis Web users.
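
A minimal sketch of the underlying ACO mechanics, with users’ recorded paths depositing pheromone on links while evaporation fades unused links; parameter names and values are assumptions rather than the AntWeb prototype’s settings:

```python
# Illustrative ACO-style link reinforcement: each recorded user path deposits
# "pheromone" on the links it traversed, pheromone evaporates over time, and
# the strongest outgoing link of a page is highlighted as the most significant.
from typing import Dict, List, Tuple

Link = Tuple[str, str]  # (from_page, to_page)


def update_pheromone(
    pheromone: Dict[Link, float],
    user_paths: List[List[str]],
    evaporation: float = 0.1,   # assumed evaporation rate
    deposit: float = 1.0,       # assumed deposit per traversal
) -> Dict[Link, float]:
    # evaporation step: every link loses a fraction of its pheromone
    for link in pheromone:
        pheromone[link] *= (1.0 - evaporation)
    # reinforcement step: links actually traversed by users gain pheromone
    for path in user_paths:
        for link in zip(path, path[1:]):
            pheromone[link] = pheromone.get(link, 0.0) + deposit
    return pheromone


def most_significant_link(pheromone: Dict[Link, float], page: str) -> str:
    """Return the target of the strongest outgoing link of `page`."""
    outgoing = {l: v for l, v in pheromone.items() if l[0] == page}
    return max(outgoing, key=outgoing.get)[1] if outgoing else ""
```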


Author(s):  
Ji-Rong Wen

The Web is an open and free environment in which people publish and obtain information. Everyone on the Web can be an author, a reader, or both. The language of the Web, HTML (Hypertext Markup Language), is mainly designed for information display, not for semantic representation. Therefore, current Web search engines usually treat Web pages as unstructured documents, and traditional information retrieval (IR) technologies are employed for Web page parsing, indexing, and searching. The unstructured nature of Web pages seriously hinders more accurate search and advanced applications on the Web. For example, many sites contain structured information about various products. Extracting and integrating product information from multiple Web sites could enable powerful search functions, such as comparison shopping and business intelligence. However, these structured data are embedded in Web pages, and there are no suitable traditional methods to extract and integrate them. Another example is the link structure of the Web. If used properly, the information hidden in links can be exploited to improve search performance substantially and take Web search beyond traditional information retrieval (Page, Brin, Motwani, & Winograd, 1998; Kleinberg, 1998).
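
The link-analysis idea cited above can be illustrated with a generic, textbook-style PageRank iteration; this is not the cited systems’ implementation, and it assumes every linked page also appears as a key of the link map:

```python
# Generic textbook sketch of PageRank-style link analysis: a page is important
# if important pages link to it. Illustrates the idea referenced via
# Page et al. (1998); not any engine's production code.
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power iteration over a page -> outgoing-links map.
    Assumes every linked page also appears as a key of `links`."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            if not outgoing:                           # dangling page: spread evenly
                share = damping * rank[page] / len(pages)
                for p in pages:
                    new_rank[p] += share
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank
```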


Author(s):  
Anselm Spoerri

This paper analyzes which pages and topics are most popular on Wikipedia and why. For the period from September 2006 to January 2007, the 100 most visited Wikipedia pages in each month are identified and categorized in terms of the major topics of interest. The observed topics are compared with search behavior on the Web. Search queries identical to the titles of the most popular Wikipedia pages are submitted to major search engines, and the positions of popular Wikipedia pages in the top 10 search results are determined. The presented data helps to explain how search engines, and Google in particular, fuel the growth of Wikipedia and shape what is popular on it.


2010 ◽  
Vol 04 (04) ◽  
pp. 509-534 ◽  
Author(s):  
NIRANJAN BALASUBRAMANIAN ◽  
SILVIU CUCERZAN

We investigate the automatic generation of topic pages as an alternative to the current Web search paradigm. Topic pages explicitly aggregate information across documents, filter redundancy, and promote diversity of topical aspects. We propose a novel framework for building rich topical aspect models and selecting diverse information from the Web. In particular, we use Web search logs to build aspect models with various degrees of specificity, and then employ these aspect models as input to a sentence selection method that identifies relevant and non-redundant sentences from the Web. Automatic and manual evaluations on biographical topics show that topic pages built by our system compare favorably to regular Web search results and to MDS-style summaries of the Web results on all metrics employed.
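
The relevant-but-non-redundant sentence selection can be illustrated with a generic, maximal-marginal-relevance style sketch; the scoring functions below are simplified assumptions, not the aspect models described in the paper:

```python
# Generic MMR-style selection: greedily pick sentences that match an aspect's
# terms while penalizing overlap with sentences already selected. A simplified
# stand-in for the paper's aspect-model-driven sentence selection.
from typing import List, Set


def words(text: str) -> Set[str]:
    return set(text.lower().split())


def overlap(a: str, b: str) -> float:
    wa, wb = words(a), words(b)
    return len(wa & wb) / max(1, len(wa | wb))


def select_sentences(candidates: List[str], aspect_terms: Set[str],
                     k: int = 5, redundancy_weight: float = 0.7) -> List[str]:
    """Pick up to k sentences, trading off aspect relevance against redundancy."""
    selected: List[str] = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(s: str) -> float:
            relevance = len(words(s) & aspect_terms)
            redundancy = max((overlap(s, t) for t in selected), default=0.0)
            return relevance - redundancy_weight * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```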


1999 ◽  
Vol 08 (02) ◽  
pp. 137-156 ◽  
Author(s):  
CHING-CHI HSU ◽  
CHIA-HUI CHANG

This paper describes a Web information search tool called WebYacht. The goal of WebYacht is to solve the problem of imprecise search results in current Web search engines. Due to the incomplete information given by users and the diversified information published on the Web, conventional document ranking based on an automatic assessment of document relevance to the query may not be the best approach when, as in most cases, little information is given. To clarify the ambiguity of the short queries given by users, WebYacht adopts a cluster-based browsing model as well as relevance feedback to facilitate Web information search. The idea is to let users give two to three times more feedback in the same amount of time that conventional feedback mechanisms would require. With the assistance of the cluster-based representation provided by WebYacht, a great deal of browsing labor can be saved. In this paper, we explain the techniques used in the design of WebYacht and compare the performance of the feedback interface designs with that of conventional similarity-ranking of search results.
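
The relevance-feedback component can be illustrated with a standard Rocchio-style update; the coefficients are conventional defaults, and the sketch is not claimed to be WebYacht’s actual feedback mechanism:

```python
# Standard Rocchio-style relevance feedback: move the query vector toward
# documents (or clusters) the user marked relevant and away from non-relevant
# ones. Coefficients are conventional defaults, not values from the paper.
from collections import Counter
from typing import Iterable


def rocchio(query: Counter, relevant: Iterable[Counter],
            non_relevant: Iterable[Counter],
            alpha: float = 1.0, beta: float = 0.75, gamma: float = 0.15) -> Counter:
    relevant, non_relevant = list(relevant), list(non_relevant)
    updated: Counter = Counter()
    terms = set(query)
    for doc in relevant + non_relevant:
        terms |= set(doc)
    for term in terms:
        pos = sum(d[term] for d in relevant) / len(relevant) if relevant else 0.0
        neg = sum(d[term] for d in non_relevant) / len(non_relevant) if non_relevant else 0.0
        weight = alpha * query[term] + beta * pos - gamma * neg
        if weight > 0:                      # keep only positively weighted terms
            updated[term] = weight
    return updated
```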


Author(s):  
GAURAV AGARWAL ◽  
SACHI GUPTA ◽  
SAURABH MUKHERJEE

Today, web servers are the key repositories of information, and the Internet is the source for obtaining it. There is a mammoth amount of data on the Internet, so finding the relevant data becomes a difficult job, and search engines play a vital role in locating it. A search engine follows these steps: web crawling by a crawler, indexing by an indexer, and searching by a searcher. The web crawler retrieves information from web pages by following every link on a site; this is stored by the web search engine, and the content of each page is then indexed by the indexer. The main role of the indexer is to ensure data can be retrieved quickly according to user requirements. When the client issues a query, the search engine retrieves the results corresponding to that query to provide useful output. The ambition here is to develop an algorithm for a search engine that returns the most desirable results for the user's requirements. A ranking method is used by the search engine to rank the web pages. Various ranking approaches are discussed in the literature, but in this paper a ranking algorithm is proposed that is based on a parent-child relationship. The proposed ranking algorithm builds on the priority-assignment phase of the Heterogeneous Earliest Finish Time (HEFT) algorithm, which was designed for multiprocessor task scheduling. The proposed algorithm works on three variables: the density of keywords, the number of successors of a node, and the age of the web page. Density is the frequency of occurrence of the keyword on a particular web page. The number of successors represents the outgoing links of a single web page. Age is the freshness value of the web page; the page modified most recently is the freshest page, having the smallest age or largest freshness value. The proposed technique requires that the priority of each page be set using downward rank values, and pages are arranged in ascending or descending order of their rank values. Experiments show that the algorithm is valuable: in a comparison with Google, our algorithm performs better on 70% of the test problems.
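
A hedged sketch of a score built from the three variables the abstract names (keyword density, number of successors, freshness); the combination and weights below are illustrative assumptions, not the exact downward-rank formula proposed in the paper:

```python
# Illustrative page score combining keyword density, number of successors
# (outgoing links), and page age; weights and the link/freshness factors are
# assumptions made for the sketch, not the paper's HEFT-based formula.
from typing import Dict, List


def page_score(keyword_density: float, num_successors: int, age_days: float,
               w_density: float = 0.5, w_links: float = 0.3,
               w_freshness: float = 0.2) -> float:
    freshness = 1.0 / (1.0 + age_days)            # smaller age -> fresher page
    link_factor = 1.0 / (1.0 + num_successors)    # assumed: fewer outgoing links -> more focused
    return w_density * keyword_density + w_links * link_factor + w_freshness * freshness


def rank_pages(pages: Dict[str, Dict[str, float]]) -> List[str]:
    """Return page ids sorted by descending score; each value dict holds
    'density', 'successors', and 'age_days' for one page."""
    return sorted(
        pages,
        key=lambda p: page_score(pages[p]["density"],
                                 int(pages[p]["successors"]),
                                 pages[p]["age_days"]),
        reverse=True,
    )
```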

