scholarly journals A Context Model For Focused Web Search

2012 ◽  
Vol 2 (3) ◽  
pp. 155-162 ◽  
Author(s):  
Sushil Kumar ◽  
Naresh Chauhan

In the existing web search systems, the information retrieval isperformed using a single query and mapping it to a set ofdocuments. From a single query, however, the search systemscan only have very limited clue about the user‟s informationneed. The user‟s context and his environment are ignored whilesearching the information resulting in irrelevant search results.These irrelevant search results increase the cognitive overheadof the user in filtering them out and finding useful information.Therefore, the search systems must incorporate contextinformation regarding user and his environment search thehighly relevant web pages. This paper prepares an Entity-Centric model for the context and proposes a framework forcontext-aware focused web search system that considers thevarious context features and returns highly relevant searchresults to the user.

Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates the user’s quick browsing of search results. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach for web search results in clustering based on a phrase based clustering algorithm. It is an alternative to a single ordered result of search engines. This approach presents a list of clusters to the user. Experimental results verify the method’s feasibility and effectiveness.


Author(s):  
Daniel Crabtree

Web search engines help users find relevant web pages by returning a result set containing the pages that best match the user’s query. When the identified pages have low relevance, the query must be refined to capture the search goal more effectively. However, finding appropriate refinement terms is difficult and time consuming for users, so researchers developed query expansion approaches to identify refinement terms automatically. There are two broad approaches to query expansion, automatic query expansion (AQE) and interactive query expansion (IQE) (Ruthven et al., 2003). AQE has no user involvement, which is simpler for the user, but limits its performance. IQE has user involvement, which is more complex for the user, but means it can tackle more problems such as ambiguous queries. Searches fail by finding too many irrelevant pages (low precision) or by finding too few relevant pages (low recall). AQE has a long history in the field of information retrieval, where the focus has been on improving recall (Velez et al., 1997). Unfortunately, AQE often decreased precision as the terms used to expand a query often changed the query’s meaning (Croft and Harper (1979) identified this effect and named it query drift). The problem is that users typically consider just the first few results (Jansen et al., 2005), which makes precision vital to web search performance. In contrast, IQE has historically balanced precision and recall, leading to an earlier uptake within web search. However, like AQE, the precision of IQE approaches needs improvement. Most recently, approaches have started to improve precision by incorporating semantic knowledge.


Author(s):  
Ji-Rong Wen

The Web is an open and free environment for people to publish and get information. Everyone on the Web can be either an author, a reader, or both. The language of the Web, HTML (Hypertext Markup Language), is mainly designed for information display, not for semantic representation. Therefore, current Web search engines usually treat Web pages as unstructured documents, and traditional information retrieval (IR) technologies are employed for Web page parsing, indexing, and searching. The unstructured essence of Web pages seriously blocks more accurate search and advanced applications on the Web. For example, many sites contain structured information about various products. Extracting and integrating product information from multiple Web sites could lead to powerful search functions, such as comparison shopping and business intelligence. However, these structured data are embedded in Web pages, and there are no proper traditional methods to extract and integrate them. Another example is the link structure of the Web. If used properly, information hidden in the links could be taken advantage of to effectively improve search performance and make Web search go beyond traditional information retrieval (Page, Brin, Motwani, & Winograd, 1998, Kleinberg, 1998).


Author(s):  
Wen-Chen Hu ◽  
Hung-Jen Yang ◽  
Jyh-haw Yeh ◽  
Chung-wei Lee

The World Wide Web now holds more than six billion pages covering almost all daily issues. The Web’s fast growing size and lack of structural style present a new challenge for information retrieval (Lawrence & Giles, 1999a). Traditional search techniques are based on users typing in search keywords which the search services can then use to locate the desired Web pages. However, this approach normally retrieves too many documents, of which only a small fraction are relevant to the users’ needs. Furthermore, the most relevant documents do not necessarily appear at the top of the query output list. Numerous search technologies have been applied to Web search engines; however, the dominant search methods have yet to be identified. This article provides an overview of the existing technologies for Web search engines and classifies them into six categories: i) hyperlink exploration, ii) information retrieval, iii) metasearches, iv) SQL approaches, v) content-based multimedia searches, and vi) others. At the end of this article, a comparative study of major commercial and experimental search engines is presented, and some future research directions for Web search engines are suggested. Related Web search technology review can also be found in Arasu, Cho, Garcia-Molina, Paepcke, and Raghavan (2001) and Lawrence and Giles (1999b).


2011 ◽  
Vol 1 (1) ◽  
pp. 31-44 ◽  
Author(s):  
R. Subhashini ◽  
V.Jawahar Senthil Kumar

The World Wide Web is a large distributed digital information space. The ability to search and retrieve information from the Web efficiently and effectively is an enabling technology for realizing its full potential. Information Retrieval (IR) plays an important role in search engines. Today’s most advanced engines use the keyword-based (“bag of words”) paradigm, which has inherent disadvantages. Organizing web search results into clusters facilitates the user’s quick browsing of search results. Traditional clustering techniques are inadequate because they do not generate clusters with highly readable names. This paper proposes an approach for web search results in clustering based on a phrase based clustering algorithm. It is an alternative to a single ordered result of search engines. This approach presents a list of clusters to the user. Experimental results verify the method s feasibility and effectiveness.


Author(s):  
Constanta-Nicoleta Bodea ◽  
Maria-Iuliana Dascalu ◽  
Adina Lipai

This chapter presents a meta-search approach, meant to deliver bibliography from the internet, according to trainees’ results obtained at an e-assessment task. The bibliography consists of web pages related to the knowledge gaps of the trainees. The meta-search engine is part of an education recommender system, attached to an e-assessment application for project management knowledge. Meta-search means that, for a specific query (or mistake made by the trainee), several search mechanisms for suitable bibliography (further reading) could be applied. The lists of results delivered by the standard search mechanisms are used to build thematically homogenous groups using an ontology-based clustering algorithm. The clustering process uses an educational ontology and WordNet lexical database to create its categories. The research is presented in the context of recommender systems and their various applications to the education domain.


Author(s):  
Cédric Pruski ◽  
Nicolas Guelfi ◽  
Chantal Reynaud

Finding relevant information on the Web is difficult for most users. Although Web search applications are improving, they must be more “intelligent” to adapt to the search domains targeted by queries, the evolution of these domains, and users’ characteristics. In this paper, the authors present the TARGET framework for Web Information Retrieval. The proposed approach relies on the use of ontologies of a particular nature, called adaptive ontologies, for representing both the search domain and a user’s profile. Unlike existing approaches on ontologies, the authors make adaptive ontologies adapt semi-automatically to the evolution of the modeled domain. The ontologies and their properties are exploited for domain specific Web search purposes. The authors propose graph-based data structures for enriching Web data in semantics, as well as define an automatic query expansion technique to adapt a query to users’ real needs. The enriched query is evaluated on the previously defined graph-based data structures representing a set of Web pages returned by a usual search engine in order to extract the most relevant information according to user needs. The overall TARGET framework is formalized using first-order logic and fully tool supported.


2012 ◽  
pp. 217-238 ◽  
Author(s):  
Orland Hoeber

People commonly experience difficulties when searching the Web, arising from an incomplete knowledge regarding their information needs, an inability to formulate accurate queries, and a low tolerance for considering the relevance of the search results. While simple and easy to use interfaces have made Web search universally accessible, they provide little assistance for people to overcome the difficulties they experience when their information needs are more complex than simple fact-verification. In human-centred Web search, the purpose of the search engine expands from a simple information retrieval engine to a decision support system. People are empowered to take an active role in the search process, with the search engine supporting them in developing a deeper understanding of their information needs, assisting them in crafting and refining their queries, and aiding them in evaluating and exploring the search results. In this chapter, recent research in this domain is outlined and discussed.


2014 ◽  
Vol 971-973 ◽  
pp. 1870-1873
Author(s):  
Xiao Gang Dong

Web search engine based on DNS, the standard proposed solution of IETF for public web search system, is introduced in this paper. Now no web search engine can cover more than 60 percent of all the pages on Internet. The update interval of most pages database is almost one month. This condition hasn't changed for many years. Converge and recency problems have become the bottleneck problem of current web search engine. To solve these problems, a new system, search engine based on DNS is proposed in this paper. This system adopts the hierarchical distributed architecture like DNS, which is different from any current commercial search engine. In theory, this system can cover all the web pages on Internet. Its update interval could even be one day. The original idea, detailed content and implementation of this system all are introduced in this paper.


Author(s):  
Yan Chen ◽  
Yan-Qing Zhang

For most Web searching applications, queries are commonly ambiguous because words or phrases have different linguistic meanings for different Web users. The conventional keyword-based search engines cannot disambiguate queries to provide relevant results matching Web users’ intents. Traditional Word Sense Disambiguation (WSD) methods use statistic models or ontology-based knowledge systems to measure associations among words. The contexts of queries are used for disambiguation in these methods. However, due to the fact that numerous combinations of words may appear in queries and documents, it is difficult to extract concepts’ relations for all possible combinations. Moreover, queries are usually short, so contexts in queries do not always provide enough information to disambiguate queries. Therefore, the traditional WSD methods are not sufficient to provide accurate search results for ambiguous queries. In this chapter, a new model, Granular Semantic Tree (GST), is introduced for more conveniently representing associations among concepts than the traditional WSD methods. Additionally, users’ preferences are used to provide personalized search results that better adapt to users’ unique intents. Fuzzy logic is used to determine the most appropriate concepts related to queries based on contexts and users’ preferences. Finally, Web pages are analyzed by the GST model. The concepts of pages for the queries are evaluated, and the pages are re-ranked according to similarities of concepts between pages and queries.


Sign in / Sign up

Export Citation Format

Share Document