Next Generation Search Engines
Latest Publications


TOTAL DOCUMENTS

20
(FIVE YEARS 0)

H-INDEX

3
(FIVE YEARS 0)

Published By IGI Global

9781466603301, 9781466603318

2012 ◽  
pp. 411-437
Author(s):  
Stéphane Chaudiron ◽  
Madjid Ihadjadene

This chapter shows that the wider use of Web search engines calls for reconsidering the theoretical and methodological frameworks used to grasp new information practices. Beginning with an overview of the recent challenges implied by the dynamic nature of the Web, the chapter then traces the concepts related to information behavior in order to present the different approaches from the user perspective. The authors pay special attention to the concept of “information practice” and to related concepts such as “use”, “activity”, and “behavior”, which are widely used in the literature but not always strictly defined. The authors provide an overview of user-oriented studies that help in understanding the different contexts of use of electronic information access systems, focusing on five approaches: the system-oriented approaches, the theories of information seeking, the cognitive and psychological approaches, the management science approaches, and the marketing approaches. Future directions of work are then outlined, including social searching and the ethical, cultural, and political dimensions of Web search engines. The authors conclude by considering the importance of critical theory for better understanding the role of Web search engines in modern society.


2012 ◽  
pp. 386-409
Author(s):  
Ourdia Bouidghaghen ◽  
Lynda Tamine

The explosion of information available on the Internet has made traditional information retrieval systems, characterized by one-size-fits-all approaches, less effective. Indeed, users are overwhelmed by the information delivered by such systems in response to their queries, particularly when the queries are ambiguous. To tackle this problem, the state of the art shows a growing interest in contextual information retrieval (CIR), which relies on various sources of evidence drawn from the user’s search background and environment in order to improve retrieval accuracy. This chapter focuses on mobile contexts, highlights the challenges they present for IR, and gives an overview of CIR approaches applied in this environment. The authors then present an approach to personalizing search results for mobile users by exploiting both cognitive and spatio-temporal contexts. An experimental evaluation against Yahoo search shows that the approach improves the quality of top search result lists and enhances search result precision.


2012 ◽  
pp. 344-370
Author(s):  
Brigitte Grau

This chapter is dedicated to factual question answering, i.e., extracting precise and exact answers from texts to questions given in natural language. A question in natural language gives more information than a bag-of-words query (i.e., a query made of a list of words), and provides clues for finding precise answers. The author first presents the underlying problems, which are mainly due to linguistic variations between questions and the pieces of text that can answer them, and which affect the selection of relevant passages and the extraction of reliable answers. The author then presents how to answer factual questions in an open domain. The author also presents question answering in specialized domains, as it requires dealing with semi-structured knowledge and specialized terminologies and can lead to different applications, such as information management in corporations. Searching for answers on the Web constitutes another application frame and introduces specificities linked to Web redundancy or collaborative usage. Moreover, the Web is multilingual, and a challenging problem consists in searching for answers in documents written in a target language other than the source language of the question. For all these topics, the chapter presents the main approaches and the remaining problems.


2012 ◽  
pp. 239-273
Author(s):  
Sarah Vert

This chapter focuses on the Internet working environment of Knowledge Workers through the customization of the Web browser on their computer. Given that a Web browser is designed to be used by anyone browsing the Internet, its initial configuration must meet generic needs such as reading a Web page, searching for information, and bookmarking. In the absence of a universal solution that meets the specific needs of each user, browser developers offer additional programs known as extensions, or add-ons. Among the various browsers that can be modified with add-ons, Mozilla’s Firefox is perhaps the one that first springs to mind; indeed, Mozilla has built the Firefox brand around these extensions. Using this example, and also considering the browsers Google Chrome, Internet Explorer, Opera and Safari, the author will attempt to demonstrate the potential of Web browsers in terms of the resources they can offer when they are customizable and available within the working environment of a Knowledge Worker.


2012 ◽  
pp. 174-190
Author(s):  
Michael W. Berry ◽  
Reed Esau ◽  
Bruce Kiefer

Electronic discovery (eDiscovery) is the process of collecting and analyzing electronic documents to determine their relevance to a legal matter. Office technology has advanced and eased the requirements necessary to create a document. As such, the volume of data has outgrown the manual processes previously used to make relevance judgments. Methods of text mining and information retrieval have been put to use in eDiscovery to help tame the volume of data; however, the results have been uneven. This chapter looks at the historical bias of the collection process. The authors examine how tools like classifiers, latent semantic analysis, and non-negative matrix factorization deal with nuances of the collection process.


2012 ◽  
pp. 138-173
Author(s):  
Edmond Lassalle ◽  
Emmanuel Lassalle

Robertson and Spärck Jones pioneered experimental probabilistic models (the Binary Independence Model), combining a typology that generalizes the Boolean model, frequency counts used to calculate elementary weightings, and the combination of these weightings into a global probabilistic estimate. However, this model did not consider dependencies between indexing terms. An extension to mixture models (e.g., using a 2-Poisson law) made it possible to take these dependencies into account from a macroscopic point of view (BM25), as well as a shallow linguistic processing of co-references. New approaches (language models, for example “bag of words” models, probabilistic dependencies between requests and documents, and consequently Bayesian inference using conjugate Dirichlet priors) furnished new solutions for document structuring (categorization) and for index smoothing. At present, the main issues in these probabilistic models have been addressed from a formal point of view only, so linguistic properties are neglected in the indexing language. The authors examine how linguistic and semantic modeling can be integrated into indexing languages, and set up a hybrid model that makes it possible to deal with different information retrieval problems in a unified way.
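
For readers unfamiliar with the BM25 weighting mentioned in this abstract, the following is a minimal Python sketch of the commonly cited BM25 scoring formula. It is not taken from the chapter: the function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, the smoothed IDF variant, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Illustrative sketch only; k1 and b are conventional defaults, not values
    taken from the chapter.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1.0" keeps the weight non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical toy collection, already tokenized.
docs = [
    "probabilistic models for information retrieval".split(),
    "boolean retrieval model".split(),
    "cooking with seasonal vegetables".split(),
]
scores = bm25_scores(["probabilistic", "retrieval"], docs)
```

In this toy collection the first document matches both query terms and so ranks above the second, while the third shares no term with the query and scores zero.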


Author(s):  
Weimao Ke

The rapid growth of information today makes it increasingly challenging for people to navigate its magnitude. The dynamics and heterogeneity of large information spaces such as the Web raise important questions about information retrieval in these environments. Collecting all information in advance and centralizing IR operations are extremely difficult, if not impossible, because systems are dynamic and information is distributed. The chapter discusses some of the key issues facing classic information retrieval models and presents a decentralized, organic view of information systems pertaining to search in large-scale networks. It focuses on the impact of network structure on search performance and discusses a phenomenon the author refers to as the Clustering Paradox, in which the topology of interconnected systems imposes a scalability limit.


Author(s):  
Abhishek Das ◽  
Ankit Jain

In this chapter, the authors describe the key indexing components of today’s Web search engines. As the World Wide Web has grown, the systems and methods for indexing have changed significantly. The authors present the data structures used, the features extracted, the infrastructure needed, and the options available for designing a brand new search engine. They highlight techniques that improve the relevance of results, discuss trade-offs to best utilize machine resources, and cover distributed processing concepts in this context. In particular, the authors delve into the topics of indexing phrases instead of terms, storage in memory vs. on disk, and data partitioning. Some thoughts on information organization for newly emerging data forms conclude the chapter.
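
As a rough illustration of the kind of data structure this abstract refers to, the sketch below builds a small in-memory positional inverted index and uses it to answer phrase queries. It is an assumption-laden toy, not the authors' design: the function names `build_positional_index` and `phrase_search`, the whitespace tokenization, and the sample documents are all hypothetical.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive whitespace tokenization, for illustration only.
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return the set of doc_ids in which the terms of `phrase` occur adjacently."""
    terms = phrase.lower().split()
    if not terms or any(t not in index for t in terms):
        return set()
    # Candidate documents must contain every term of the phrase.
    candidates = set(index[terms[0]])
    for t in terms[1:]:
        candidates &= set(index[t])
    hits = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            # The i-th term must appear exactly i positions after the first.
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


# Hypothetical toy collection.
docs = {
    1: "next generation search engines",
    2: "search the next generation",
    3: "engines for search",
}
index = build_positional_index(docs)
phrase_hits = phrase_search(index, "search engines")
```

Storing positions lets a term-level index answer phrase queries at query time; indexing phrases directly, as the abstract mentions, trades extra index space for faster lookups.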


2012 ◽  
pp. 456-479
Author(s):  
Dirk Lewandowski

This chapter presents a theoretical framework for evaluating next generation search engines. The author focuses on search engines whose results presentation is enriched with additional information and does not merely present the usual list of “10 blue links,” that is, of ten links to results, accompanied by a short description. While Web search is used as an example here, the framework can easily be applied to search engines in any other area. The framework not only addresses the results presentation, but also takes into account an extension of the general design of retrieval effectiveness tests. The chapter examines the ways in which this design might influence the results of such studies and how a reliable test is best designed.


2012 ◽  
pp. 291-303
Author(s):  
Ismaïl Biskri ◽  
Louis Rompré

In this paper, the authors present research on the combination of two data mining methods: text classification and maximal association rules. Text classification has long been a focus of interest for many researchers. However, its results take the form of lists of words (classes) that people often do not know what to do with. The use of maximal association rules offers a number of advantages: (1) the detection of dependencies and correlations between the relevant units of information (words) of different classes, and (2) the extraction of hidden, often relevant knowledge from a large volume of data. The authors show how this combination can improve the process of information retrieval.
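
To make the notion of an association rule between words concrete, here is a minimal Python sketch that mines ordinary pairwise association rules using support and confidence. It deliberately implements the classic measures, not the maximal association rules studied by the authors; the function name `pair_rules`, the thresholds, and the sample word sets are hypothetical.

```python
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.7):
    """Return rules (x, y, support, confidence) meaning "x implies y".

    Classic support/confidence over word pairs; an illustrative stand-in,
    not the maximal association rules described in the abstract.
    """
    n = len(transactions)
    sets = [set(t) for t in transactions]
    items = sorted(set().union(*sets))

    def support(itemset):
        # Fraction of transactions that contain every item of `itemset`.
        return sum(1 for s in sets if itemset <= s) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        # Test the rule in both directions.
        for x, y in ((a, b), (b, a)):
            confidence = pair_support / support({x})
            if confidence >= min_confidence:
                rules.append((x, y, pair_support, confidence))
    return rules


# Hypothetical word sets, e.g. the words of four text segments.
transactions = [
    ["search", "engine", "web"],
    ["search", "engine"],
    ["search", "web"],
    ["engine", "index"],
]
rules = pair_rules(transactions)
```

With these thresholds the only rule kept is "web implies search": both words co-occur in half of the segments, and every segment containing "web" also contains "search".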

