Finding Information Faster by Tracing My Colleagues' Trails

Knowledge workers are confronted with the challenge of efficient information retrieval in enterprises, which is one of the most important barriers to knowledge reuse. This problem has been intensified in recent years by several organizational developments such as increasing data volume and number of data sources. In this chapter, a reference algorithm for enterprise search is developed that integrates aspects from personalized, social, collaborative, and dynamic search to account for the different natures and requirements of enterprise and web search. Because the algorithm is modular, enterprises can easily adapt it to their specific needs by instantiating its components. The components that can be configured during the adaptation process are discussed. Furthermore, the performance of a typical instance of the algorithm is investigated through a laboratory experiment. This instance is found to outperform more traditional approaches to enterprise search.

Genetic-Fuzzy Programming Based Linkage Rule Miner (GFPLR-Miner) for Entity Linking in Semantic Web

Research Anthology on Multi-Industry Uses of Genetic Programming and Algorithms ◽

10.4018/978-1-7998-8048-6.ch023 ◽

2021 ◽

pp. 447-481

Author(s):

Amit Singh ◽

Aditi Sharan

Keyword(s):

Fuzzy Logic ◽

Information Retrieval ◽

Semantic Web ◽

Real World ◽

Fuzzy Programming ◽

Data Sources ◽

Fuzzy Approach ◽

Entity Linking ◽

Efficient Information ◽

Domain Independent

This article describes how semantic web data sources follow linked data principles to facilitate efficient information retrieval and knowledge sharing. These data sources may provide complementary, overlapping, or contradicting information. In order to integrate these data sources, the authors perform entity linking. Entity linking is the task of identifying and linking entities across data sources that refer to the same real-world entities. In this work, they propose a genetic fuzzy approach to learn linkage rules for entity linking. This method is domain independent, automatic, and scalable. Their approach uses fuzzy logic to adapt the mutation and crossover rates of genetic programming to ensure guided convergence. The authors' experimental evaluation demonstrates that their approach is competitive and makes significant improvements over state-of-the-art methods.

A New Stemming Algorithm for Efficient Information Retrieval Systems and Web Search Engines

Intelligent Systems Reference Library - Multimedia Forensics and Security ◽

10.1007/978-3-319-44270-9_6 ◽

2016 ◽

pp. 117-135 ◽

Cited By ~ 3

Author(s):

Safaa I. Hajeer ◽

Rasha M. Ismail ◽

Nagwa L. Badr ◽

Mohamed Fahmy Tolba

Keyword(s):

Information Retrieval ◽

Search Engines ◽

Web Search ◽

Retrieval Systems ◽

Information Retrieval Systems ◽

Efficient Information ◽

Web Search Engines

Neural methods for effective, efficient, and exposure-aware information retrieval

ACM SIGIR Forum ◽

10.1145/3476415.3476434 ◽

2021 ◽

Vol 55 (1) ◽

pp. 1-2

Author(s):

Bhaskar Mitra

Keyword(s):

Information Retrieval ◽

Language Processing ◽

Large Scale ◽

Web Search ◽

Real Life ◽

Inverted Index ◽

Information Need ◽

Product Model ◽

Performance Improvements ◽

Deep Model

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking documents (or short passages) in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms (such as a person's name or a product model number) not seen during training, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, the retrieval involves extremely large collections (such as the document index of a commercial Web search engine) containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to efficiently retrieve from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives, besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks. We ground our contributions with a detailed survey of the growing body of neural IR literature [Mitra and Craswell, 2018].
Our key contribution towards improving the effectiveness of deep ranking models is developing the Duet principle [Mitra et al., 2017], which emphasizes the importance of incorporating evidence based on both patterns of exact term matches and similarities between learned latent representations of query and document. To efficiently retrieve from large collections, we develop a framework to incorporate query term independence [Mitra et al., 2019] into any arbitrary deep model, which enables large-scale precomputation and the use of an inverted index for fast retrieval. In the context of stochastic ranking, we further develop optimization strategies for exposure-based objectives [Diaz et al., 2020]. Finally, this dissertation also summarizes our contributions towards benchmarking neural IR models in the presence of large training datasets [Craswell et al., 2019] and explores the application of neural methods to other IR tasks, such as query auto-completion.
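
The query term independence idea in the abstract above can be illustrated with a small sketch. It is not the thesis's implementation: the per-term scorer `term_frequency_score` is a hypothetical stand-in for a learned model, and the toy documents are invented for illustration. The point is only that when a document's score is a sum of per-term scores, each (term, document) score can be precomputed and stored in an inverted index.

```python
from collections import defaultdict


def term_frequency_score(term, tokens):
    # Hypothetical stand-in for a learned model's per-term score.
    return tokens.count(term) / len(tokens)


def build_index(docs, term_score):
    """Precompute a score for every (term, document) pair in an inverted index."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        tokens = text.lower().split()
        for term in set(tokens):
            index[term][doc_id] = term_score(term, tokens)
    return index


def rank(query, index):
    """Score each document as the sum of its precomputed per-term scores."""
    totals = defaultdict(float)
    for term in query.lower().split():
        # Only documents in this term's posting list are touched.
        for doc_id, score in index.get(term, {}).items():
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


# Toy collection, invented for illustration.
docs = {
    "d1": "neural ranking models for web search",
    "d2": "inverted index structures for fast retrieval",
    "d3": "neural models and inverted index retrieval",
}
index = build_index(docs, term_frequency_score)
results = rank("neural inverted index", index)
```

At query time the model is never run; the work reduces to posting-list lookups and additions, which is what makes the approach attractive for very large collections.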

Maximum Variance Hashing via Column Generation

Mathematical Problems in Engineering ◽

10.1155/2013/379718 ◽

2013 ◽

Vol 2013 ◽

pp. 1-10

Author(s):

Lei Luo ◽

Chao Zhang ◽

Yongrui Qin ◽

Chunyuan Zhang

Keyword(s):

Column Generation ◽

Large Scale ◽

Web Search ◽

Nearest Neighbor ◽

Computational Cost ◽

Multimedia Retrieval ◽

Training Data ◽

Nonlinear Dimensionality Reduction ◽

Maximum Variance ◽

Data Volume

With the explosive growth of the data volume in modern applications such as web search and multimedia retrieval, hashing is becoming increasingly important for efficient nearest neighbor (similar item) search. Recently, a number of data-dependent methods have been developed, reflecting the great potential of learning for hashing. Inspired by maximum variance unfolding, a classic nonlinear dimensionality reduction algorithm, we propose a novel unsupervised hashing method, named maximum variance hashing, in this work. The idea is to maximize the total variance of the hash codes while preserving the local structure of the training data. To solve the derived optimization problem, we propose a column generation algorithm, which directly learns the binary-valued hash functions. We then extend it using anchor graphs to reduce the computational cost. Experiments on large-scale image datasets demonstrate that the proposed method outperforms state-of-the-art hashing methods in many cases.
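
To make the role of binary hash codes in nearest neighbor search concrete, here is a minimal sketch. It does not implement maximum variance hashing or column generation; as a plainly swapped-in technique it uses random hyperplane projections (a data-independent scheme) to produce the codes, and all function names and data are assumptions made for illustration. What it shows is the search step that any hashing method shares: items are compared by the Hamming distance between their codes.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (stand-in for learned hash functions)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Pack one bit per hyperplane: 1 if the vector lies on its non-negative side."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * w for v, w in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")


def nearest(query_code, codes):
    """Return the id of the item whose code is closest in Hamming distance."""
    return min(codes, key=lambda item_id: hamming(query_code, codes[item_id]))


# Toy data, invented for illustration.
planes = make_hyperplanes(dim=3, n_bits=16)
items = {"a": [1.0, 0.2, 0.0], "b": [-1.0, -0.2, 0.0]}
codes = {item_id: hash_code(vec, planes) for item_id, vec in items.items()}
best = nearest(hash_code([2.0, 0.4, 0.0], planes), codes)
```

A learned method such as the one described above would replace `make_hyperplanes` and `hash_code` with hash functions fitted to the training data, while the Hamming-distance lookup would stay the same.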

Query Recommendation Using Hybrid Query Relevance

Future Internet ◽

10.3390/fi10110112 ◽

2018 ◽

Vol 10 (11) ◽

pp. 112

Author(s):

Jialu Xu ◽

Feiyue Ye

Keyword(s):

Information Retrieval ◽

Information Search ◽

Search Engines ◽

Web Search ◽

Superior Performance ◽

Recommendation Algorithm ◽

Web Information ◽

Query Recommendation

With the explosion of web information, search engines have become the main tools in information retrieval. However, most queries submitted in web search are ambiguous and multifaceted. Understanding the queries and mining query intention is critical for search engines. In this paper, we present a novel query recommendation algorithm that combines query information and URL information to obtain broad and accurate query relevance. The calculation of query relevance from query information is based on query co-occurrence and query embedding vectors. Adding the ranking to query-URL pairs makes it possible to calculate the strength between query and URL more precisely. Empirical experiments are performed on the AOL log. The results demonstrate the effectiveness of our proposed query recommendation algorithm, which achieves superior performance compared to other algorithms.
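
The query-side part of such a hybrid relevance score can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the linear weighting with `alpha`, the session-based co-occurrence normalization, the toy sessions, and the two-dimensional embeddings are all hypothetical, and the URL-based component described in the abstract is left out.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sessions):
    """Count how many sessions contain each unordered pair of queries."""
    counts = Counter()
    for session in sessions:
        for a, b in combinations(sorted(set(session)), 2):
            counts[(a, b)] += 1
    return counts


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def hybrid_relevance(q1, q2, counts, n_sessions, embeddings, alpha=0.5):
    """Assumed weighting: alpha * co-occurrence rate + (1 - alpha) * cosine."""
    co = counts[tuple(sorted((q1, q2)))] / n_sessions
    emb = cosine(embeddings[q1], embeddings[q2])
    return alpha * co + (1 - alpha) * emb


def recommend(query, counts, n_sessions, embeddings):
    """Rank all other known queries by hybrid relevance to the given query."""
    candidates = [q for q in embeddings if q != query]
    return sorted(
        candidates,
        key=lambda q: hybrid_relevance(query, q, counts, n_sessions, embeddings),
        reverse=True,
    )


# Toy sessions and embeddings, invented for illustration.
sessions = [
    ["jaguar car", "jaguar price"],
    ["jaguar car", "jaguar price", "jaguar animal"],
    ["jaguar animal", "jaguar habitat"],
]
embeddings = {
    "jaguar car": [1.0, 0.0],
    "jaguar price": [0.8, 0.6],
    "jaguar animal": [0.0, 1.0],
    "jaguar habitat": [0.6, 0.8],
}
counts = cooccurrence_counts(sessions)
suggestions = recommend("jaguar car", counts, len(sessions), embeddings)
```

In this toy data, "jaguar habitat" never co-occurs with "jaguar car" but still ranks above "jaguar animal" because of its embedding similarity, which is the kind of complementary evidence a hybrid score is meant to capture.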

HISTORY OF INDIAN TRADITIONAL MEDICINE: A MEDICAL INHERITANCE.

Asian Journal of Pharmaceutical and Clinical Research ◽

10.22159/ajpcr.2018.v11i1.21893 ◽

2018 ◽

Vol 11 (1) ◽

pp. 421 ◽

Cited By ~ 2

Author(s):

Partha Pradip Adhikari ◽

Satya Bhusan Paul

Keyword(s):

Traditional Medicine ◽

Web Search ◽

Care Service ◽

Health Care Service ◽

Modern Medicine ◽

Traditional Medicines ◽

Regional Effects ◽

Practice Of Medicine ◽

History Of ◽

Traditional Approaches

Objective: Indian Traditional Medicine, the foundation of the age-old practice of medicine in the world, has played an essential role in human health care service and welfare from its inception. Likewise, all traditional medicines have their own regional effects and are dominant in the West Asian nations (India, Pakistan, Tibet, and so forth), the East Asian nations (China, Korea, Japan, Vietnam, and so forth), Africa, and South and Central America. This article is an attempt to illuminate Indian traditional medical service and its importance, based on recent methodical reviews. Methods: Web search engines, for example Google, Science Direct, and Google Scholar, were employed for reviews as well as for meta-analysis. Results: There is a long-running debate between individuals who use Indian Traditional Medicines for different ailments and disorders and individuals who depend on present-day modern medicine for a cure. The debate between modern medicine and traditional medicines comes down to a basic truth: each person, regardless of education or sickness, ought to be educated about the facts concerning their illness and the associated side effects of medicines. Therapeutic knowledge of Indian traditional medicine has propelled various traditional approaches with similar or different theories and methodologies, which are of regional significance. Conclusion: To extend research on Indian Traditional Medicine in the near future and to explore its phytochemicals, the current review will help investigators involved in traditional medicinal pursuits.

Using Traces to Investigate Self-Regulatory Activities: A Study of Self-Regulation and Achievement Goal Profiles in the Context of Web Search for Academic Tasks

Journal of Cognitive Education and Psychology ◽

10.1891/1945-8959.12.3.287 ◽

2013 ◽

Vol 12 (3) ◽

pp. 287-305 ◽

Cited By ~ 11

Author(s):

Mingming Zhou

Keyword(s):

Web Search ◽

Self Regulation ◽

Study Strategies ◽

Academic Tasks ◽

Self Regulated Learning ◽

Chinese University ◽

Learning Contexts ◽

Mixed Pattern ◽

Traditional Approaches ◽

Specific Learning

Traditional approaches to researching self-regulated learning (SRL) fail to capture how learners actually employ studying tactics, how tactics are strategically adapted to specific learning contexts, and how learners adapt tactics and interweave them to form an efficient strategy. Computer traces can capture SRL “on the fly,” and enable researchers to track learning events in a nonlinear environment without disrupting the learner’s thinking or navigation through content. More importantly, data obtained in real time allow “virtual” re-creation of learners’ actions during studying. Traces were collected from 107 Chinese university students while they solved assigned problems by searching the web. Linking their regulatory activities during online search to their goal profiles showed that mastery-approach-dominant students were the most strategic, whereas performance-avoidance-dominant students were the least. Moderately motivated students showed a mixed pattern of deep and surface study strategies. Implications of the findings are also discussed.

Information Retrieval and Web Search

Handbook of Linear Algebra ◽

10.1201/9781420010572-63 ◽

2006 ◽

pp. 63-1-63-16

Author(s):

Amy N. Langville ◽

Carl D. Meyer

Keyword(s):

Information Retrieval ◽

Web Search

Information Retrieval

The Oxford Handbook of Computational Linguistics 2nd edition ◽

10.1093/oxfordhb/9780199573691.013.022 ◽

2016 ◽

Author(s):

Qiaozhu Mei ◽

Dragomir Radev

Keyword(s):

Information Retrieval ◽

Digital Libraries ◽

Web Search ◽

Retrieval System ◽

Information Retrieval System ◽

Information Need ◽

System A ◽

Recent Developments ◽

Text Information ◽

Text Information Retrieval

This chapter is a basic introduction to text information retrieval. Information Retrieval (IR) refers to the activities of obtaining information resources (usually in the form of textual documents) from a much larger collection, which are relevant to an information need of the user (usually expressed as a query). Practical instances of an IR system include digital libraries and Web search engines. This chapter presents the typical architecture of an IR system, an overview of the methods corresponding to the design and the implementation of each major component of an information retrieval system, a discussion of evaluation methods for an IR system, and finally a summary of recent developments and research trends in the field of information retrieval.
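
As a small illustration of what evaluating an IR system can involve, the sketch below computes two widely used measures, precision at k and average precision, for a single ranked result list. The chapter's own choice of measures is not stated in the abstract, so this selection, along with the document ids and relevance judgments, is an assumption made for illustration.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant document
    appears, divided by the total number of relevant documents."""
    hits = 0
    total = 0.0
    for position, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


# Hypothetical system output and relevance judgments for one query.
ranked = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d3", "d2", "d9"}

p_at_3 = precision_at_k(ranked, relevant, 3)
ap = average_precision(ranked, relevant)
```

Here one of the top three results is relevant, and the relevant document "d9" is never retrieved, so it lowers average precision without affecting precision at 3.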

Memory versus logic: two models of organizing information and their influences on web retrieval strategies

tripleC Communication Capitalism & Critique Open Access Journal for a Global Sustainable Information Society ◽

10.31269/triplec.v4i2.34 ◽

2006 ◽

Vol 4 (2) ◽

pp. 178-186

Author(s):

Teresa Numerico

Keyword(s):

Information Retrieval ◽

Web Search ◽

Data Representation ◽

Philosophical Tradition ◽

Formal Representation ◽

Mathematical Functions ◽

Von Neumann ◽

Vannevar Bush ◽

Social Topology ◽

Information Retrieval Methods

We can find the first anticipation of the World Wide Web's hypertextual structure in Bush's paper of 1945, where he described a “selection” and storage machine called the Memex, capable of keeping the useful information of a user and connecting it to other relevant material present in the machine or added by other users. We will argue that Vannevar Bush, who conceived this type of machine, did so because of his involvement with analogue devices. During the 1930s, in fact, he invented and built the Differential Analyzer, a powerful analogue machine used to calculate various relevant mathematical functions. The model of the Memex is not the digital one, because it relies on another form of data representation that emulates the procedures of memory more than the attitude of the logic used by the intellect. Memory seems to select and arrange information according to association strategies, i.e., using analogies and connections that are very often arbitrary, sometimes even chaotic and completely subjective. The organization of information and the knowledge creation process suggested by logic and symbolic formal representation of data is deeply different from the former one, though the logic approach is at the core of the birth of computer science (i.e., the Turing Machine and the Von Neumann Machine). We will discuss the issues raised by these two “visions” of information management and the influences of the philosophical tradition of the theory of knowledge on the hypertextual organization of content. We will also analyze the consequences of these different attitudes with respect to information retrieval techniques in a hypertextual environment such as the web. Our position is that it is necessary to take into account the nature and the dynamic social topology of the network when we choose information retrieval methods for the network; otherwise, we risk creating a misleading service for the end user of web search tools (i.e., search engines).
