Topic Difficulty: Collection and Query Formulation Effects

2022 ◽  
Vol 40 (1) ◽  
pp. 1-36
Author(s):  
J. Shane Culpepper ◽  
Guglielmo Faggioli ◽  
Nicola Ferro ◽  
Oren Kurland

Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. In recent years, there has been renewed interest in the notion of “query variations”, which are essentially multiple user formulations of an information need. As with retrieval models, some queries are highly effective while others are not. This is often an artifact of the collection being searched, which might be more or less sensitive to word choice. Users rarely have perfect knowledge about the underlying collection, and so finding queries that work is often a trial-and-error process. In this work, we explore the fundamental problem of system interaction effects between collections, ranking models, and queries. To answer this important question, we formalize the analysis using ANalysis Of VAriance (ANOVA) models to measure multiple component effects across collections and topics by nesting multiple query variations within each topic. Our findings show that query formulations have an effect size comparable to that of the topic factor itself, which is known to be the factor with the greatest effect size in prior ANOVA studies. Both topic and formulation have a substantially larger effect size than any other factor, including the ranking algorithms and, surprisingly, even query expansion. This finding reinforces the importance of further research in understanding the role of query rewriting in IR-related tasks.
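The idea of nesting query formulations within topics can be illustrated with a small variance decomposition. The sketch below is not the authors' model: it assumes a balanced design with a single collection, treats systems as crossed with formulations nested in topics, and reports plain eta-squared (SS_factor / SS_total) rather than whatever effect-size estimator the paper uses. The function name `nested_eta_squared` and the `scores[t][q][s]` layout are hypothetical.

```python
from statistics import fmean


def nested_eta_squared(scores):
    """Eta-squared per factor for a balanced nested design.

    scores[t][q][s] is the effectiveness (e.g. AP) of system s on
    query formulation q of topic t. Formulations are nested in topics;
    systems are crossed with both. Assumes the scores are not all equal.
    """
    n_topics = len(scores)
    n_forms = len(scores[0])
    n_systems = len(scores[0][0])

    flat = [x for topic in scores for form in topic for x in form]
    grand = fmean(flat)
    ss_total = sum((x - grand) ** 2 for x in flat)

    # Topic effect: deviation of each topic mean from the grand mean.
    topic_means = [fmean([x for form in topic for x in form]) for topic in scores]
    ss_topic = n_forms * n_systems * sum((m - grand) ** 2 for m in topic_means)

    # Formulation-within-topic effect: deviation from the parent topic mean.
    ss_form = n_systems * sum(
        (fmean(form) - topic_means[t]) ** 2
        for t, topic in enumerate(scores)
        for form in topic
    )

    # System effect: deviation of each system mean from the grand mean.
    system_means = [
        fmean([scores[t][q][s] for t in range(n_topics) for q in range(n_forms)])
        for s in range(n_systems)
    ]
    ss_system = n_topics * n_forms * sum((m - grand) ** 2 for m in system_means)

    # Everything else (interactions and noise) is lumped into the residual.
    ss_resid = ss_total - ss_topic - ss_form - ss_system

    parts = {
        "topic": ss_topic,
        "formulation_within_topic": ss_form,
        "system": ss_system,
        "residual": ss_resid,
    }
    return {name: ss / ss_total for name, ss in parts.items()}
```

With scores that vary only across formulations of the same topic, the whole variance is attributed to the nested formulation factor, which is the kind of pattern the abstract describes as comparable in size to the topic effect.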

Information ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 39
Author(s):  
Zhenyang Li ◽  
Guangluan Xu ◽  
Xiao Liang ◽  
Feng Li ◽  
Lei Wang ◽  
...  

In recent years, entity-based ranking models have led to exciting breakthroughs in information retrieval research. Compared with traditional retrieval models, entity-based representation enables a better understanding of queries and documents. However, the existing entity-based models neglect the importance of entities in a document. This paper explores the effects of the importance of entities in a document. Specifically, a dataset analysis is conducted, which verifies the correlation between the importance of entities in a document and document ranking. Then, this paper enhances two entity-based models, a toy model and the Explicit Semantic Ranking model (ESR), by considering the importance of entities. In contrast to the existing models, the enhanced models assign the weights of entities according to their importance. Experimental results show that the enhanced toy model and ESR can outperform the two baselines by as much as 4.57% and 2.74% on NDCG@20, respectively, and further experiments reveal that the strength of the enhanced models is more evident on long queries and on queries where ESR fails, confirming the effectiveness of taking the importance of entities into account.
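For readers unfamiliar with the NDCG@20 metric reported above, a minimal sketch follows. It assumes the common formulation with linear gains and a log2 rank discount; the paper may use a different gain function (for example 2^rel - 1), and the function names and the optional `judged_gains` parameter are illustrative only.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_gains, k=20, judged_gains=None):
    """NDCG@k for one query.

    ranked_gains: relevance labels in the order the system ranked the documents.
    judged_gains: labels of all judged documents for the query, used to build
                  the ideal ranking; defaults to ranked_gains (an assumption).
    """
    pool = ranked_gains if judged_gains is None else judged_gains
    ideal = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents: define NDCG as 0
    return dcg_at_k(ranked_gains, k) / ideal
```

A ranking that already places documents in decreasing order of relevance scores 1.0; any swap that moves a more relevant document lower reduces the value.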


2016 ◽  
Vol 42 (6) ◽  
pp. 725-747 ◽  
Author(s):  
Bilel Moulahi ◽  
Lynda Tamine ◽  
Sadok Ben Yahia

With the advent of Web search and the large amount of data published on the Web, a tremendous number of documents have become strongly time-dependent. In this respect, the time dimension has been extensively exploited as a highly important relevance criterion to improve the retrieval effectiveness of document ranking models. Thus, there is compelling research interest in temporal information retrieval, which has given rise to several temporal search applications. In this article, we provide a critical overview of time-aware information retrieval models. We focus specifically on the use of timeliness and its impact on the global value of relevance as well as on retrieval effectiveness. First, we motivate the importance of temporal signals, when combined with other relevance features, in accounting for document relevance. Then, we review the relevant studies standing at the crossroads of information retrieval and time according to three common information retrieval aspects: the query level, the document content level and the document ranking model level. We organize the related temporal-based approaches around specific information retrieval tasks and, regarding the task at hand, we emphasize the importance of results presentation, particularly timelines, to the end user. We also report a set of relevant research trends and avenues that can be explored in the future.


2018 ◽  
Vol 14 (2) ◽  
pp. 138-161
Author(s):  
Liu Jie ◽  
Yuan Kerou ◽  
Zhou Jianshe ◽  
Shi Jinsheng

This article describes how more knowledge appears on the Internet than in an ontological form. Displaying results to users precisely when searching is the key issue of the research on ontology retrieval. The factors considered in ontology ranking are not limited to internal character matching; they also include the analysis of metadata, including the entities, structures and relations in ontologies. Currently, existing single-feature ranking algorithms each focus on a certain aspect of an ontology, such as its structures, elements or contents, and thus the results are not satisfactory. Combining multiple single-feature models seems to achieve better results, but the objectivity and versatility of the models' weights are debatable. Machine learning effectively solves this problem, and combining the advantages of ranking learning algorithms is the pressing issue. We therefore propose ensemble learning strategies to combine different algorithms in ontology ranking. The ranking results are more satisfactory than those of Swoogle and the base algorithms.


2015 ◽  
Vol 21 (1) ◽  
pp. 45-53 ◽  
Author(s):  
Filipe Manuel Clemente ◽  
Fernando Manuel Lourenço Martins ◽  
Rui Sousa Mendes ◽  
Francisco Campos

The aim of this study was to inspect the effects of format and task conditions on neutral players' heart rate responses and time-motion characteristics. Four formats of play using neutral players and three task conditions were inspected. Moreover, the factor repetition (3 games per SSG) was also analysed. Ten male amateur soccer players (26.36 ± 5.33 years old, 8 ± 3.2 years of practice, 66.18 ± 10.16 bpm at rest) participated in this study. The repeated-measures analysis revealed no differences between repetitions (Pillai's Trace = .075; F(8, 100) = 1.007; p-value = .436; partial η² = .075; Power = .445; small effect size). In game 1, significant interaction effects between the two factors on heart rate responses and time-motion profiles were observed (Pillai's Trace = .699; F(24, 428) = 3.774; p-value = .001; partial η² = .175; Power = 1.000; moderate effect size). In game 2, significant interaction effects between the two factors on heart rate responses and time-motion profiles were observed (Pillai's Trace = .712; F(24, 428) = 3.860; p-value = .001; partial η² = .178; Power = 1.000; moderate effect size). Finally, in game 3, significant interaction effects between the two factors on heart rate responses and time-motion profiles were observed (Pillai's Trace = .729; F(24, 428) = 3.972; p-value = .001; partial η² = .182; Power = 1.000; moderate effect size). Briefly, it was possible to conclude that the biggest formats significantly increased the heart rate responses and time-motion characteristics of neutral players. It was also possible to observe that the mean heart rate responses found in neutral players throughout small-sided games were appropriate for very light or recovery workouts.


2016 ◽  
Vol 43 (5) ◽  
pp. 665-682 ◽  
Author(s):  
Luis M. de Campos ◽  
Juan M. Fernández-Luna ◽  
Juan F. Huete

In the context of e-government and more specifically that of parliament, this paper tackles the problem of finding Members of Parliament (MPs) according to their profiles which have been built from their speeches in plenary or committee sessions. The paper presents a common solution for two problems: firstly, a member of the public who is concerned about a certain issue might want to know who the best MP is for dealing with their problem (recommending task); and secondly, each new piece of textual information that reaches the house must be correctly allocated to the appropriate MP according to its content (filtering task). This paper explores both these ways of searching for relevant people conceptually by encapsulating them into a single problem: that of searching for the relevant MP’s profile given an information need. Our research work proposes various profile construction methods (by selecting and weighting appropriate terms) and compares these using different retrieval models to evaluate their quality and suitability for different types of information needs in order to simulate real and common situations.


2021 ◽  
Author(s):  
Esther Heid ◽  
Jiannan Liu ◽  
Andrea Aude ◽  
William H. Green

Heuristic and machine learning models for rank-ordering reaction templates comprise an important basis for computer-aided organic synthesis regarding both product prediction and retrosynthetic pathway planning. Their viability relies heavily on the quality and characteristics of the underlying template database. With the advent of automated reaction and template extraction software and consequently the creation of template databases too large to be curated manually, a data-driven approach to assess and improve the quality of template sets is needed. We therefore systematically studied the influence of template generality, canonicalization and exclusivity on the performance of different template ranking models. We find that duplicate and non-exclusive templates, i.e. templates which describe the same chemical transformation on identical or overlapping sets of molecules, decrease both the accuracy of the ranking algorithm and the applicability of the respective top-ranked templates significantly. To remedy the negative effects of non-exclusivity, we developed a general and computationally efficient framework to deduplicate and hierarchically correct templates. As a result, performance improved for both heuristic and machine learning template ranking algorithms across different template sizes. The canonicalization and correction code was made freely available.


Author(s):  
M. C. Whitehead

A fundamental problem in taste research is to determine how gustatory signals are processed and disseminated in the mammalian central nervous system. An important first step toward understanding information processing is the identification of cell types in the nucleus of the solitary tract (NST) and their synaptic relationships with oral primary afferent terminals. Facial and glossopharyngeal (IX) terminals in the hamster were labelled with HRP, examined with EM, and characterized as containing moderate concentrations of medium-sized round vesicles, and engaging in asymmetrical synaptic junctions. Ultrastructurally the endings resemble excitatory synapses in other brain regions. Labelled facial afferent endings in the RC subdivision synapse almost exclusively with distal dendrites and dendritic spines of NST cells. Most synaptic relationships between the facial synapses and the dendrites are simple. However, 40% of facial endings engage in complex synaptic relationships within glomeruli containing unlabelled axon endings, particularly ones termed "SP" endings. SP endings are densely packed with small, pleomorphic vesicles and synapse with both the facial endings and their postsynaptic dendrites by means of nearly symmetrical junctions.


Author(s):  
N. D. Evans ◽  
M. K. Kundmann

Post-column energy-filtered transmission electron microscopy (EFTEM) is inherently challenging as it requires the researcher to set up, align, and control both the microscope and the energy-filter. The software behind an EFTEM system is therefore critical to efficient, day-to-day application of this technique. This is particularly the case in a multiple-user environment such as at the Shared Research Equipment (SHaRE) User Facility at Oak Ridge National Laboratory. Here, visiting researchers, who may be unfamiliar with the details of EFTEM, need to accomplish as much as possible in a relatively short period of time. We describe here our work in extending the base software of a commercially available EFTEM system in order to automate and streamline particular EFTEM tasks. The EFTEM system used is a Philips CM30 fitted with a Gatan Imaging Filter (GIF). The base software supplied with this system consists primarily of two Macintosh programs and a collection of add-ons (plug-ins) which provide the instrument control, imaging, and data analysis facilities needed to perform EFTEM.


Author(s):  
Antonia M. Milroy

In recent years many new techniques and instruments for 3-Dimensional visualization of electron microscopic images have become available. Higher accelerating voltage through thicker sections, photographed at a tilt for stereo viewing, or the use of confocal microscopy, help to analyze biological material without the necessity of serial sectioning. However, when determining the presence of neurotransmitter receptors or biochemical substances present within the nervous system, the need for good serial sectioning (Fig. 1+2) remains. The advent of computer assisted reconstruction and the possibility of feeding information from the specimen viewing chamber directly into a computer via a camera mounted on the electron microscope column, facilitates serial analysis. Detailed information observed at the subcellular level is more precise and extensive and the complexities of interactions within the nervous system can be further elucidated. We emphasize that serial ultra thin sectioning can be performed routinely and consistently in multiple user electron microscopy laboratories. Initial tissue fixation and embedding must be of high quality.


Author(s):  
H. Q. Ye ◽  
T.S. Xie ◽  
D. Li

The Ti3Al intermetallic compound has long been recognized as a potentially useful structural material. It offers attractive strength-to-weight and elastic-modulus-to-weight ratios. Recent work has established that the addition of Nb to Ti3Al ductilizes this compound. In this work the fundamental problem of this alloy, i.e. order-disorder and antiphase domain structures, was investigated at the atomic scale. The Ti3Al+10at%Nb alloys used in this study were treated at 1060°C and then aged at 700°C for 2 hours. The specimens suitable for TEM were prepared by a standard jet electrolytic-polishing technique. A JEM-200CX electron microscope with an interpretable resolution of about 0.25 nm was used for HREM. The [100] and [001] projections of the α2 phase are shown in Fig. 1. The alloy obtained consists of at least two phases: α2 (Ti3Al) and β0 structures. Moreover, a disordered α phase with a small volume fraction was also observed. Fig. 2 gives [100] and [001] diffraction patterns of the α2 phase. Since the lattice parameters of the ordered α2 phase (a=0.579, c=0.466 nm) and the disordered α phase (a0=0.294≈a/2, c0=0.468 nm) are almost the same, their diffraction patterns are difficult to distinguish when they overlap with epitaxial orientation relationships.

