Relevance Assessments for Web Search Evaluation: Should We Randomise or Prioritise the Pooled Documents?

2022 ◽  
Vol 40 (4) ◽  
pp. 1-35
Author(s):  
Tetsuya Sakai ◽  
Sijie Tao ◽  
Zhaohao Zeng

In the context of depth-k pooling for constructing web search test collections, we compare two approaches to ordering pooled documents for relevance assessors: The prioritisation strategy (PRI) used widely at NTCIR, and the simple randomisation strategy (RND). In order to address research questions regarding PRI and RND, we have constructed and released the WWW3E8 dataset, which contains eight independent relevance labels for 32,375 topic-document pairs, i.e., a total of 259,000 labels. Four of the eight relevance labels were obtained from PRI-based pools; the other four were obtained from RND-based pools. Using WWW3E8, we compare PRI and RND in terms of inter-assessor agreement, system ranking agreement, and robustness to new systems that did not contribute to the pools. We also utilise an assessor activity log we obtained as a byproduct of WWW3E8 to compare the two strategies in terms of assessment efficiency. Our main findings are: (a) The presentation order has no substantial impact on assessment efficiency; (b) While the presentation order substantially affects which documents are judged (highly) relevant, the difference between the inter-assessor agreement under the PRI condition and that under the RND condition is of no practical significance; (c) Different system rankings under the PRI condition are substantially more similar to one another than those under the RND condition; and (d) PRI-based relevance assessment files (qrels) are substantially and statistically significantly more robust to new systems than RND-based ones. Finding (d) suggests that PRI helps the assessors identify relevant documents that affect the evaluation of many existing systems, including those that did not contribute to the pools.
Hence, if researchers need to evaluate their current IR systems using legacy IR test collections, we recommend the use of those constructed using the PRI approach unless they have a good reason to believe that their systems retrieve relevant documents that are vastly different from the pooled documents. While this robustness of PRI may also mean that the PRI-based pools are biased against future systems that retrieve highly novel relevant documents, one should note that there is no evidence that RND is any better in this respect.

2013 ◽  
Vol 664 ◽  
pp. 94-98
Author(s):  
Guang De Zhang

As exploration and development in the Shengli exploration area deepen, the requirements on seismic data keep rising. In recent years, differences between the two sides of the Xiaoqing River have made the importance of this problem clear. This study therefore targets the shallow Quaternary layers of the old river course within the Xiaoqing River. We analyse lithology and sedimentary characteristics using static cone penetration tests and rock core exploration, with the aim of reconstructing the near-surface deposition of the old river course. The research closely combines exploration demand with theoretical study, so it has both theoretical and practical significance.


2021 ◽  
Vol 10 (9) ◽  
pp. 1802
Author(s):  
Grzegorz Meder ◽  
Paweł Żuchowski ◽  
Wojciech Skura ◽  
Violetta Palacz-Duda ◽  
Milena Świtońska ◽  
...  

Endovascular treatment is a rapidly evolving technique; therefore, there is a constant need to evaluate this method and its modifications. This paper discusses a single-center experience and the results of switching from the stent retriever only (SO) mechanical thrombectomy (MT) to the combined approach (CA), with a stent retriever and aspiration catheters. Methods: The study involved a retrospective analysis of 70 patients undergoing MT with the use of either SO or CA. The primary endpoint was the frequency of perfect reperfusion defined as grade 3 of the modified Thrombolysis in Cerebral Infarction scale (mTICI) after the first pass. The secondary endpoints were the procedure success, defined as mTICI grades 2b-3; time of the procedure; clinical outcome, measured by 90 days’ modified Rankin Scale (mRS) score; Δ NIHSS, defined as the difference between National Institutes of Health Stroke Scale (NIHSS) score at patients’ admission and discharge; and the total number of device passes. Results: Out of the 70 patients included, 33 were treated with SO and 37 with CA. Across both groups, a total of 42 patients received intravenous recombinant tissue plasminogen activator (iv-rTPA): 20 patients (60.6%) in the SO group and 22 patients (59.5%) in the CA group (p = 1.000). There was a significant difference between the groups regarding first-pass success rate, with 46% in the CA group and 18% in the SO group (OR 3.83, 95% CI 1.28 to 11.44, p = 0.016). Complete procedure success tended to be more frequent in the CA group than in the SO group (94.6% vs. 84.8%; OR 3.13, 95% CI 0.56 to 17.34, p = 0.193), and CA tended to require a lower number of passes than SO (mean 1.76 vs. 2.09 passes per procedure, p = 0.114), yet these differences did not reach statistical significance. Mean duration of the procedure was significantly shorter in the CA group than in the SO group (49 min vs. 64 min, p = 0.017).
There was a significant difference in clinical outcomes, with higher Δ NIHSS (9.3 in the CA group vs. 6.7 in the SO group, p = 0.025) after the procedure and 90-day mRS (median 2 in the CA group vs. 4 in the SO group, p = 0.031). Conclusions: Combining stent retrievers with aspiration catheters may offer a beneficial effect on angiographic results and clinical outcomes in stroke patients undergoing endovascular treatment.
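The reported first-pass odds ratio can be checked with a short calculation. The sketch below is only an illustration: the abstract gives percentages, not counts, so the counts used here (17 of 37 for CA, 6 of 33 for SO) are assumptions inferred from the reported 46% and 18%, and the Wald interval on the log odds ratio is an assumed method, not one the abstract names. The function name `odds_ratio_wald` is hypothetical.

```python
import math


def odds_ratio_wald(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: successes and failures in the first group
    c, d: successes and failures in the second group
    z:    normal quantile (1.96 gives a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Assumed counts, inferred from the reported percentages (not stated in the abstract):
# CA: 17 of 37 first-pass successes (about 46%); SO: 6 of 33 (about 18%).
ca_success, ca_total = 17, 37
so_success, so_total = 6, 33

or_, lo, hi = odds_ratio_wald(
    ca_success, ca_total - ca_success,
    so_success, so_total - so_success,
)
print(f"OR = {or_:.3f}, 95% CI {lo:.2f} to {hi:.2f}")
```

Under these assumed counts the result is close to the reported OR of 3.83 with an interval of roughly 1.28 to 11.44, which suggests the inferred counts are plausible; it does not confirm how the authors computed their figures.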


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Huan-Hua Xu ◽  
Zhen-Hong Jiang ◽  
Cong-Shu Huang ◽  
Yu-Ting Sun ◽  
Long-Long Xu ◽  
...  

Abstract Background OPD and OPD' are the two main active components of Ophiopogon japonicus in Shenmai injection (SMI). Being isomers of each other, they are expected to have similar pharmacological activities, but the actual situation is more complicated. The difference in hemolytic behavior between OPD and OPD' in vivo and in vitro was discovered and reported by our group for the first time. In vitro, only OPD' showed a hemolytic reaction, while in vivo both OPD and OPD' caused hemolysis. In vitro, the primary cause of hemolysis has been confirmed to be related to the difference between the physical and chemical properties of OPD and OPD'. In vivo, there are two possible explanations for this phenomenon: one is that OPD is bio-transformed into OPD' or its analogues, and the other is that both OPD and OPD' are metabolized into forms that are more active in causing hemolysis. However, the mechanism of hemolysis in vivo is still unclear; in particular, the existing literature cannot explain why OPD shows inconsistent hemolytic behavior in vivo and in vitro. Therefore, studying the hemolysis caused by OPD and OPD' in vivo is of great practical significance given the increase in adverse events associated with SMI. Methods To investigate hemolysis in vivo, this study used untargeted metabolomics and lipidomics to preliminarily explore the changes in plasma metabolites and lipids of OPD- and OPD'-treated rats. Metabolomics and lipidomics analyses were performed on an ultra-high performance liquid chromatography (UPLC) system coupled with different mass spectrometers (MS) and different columns, respectively. Multivariate statistical approaches such as principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) were applied to screen the differential metabolites and lipids.
Results Both the OPD and OPD' groups experienced hemolysis. Changes in endogenous differential metabolites and differential lipids, enrichment of differential metabolic pathways, and correlation analysis of differential metabolites and lipids all indicated that the causes of hemolysis by OPD and OPD' were closely related to interference with phospholipid metabolism. Conclusions This study provides a comprehensive description of the metabolomic and lipidomic changes between OPD- and OPD'-treated rats, adding to the knowledge base of the field and providing scientific guidance for subsequent mechanism research. However, the underlying mechanism requires further research.


2012 ◽  
Vol 5 ◽  
pp. 271-276
Author(s):  
Shu Ren Zhang ◽  
Zhong Long Li

The polymer mortar and wire rope mesh reinforcement technique is a new strengthening technique that has been widely used in domestic reinforcement projects in recent years. At present there are no unified technical standards, detailed practice differs from project to project, and the reinforcement effects differ greatly. The key issue is whether the wire rope should be prestressed and, if so, by how much. The differences in practice and in reinforcement effect reflect gaps in designers' understanding of the working principle of the technique. A correct understanding of its mechanism, an objective analysis of its strengthening effect, and active research into the problems arising in engineering applications are of practical significance for promoting the healthy development of structural strengthening technology.


2019 ◽  
Vol 47 (5) ◽  
pp. 493-510 ◽  
Author(s):  
Jingran Zhang ◽  
Sevilay Onal ◽  
Rohit Das ◽  
Amanda Helminsky ◽  
Sanchoy Das

Purpose Fast fulfilment is a key performance measure in online retail, and some retailers have achieved faster times by adopting new designs in their order fulfilment infrastructure. This research empirically confirms and quantifies the fulfilment time advantage that Amazon has achieved, relative to other online retailers. The purpose of this paper is to investigate three research questions: what is the overall mean fulfilment time difference between the new logistics designs of Amazon and the alternative designs of other retailers? For each order what is the distribution of the fulfilment time difference? What is the difference in fulfilment time by product category, price and size? Design/methodology/approach This research uses an empirical method to evaluate the fulfilment time performance of consumer orders made through the Amazon website and one or more competing online retailers. For 1,000 different products two fulfilment times, one at Amazon and another at a competing omnichannel retailer, are recorded. The analysis is then focused on the comparison between this paired data. Findings The research confirms that the new logistics methods, including physical facilities, distribution networks and intelligent order processing methods, have resulted in faster order fulfilment times. The performance, though, is not universally dominant and for 33 per cent of orders, the difference is 1 day or less. The fulfilment time difference varied by product, category, price or size. Practical implications The ongoing transformation of fulfilment and logistics operations at online retailers has generated several new research questions. This includes the need to confirm the fulfilment efficiency of the new designs and specify time targets. This paper identifies the fulfilment time gap between new and traditional operations. The results suggest that store-based or distribution centre-based fulfilment strategies may not match the new designs. 
Originality/value The study provides a quantitative analysis of the fulfilment time differentials in online retailing. The critical role of fulfilment logistics in the rapidly growing online retail industry can now be better modelled and studied. The survey method representing a single buyer allows for order pair equivalency and eliminates order bias. The results suggest that new warehousing and logistics designs can lead to significantly faster fulfilment times.


2007 ◽  
Vol 24 (1) ◽  
pp. 71-73 ◽  
Author(s):  
Harry V. Wiant ◽  
John R. Brooks

Abstract The difference between the use of the arithmetic and geometric means for estimation of average stump diameter, stump cross-sectional area and estimated tree volume was investigated using measurements from 739 stumps from an Appalachian hardwood stand located in central West Virginia. Although average stump diameter, cross-sectional area, and tree volumes were statistically different between estimates based on the arithmetic and geometric mean diameter, these differences were of little practical significance. The differences in average stem diameter, cross-sectional area, tree cubic volume, and board foot volume were 0.05 in, 0.01 ft², 0.45 ft³, and 2.41 bd ft, respectively.
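The comparison between the two means can be illustrated with a small calculation. This sketch assumes each stump is measured along two diameters (an assumption; the abstract does not describe the measurement protocol), and the example values and function names are hypothetical. For an elliptical cross-section with axes d1 and d2, the area is π/4 · d1 · d2, which equals the area of a circle whose diameter is the geometric mean; the arithmetic mean is never smaller, so it gives an equal or slightly larger area.

```python
import math


def arithmetic_mean_diameter(d1, d2):
    """Arithmetic mean of two diameter measurements."""
    return (d1 + d2) / 2


def geometric_mean_diameter(d1, d2):
    """Geometric mean of two diameter measurements."""
    return math.sqrt(d1 * d2)


def cross_sectional_area_ft2(diameter_in):
    """Area in square feet of a circle whose diameter is given in inches."""
    return math.pi / 4 * (diameter_in / 12) ** 2


# Hypothetical stump measured along two axes, in inches.
d1, d2 = 14.0, 12.0

am = arithmetic_mean_diameter(d1, d2)
gm = geometric_mean_diameter(d1, d2)

print(f"arithmetic mean: {am:.3f} in, area {cross_sectional_area_ft2(am):.4f} ft^2")
print(f"geometric mean:  {gm:.3f} in, area {cross_sectional_area_ft2(gm):.4f} ft^2")
print(f"diameter difference: {am - gm:.3f} in")
```

For a stump that is only mildly out of round, the two diameters differ by a few hundredths of an inch, which is the order of magnitude of the average difference the abstract reports.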


1984 ◽  
Vol 35 (5) ◽  
pp. 709 ◽  
Author(s):  
JR Donnelly

Weaning percentage and perinatal mortality of lambs born in late winter or early spring to Merino and Border Leicester x Merino ewes grazing at several stocking rates on lucerne or phalaris-clover pastures were measured over 2 years. Weaning percentages for mature crossbred ewes declined linearly from 136 lambs per 100 ewes joined when stocked at 9 ha⁻¹ to 100 for those at 18 ha⁻¹. For mature Merino ewes, the values were 109 and 70 respectively. Weaning percentages were similar on lucerne and phalaris pastures, although 8% more lambs were born to ewes grazing on phalaris; higher mortality in lambs born as multiples eliminated the difference. Death from exposure during the first 3 days of life was the most important cause of lamb losses. For lambs born as singles to Merino ewes the probability of death from exposure was up to 0.4, and reached 0.6 for lambs born as multiples. For single and multiple lambs born to crossbred ewes equivalent probabilities were 0.25 and 0.4 respectively. These probabilities were reduced if maternal weight was high at lambing, the reduction being of practical significance in very cold weather, particularly if the proportion of multiple births was high. Under mild conditions, where the probability of death from exposure was low, reductions in mortality from high ewe weight at lambing were of little consequence. Long-term weather records kept at the experimental site near Canberra show that a high risk of death in new-born lambs is likely from early June to mid-September. Throughout this period deaths from exposure could be expected to exceed 30% in lambs born as multiples to Merino ewes.


2005 ◽  
Vol 31 ◽  
pp. 193-226 ◽  
Author(s):  
Joseph Heath

Critical response to John Rawls's The Law of Peoples has been surprisingly harsh. Most of the complaints centre on Rawls's claim that there are no obligations of distributive justice among nations. Many of Rawls's critics evidently had been hoping for a global application of the difference principle, so that wealthier nations would be bound to assign lexical priority to the development of the poorest nations, or perhaps the primary goods endowment of the poorest citizens of any nation. Their subsequent disappointment reveals that, while the reception of Rawls's political philosophy has been very broad, it has not been especially deep. Rawls has very good reason for denying that there are obligations of distributive justice in an international context.


1995 ◽  
Vol 22 ◽  
pp. 369-408 ◽  
Author(s):  
Jan Vansina

The historian of pre-nineteenth century Africa…cannot get far without the aid of archaeology. Nevertheless, historians have good reason to be cautious about historical generalisations by archaeologists and about their own use of archaeological material…: it would be a rash historian who totally accepted the conclusions of Garlake and Huffman with the same simple-minded trust as I myself accepted the conclusions of Summers and Robinson. In the beginning, historians of Africa put great store by archeology. Was its great time depth not one of the distinctive features of the history of Africa, a condition that cannot be put aside without seriously distorting the flavor of all its history? Did not the relative scarcity and the foreign authorship of most precolonial written records render archeological sources all the more precious? Did not history and archeology both deal with the reconstruction of human societies in the past? Was the difference between them not merely the result of a division of labor based on sources, so that historical reconstruction follows in time and flows from archeological reconstruction? Such considerations explain why the Journal of African History has regularly published regional archeological surveys in order to keep historians up to date.


2012 ◽  
Vol 30 (5_suppl) ◽  
pp. 41-41 ◽  
Author(s):  
Daniella J. Perlroth ◽  
Stephen F. Thompson ◽  
Yesenia Luna ◽  
Dana P. Goldman ◽  
Essy Mozaffari ◽  
...  

Background: ADT and chemotherapy use in men with mPC may differ across regions in community practice. The extent of variation could indicate whether men with mPC have appropriate access to effective treatments. Methods: We identified 16,024 men diagnosed with mPC in the Surveillance, Epidemiology, and End Results (SEER) database from 2000-2005 linked to their Medicare claims. Patients were excluded if they had a second cancer or disenrolled from Medicare Parts A or B (n=6,155), or failed to initiate therapy with ADT (n=3,400). We identified demographic and clinical information from SEER and treatments and comorbidities from J-codes and ICD-9 codes in the Medicare claims. We used regression models to estimate the probability of advancement to chemotherapy, the time from diagnosis to first ADT use, and time from first ADT to chemotherapy. Then the patient-level predicted results from these models were used to generate summary statistics by hospital service area (HSA). Results: There were 6,469 patients remaining after exclusion who were treated with ADT, and 1,198 of those received chemotherapy (19%). The median age was 76 years old, most were white (77%), married (62%), and 50% had 1 other major comorbidity (most frequent was diabetes, 21%). Men who were younger, married, with fewer comorbidities, and higher Gleason scores were statistically more likely to both receive chemotherapy and use it earlier. After adjusting for clinical and sociodemographic factors, the average time to ADT by referral region was 2.7 months but varied from 1.3 to 5.6; probability of progression to chemotherapy averaged 19% but varied from 6% to 30%, and the time from first ADT to chemotherapy averaged 19.7 months but varied from 12.9 to 25.7 months. The difference in time to ADT between regions in the 10th and 90th percentiles of use was 2.6 months, whereas for chemotherapy initiation, it was 12.4 months.
Conclusions: Our results suggest that living in different parts of the country has a substantial impact on how clinically similar patients are treated. There was substantial variation across regions in use of and time to initiation of chemotherapy for men with mPC, but not in ADT use.

