ranking algorithms
Recently Published Documents


TOTAL DOCUMENTS

215
(FIVE YEARS 60)

H-INDEX

22
(FIVE YEARS 3)

2022 ◽  
Vol 40 (1) ◽  
pp. 1-36
Author(s):  
J. Shane Culpepper ◽  
Guglielmo Faggioli ◽  
Nicola Ferro ◽  
Oren Kurland

Several recent studies have explored the interaction effects between topics, systems, corpora, and components when measuring retrieval effectiveness. However, all of these previous studies assume that a topic or information need is represented by a single query. In reality, users routinely reformulate queries to satisfy an information need. In recent years, there has been renewed interest in the notion of “query variations”, which are essentially multiple user formulations of an information need. As with many retrieval models, some queries are highly effective while others are not. This is often an artifact of the collection being searched, which might be more or less sensitive to word choice. Users rarely have perfect knowledge about the underlying collection, and so finding queries that work is often a trial-and-error process. In this work, we explore the fundamental problem of system interaction effects between collections, ranking models, and queries. To answer this important question, we formalize the analysis using ANalysis Of VAriance (ANOVA) models to measure the effects of multiple components across collections and topics by nesting multiple query variations within each topic. Our findings show that query formulations have an effect size comparable to that of the topic factor itself, which is known to be the factor with the greatest effect size in prior ANOVA studies. Both topic and formulation have a substantially larger effect size than any other factor, including the ranking algorithms and, surprisingly, even query expansion. This finding reinforces the importance of further research in understanding the role of query rewriting in IR-related tasks.


2022 ◽  
Author(s):  
Mahsa Derakhshan ◽  
Negin Golrezaei ◽  
Vahideh Manshadi ◽  
Vahab Mirrokni

On online platforms, consumers face an abundance of options that are displayed in the form of a position ranking. Only products placed in the first few positions are readily accessible to the consumer, and she needs to exert effort to access more options. For such platforms, we develop a two-stage sequential search model where, in the first stage, the consumer sequentially screens positions to observe the preference weight of the products placed in them and forms a consideration set. In the second stage, she observes the additional idiosyncratic utility that she can derive from each product and chooses the highest-utility product within her consideration set. For this model, we first characterize the optimal sequential search policy of a welfare-maximizing consumer. We then study how platforms with different objectives should rank products. We focus on two objectives: (i) maximizing the platform’s market share and (ii) maximizing the consumer’s welfare. Somewhat surprisingly, we show that ranking products in decreasing order of their preference weights does not necessarily maximize market share or consumer welfare. Such a ranking may shorten the consumer’s consideration set due to the externality effect of high-positioned products on low-positioned ones, leading to insufficient screening. We then show that both problems—maximizing market share and maximizing consumer welfare—are NP-complete. We develop novel near-optimal polynomial-time ranking algorithms for each objective. Further, we show that, even though ranking products in decreasing order of their preference weights is suboptimal, such a ranking enjoys strong performance guarantees for both objectives. We complement our theoretical developments with numerical studies using synthetic data, in which we show (1) that heuristic versions of our algorithms that do not rely on model primitives perform well and (2) that our model can be effectively estimated using a maximum likelihood estimator. 
This paper was accepted by Gabriel Weintraub, revenue management and market analytics.


2022 ◽  
pp. 21-28
Author(s):  
Dijana Oreški ◽  

The ability to generate data has never been as powerful as today, when three quintillion bytes of data are generated daily. In the field of machine learning, a large number of algorithms have been developed, which can be used for intelligent data analysis and to solve prediction and descriptive problems in different domains. These algorithms perform differently on different problems. If one algorithm works better on one dataset, the same algorithm may work worse on another dataset. The reason is that each dataset has different features in terms of local and global characteristics. It is therefore imperative to know the intrinsic behavior of algorithms on different types of datasets and to choose the right algorithm for the problem at hand. To address this problem, this paper makes a scientific contribution to the meta-learning field by proposing a framework for identifying the specific characteristics of datasets in two domains of social sciences, education and business, and develops meta models based on: ranking algorithms, calculating the correlation of ranks, developing a multi-criteria model, a two-component index, and prediction based on machine learning algorithms. Each of the meta models serves as the basis for the development of a version of an intelligent system. Application of such a framework should include a comparative analysis of a large number of machine learning algorithms on a large number of datasets from the social sciences.


Information ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 500
Author(s):  
François Fouss ◽  
Elora Fernandes

Providing fair and convenient comparisons between recommendation algorithms—where algorithms could focus on a traditional dimension (accuracy) and/or less traditional ones (e.g., novelty, diversity, serendipity, etc.)—is a key challenge in the recent developments of recommender systems. This paper focuses on novelty and presents a new, closer-to-reality model for evaluating the quality of a recommendation algorithm by reducing the popularity bias inherent in traditional training/test set evaluation frameworks, which are biased by the dominance of popular items and their inherent features. In the suggested model, each interaction has a probability of being included in the test set that randomly depends on a specific feature related to the focused dimension (novelty in this work). The goal of this paper is to reconcile, in terms of evaluation (and therefore comparison), the accuracy and novelty dimensions of recommendation algorithms, leading to a more realistic comparison of their performance. The results obtained from two well-known datasets show the evolution of the behavior of state-of-the-art ranking algorithms when novelty is progressively, and fairly, given more importance in the evaluation procedure, and could lead to potential changes in the decision processes of organizations involving recommender systems.
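The sampling idea in this abstract, where an interaction's chance of landing in the test set depends on a novelty-related feature, can be illustrated with a minimal sketch. This is not the authors' model: the novelty score (one minus relative item popularity), the blending parameter `alpha`, and the function name `novelty_weighted_split` are all hypothetical choices made here for illustration.

```python
import random
from collections import Counter
from typing import List, Tuple

Interaction = Tuple[str, str]  # (user_id, item_id)


def novelty_weighted_split(
    interactions: List[Interaction],
    test_fraction: float = 0.2,
    alpha: float = 0.5,
    seed: int = 0,
) -> Tuple[List[Interaction], List[Interaction]]:
    """Split interactions so that novel (unpopular) items are over-sampled in the test set.

    alpha = 0 gives a plain random split; alpha = 1 makes the inclusion
    probability fully proportional to the (assumed) novelty score.
    """
    counts = Counter(item for _user, item in interactions)
    max_count = max(counts.values())
    # Assumed novelty score: 0 for the most popular item, close to 1 for rare items.
    novelty = {item: 1.0 - c / max_count for item, c in counts.items()}
    weights = [(1.0 - alpha) + alpha * novelty[item] for _user, item in interactions]
    mean_w = sum(weights) / len(weights)

    rng = random.Random(seed)
    train: List[Interaction] = []
    test: List[Interaction] = []
    for interaction, w in zip(interactions, weights):
        # Scale so the expected test share stays near test_fraction (before capping at 1).
        p = test_fraction if mean_w == 0 else min(1.0, test_fraction * w / mean_w)
        (test if rng.random() < p else train).append(interaction)
    return train, test


# Toy data: one very popular item and one niche item.
interactions = [(f"u{i}", "hit") for i in range(900)] + [(f"v{i}", "niche") for i in range(100)]
train, test = novelty_weighted_split(interactions, alpha=1.0)
print(len(train), len(test))
```

Raising `alpha` step by step from 0 to 1 mimics giving novelty progressively more importance in the evaluation, at the cost of a test set that no longer mirrors the popularity distribution of the training data.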


2021 ◽  
Author(s):  
Jian Zhou ◽  
Lin Feng ◽  
Shenglan Liu ◽  
Jie Yang ◽  
Ning Cai

The evaluation of scientific articles has always been a very challenging task because of the dynamic change of citation networks. Over the past decades, plenty of studies have been conducted on this topic. However, most of the current methods do not consider the link weightings between different networks, which might lead to biased article ranking results. To tackle this issue, we develop a weighted P-Rank algorithm based on a heterogeneous scholarly network for article ranking evaluation. In this study, the corresponding link weightings in the heterogeneous scholarly network can be updated by calculating citation relevance, authors’ contribution, and journals’ impact. To further boost the performance, we also employ the time information of each article as a personalized PageRank vector to balance the bias to earlier publications in the dynamic citation network. The experiments are conducted on three public datasets (arXiv, Cora, and MAG). The experimental results demonstrate that the weighted P-Rank algorithm significantly outperforms other ranking algorithms on the arXiv and MAG datasets, while it achieves competitive performance on the Cora dataset. Under different network configuration conditions, it can be found that the best ranking result can be obtained by jointly utilizing all kinds of weighted information.
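To make the combination of weighted links and a time-based personalization vector concrete, here is a minimal sketch of weighted personalized PageRank on a single citation graph. It is not the weighted P-Rank algorithm itself, which operates on a heterogeneous network; the exponential recency weighting, the reference year 2021, the decay constant, and the toy graph are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple


def weighted_personalized_pagerank(
    edges: Dict[Tuple[str, str], float],
    personalization: Dict[str, float],
    damping: float = 0.85,
    iterations: int = 100,
) -> Dict[str, float]:
    """Power iteration for PageRank with weighted edges and a teleport vector.

    edges maps (citing, cited) pairs to a positive link weight;
    personalization maps every node to a non-negative teleport preference.
    """
    nodes = sorted(personalization)
    total = sum(personalization.values())
    teleport = {n: personalization[n] / total for n in nodes}

    # Total outgoing weight per node, used to normalise each node's links.
    out_weight = {n: 0.0 for n in nodes}
    for (src, _dst), w in edges.items():
        out_weight[src] += w

    rank = dict(teleport)
    for _ in range(iterations):
        # Nodes without outgoing links hand their score back via the teleport vector.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        new_rank = {n: (1.0 - damping + damping * dangling) * teleport[n] for n in nodes}
        for (src, dst), w in edges.items():
            new_rank[dst] += damping * rank[src] * w / out_weight[src]
        rank = new_rank
    return rank


# Toy citation graph: an edge (X, Y) means "X cites Y"; weights are illustrative.
years = {"A": 2010, "B": 2015, "C": 2020}
edges = {("B", "A"): 1.0, ("C", "A"): 0.5, ("C", "B"): 2.0}

# Assumed recency preference: newer articles receive more teleport mass.
recency = {p: math.exp(-(2021 - y) / 5.0) for p, y in years.items()}
uniform = {p: 1.0 for p in years}

print(weighted_personalized_pagerank(edges, uniform))
print(weighted_personalized_pagerank(edges, recency))
```

With a uniform teleport vector the oldest, most-cited article dominates; shifting teleport mass toward recent articles raises the score of the newest article, which has had no time to collect citations.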


2021 ◽  
Author(s):  
Esther Heid ◽  
Jiannan Liu ◽  
Andrea Aude ◽  
William H. Green

Heuristic and machine learning models for rank-ordering reaction templates comprise an important basis for computer-aided organic synthesis regarding both product prediction and retrosynthetic pathway planning. Their viability relies heavily on the quality and characteristics of the underlying template database. With the advent of automated reaction and template extraction software and consequently the creation of template databases too large to be curated manually, a data-driven approach to assess and improve the quality of template sets is needed. We therefore systematically studied the influence of template generality, canonicalization and exclusivity on the performance of different template ranking models. We find that duplicate and non-exclusive templates, i.e., templates which describe the same chemical transformation on identical or overlapping sets of molecules, decrease both the accuracy of the ranking algorithm and the applicability of the respective top-ranked templates significantly. To remedy the negative effects of non-exclusivity, we developed a general and computationally efficient framework to deduplicate and hierarchically correct templates. As a result, performance improved for both heuristic and machine learning template ranking algorithms across different template sizes. The canonicalization and correction code was made freely available.


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257784
Author(s):  
Rajaneesh K. Gupta ◽  
Enyinna L. Nwachuku ◽  
Benjamin E. Zusman ◽  
Ruchira M. Jha ◽  
Ava M. Puccio

Drug repurposing has the potential to bring existing de-risked drugs into effective intervention in the ongoing COVID-19 pandemic, which has infected over 131 million people, with 2.8 million succumbing to the illness globally (as of April 04, 2021). We have used a novel “gene signature”-based drug repositioning strategy by applying widely accepted gene ranking algorithms to prioritize FDA-approved or under-trial drugs. We mined publicly available RNA sequencing (RNA-Seq) data using CLC Genomics Workbench 20 (QIAGEN) and identified 283 differentially expressed genes (FDR<0.05, log2FC>1) after a meta-analysis of three independent studies which were based on severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) infection in primary human airway epithelial cells. Ingenuity Pathway Analysis (IPA) revealed that SARS-CoV-2 activated key canonical pathways and gene networks that intricately regulate general anti-viral as well as specific inflammatory pathways. A drug database, extracted from Metacore and IPA, identified 15 drug targets (with information on COVID-19 pathogenesis) with 46 existing drugs as potential novel candidates for repurposing for COVID-19 treatment. We found 35 novel drugs that inhibit targets (ALPL, CXCL8, and IL6) already in clinical trials for COVID-19. Also, we found 6 existing drugs against 4 potential anti-COVID-19 targets (CCL20, CSF3, CXCL1, CXCL10) that might have novel anti-COVID-19 indications. Finally, these drug targets were computationally prioritized based on gene ranking algorithms, which revealed CXCL10 as the common and strongest candidate with 2 existing drugs. Furthermore, the list of 283 SARS-CoV-2-associated proteins could be valuable not only as anti-COVID-19 targets but also for COVID-19 biomarker development.


Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1767
Author(s):  
Jian-Wen Chen ◽  
Wan-Ju Lin ◽  
Chun-Yuan Lin ◽  
Che-Lun Hung ◽  
Chen-Pang Hou ◽  
...  

Benign prostatic hyperplasia (BPH) is the main cause of lower urinary tract symptoms (LUTS) in aging males. Transurethral resection of the prostate (TURP) surgery is performed by passing a cystoscope through the urethra and scraping off the prostate piece by piece with a cutting loop. Although TURP is a minimally invasive procedure, bleeding is still the most common complication. Therefore, the evaluation, monitoring, and prevention of intraoperative bleeding during TURP are very important issues. The main idea of this study is to rank bleeding levels during TURP surgery from videos. Generally, judging the bleeding level by eye from surgery videos is a difficult task, which requires sufficiently experienced urologists. In this study, machine learning-based ranking algorithms are proposed to efficiently evaluate the ranking of blood levels. Based on the visual clarity of the surgical field, four rankings of blood levels (score 0: excellent; score 1: acceptable; score 2: slightly bad; and score 3: bad) were identified by urologists who have sufficient experience in TURP surgery. The results of extensive experiments show that the revised accuracy can reach 90, 89, 90, and 91%, respectively. In particular, the results reveal that the proposed methods were capable of classifying the ranking of bleeding level accurately and efficiently, reducing the burden on urologists.


2021 ◽  
Author(s):  
Kedan He

The rapid emergence of novel psychoactive substances (NPS) poses new challenges and requirements for forensic testing/analysis techniques. This paper aims to explore the application of unsupervised clustering of NPS compounds' infrared spectra. Two statistical measures, Pearson and Spearman, were used to quantify the spectral similarity and to generate the affinity matrices for hierarchical clustering. The correspondence of spectral similarity clustering trees to the commonly used structural/pharmacological categorization was evaluated and compared to the clustering generated using 2D/3D molecular fingerprints. Hybrid model feature selections were applied using different filter-based feature ranking algorithms developed for unsupervised clustering tasks. Since Spearman tends to overestimate the spectral similarity based on the overall pattern of the full spectrum, the clustering result shows the highest degree of improvement from having the non-discriminative features removed. The loading plots of the first two principal components (PCs) of the optimal feature subsets confirmed that the most important vibrational bands contributing to the clustering of NPS compounds were selected using NDFS feature selection algorithms.
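The two similarity measures named in this abstract can be sketched in a few lines. The example below is an illustrative sketch, not the paper's pipeline: it computes Pearson and Spearman correlations between toy intensity vectors and builds the pairwise affinity matrix that a hierarchical clustering routine could consume. The helper names and the toy "spectra" are hypothetical.

```python
from math import sqrt
from typing import Callable, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: strength of the linear relationship between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


def affinity_matrix(
    spectra: Sequence[Sequence[float]],
    measure: Callable[[Sequence[float], Sequence[float]], float],
) -> List[List[float]]:
    """Pairwise similarity matrix between all spectra under the given measure."""
    return [[measure(a, b) for b in spectra] for a in spectra]


# Toy "spectra": the second is a monotone but non-linear distortion of the first.
spectra = [[1.0, 2.0, 3.0, 4.0], [1.0, 4.0, 9.0, 16.0], [4.0, 3.0, 2.0, 1.0]]
print(affinity_matrix(spectra, pearson))
print(affinity_matrix(spectra, spearman))
```

Because Spearman depends only on the ordering of intensities, any two spectra with the same overall shape score as identical, which is consistent with the abstract's remark that Spearman tends to overestimate similarity based on the overall pattern of the full spectrum.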

