ranking loss
Recently Published Documents


TOTAL DOCUMENTS: 57 (FIVE YEARS 40)

H-INDEX: 7 (FIVE YEARS 4)

2022 ◽  
Vol 4 (1) ◽  
Author(s):  
Paul Prasse ◽  
Pascal Iversen ◽  
Matthias Lienhard ◽  
Kristina Thedinga ◽  
Chris Bauer ◽  
...  

ABSTRACT Computational drug sensitivity models have the potential to improve therapeutic outcomes by identifying targeted drug components that are likely to achieve the highest efficacy for a cancer cell line at hand at a therapeutic dose. State-of-the-art drug sensitivity models use regression techniques to predict the inhibitory concentration of a drug for a tumor cell line. This regression objective is not directly aligned with either of these principal goals of drug sensitivity models: We argue that drug sensitivity modeling should be seen as a ranking problem with an optimization criterion that quantifies a drug’s inhibitory capacity for the cancer cell line at hand relative to its toxicity for healthy cells. We derive an extension to the well-established drug sensitivity regression model PaccMann that employs a ranking loss and focuses on the ratio of inhibitory concentration and therapeutic dosage range. We find that the ranking extension significantly enhances the model’s capability to identify the most effective anticancer drugs for unseen tumor cell profiles based on in-vitro data.


Author(s):  
Xing Xu ◽  
Yifan Wang ◽  
Yixuan He ◽  
Yang Yang ◽  
Alan Hanjalic ◽  
...  

Image-sentence matching is a challenging task in the field of language and vision, which aims at measuring the similarities between images and sentence descriptions. Most existing methods independently map the global features of images and sentences into a common space to calculate the image-sentence similarity. However, the image-sentence similarity obtained by these methods may be coarse because (1) an intermediate common space is introduced to implicitly match the heterogeneous features of images and sentences at a global level, and (2) only the inter-modality relations of images and sentences are captured while the intra-modality relations are ignored. To overcome these limitations, we propose a novel Cross-Modal Hybrid Feature Fusion (CMHF) framework for directly learning the image-sentence similarity by fusing multimodal features with inter- and intra-modality relations incorporated. It can robustly capture the high-level interactions between visual regions in images and words in sentences, where flexible attention mechanisms are utilized to generate effective attention flows within and across the modalities of images and sentences. A structured objective with a ranking loss constraint is formed in CMHF to learn the image-sentence similarity based on the fused fine-grained features of different modalities, bypassing the use of an intermediate common space. Extensive experiments and comprehensive analysis performed on two widely used datasets (Microsoft COCO and Flickr30K) show the effectiveness of the hybrid feature fusion framework in CMHF, with which our proposed method achieves state-of-the-art matching performance.


Author(s):  
Xin Tan ◽  
Jiachen Xu ◽  
Zhou Ye ◽  
Jinkun Hao ◽  
Lizhuang Ma

Author(s):  
Chanjal C

Predicting the relevance between two given videos with respect to their visual content is a key component of content-based video recommendation and retrieval. Applications include video recommendation, video annotation, category or near-duplicate video retrieval, and video copy detection. Previous works estimate video relevance from the textual content of videos, which leads to poor performance. The proposed method is feature re-learning for video relevance prediction, and it focuses on visual content to predict the relevance between two videos. A given feature is projected into a new space by an affine transformation. Unlike previous works, which use a standard triplet ranking loss, this method optimizes the projection with a novel negative-enhanced triplet ranking loss. To generate more training data, a data augmentation strategy is proposed that works directly on video features. This multi-level augmentation strategy benefits feature re-learning and can be applied flexibly to frame-level or video-level features. The method also includes a loss function that considers the absolute similarity of positive pairs to supervise the feature re-learning process, and a new formula for video relevance computation.
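As a rough illustration of the ingredients named in this abstract, the sketch below applies an affine projection to three video features and evaluates a standard triplet ranking loss on cosine similarities. It is not the negative-enhanced loss of the paper, whose formula the abstract does not give. The function names, the margin value, and the use of cosine similarity are assumptions made for illustration.

```python
import math

def affine(x, W, b):
    """Project feature vector x into a new space: W x + b."""
    return [sum(w * xj for w, xj in zip(row, x)) + bi
            for row, bi in zip(W, b)]

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * c for a, c in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    return dot / (norm_u * norm_v)

def triplet_ranking_loss(anchor, positive, negative, W, b, margin=0.2):
    """Standard triplet ranking loss (hinge form) on projected features.

    The loss is zero once the anchor is more similar to the relevant
    video than to the irrelevant one by at least `margin` (assumed value).
    """
    a, p, n = (affine(x, W, b) for x in (anchor, positive, negative))
    return max(0.0, margin - cosine(a, p) + cosine(a, n))
```

With an identity projection, a triplet whose relevant video matches the anchor exactly incurs no loss, while swapping the relevant and irrelevant videos is penalized.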


2021 ◽  
pp. 1-12
Author(s):  
Wang Zhou ◽  
Yujun Yang ◽  
Yajun Du ◽  
Amin Ul Haq

Recent research indicates that pairwise learning-to-rank methods can achieve high performance in dealing with data sparsity and long-tail distributions in item recommendation, although they suffer from problems such as high computational complexity and insufficient samples, which may cause slow convergence and inaccuracy. To further improve computational efficiency and recommendation accuracy, this article proposes a novel deep neural network based recommender architecture referred to as PDLR, in which the item corpus is partitioned into two collections of positive and negative items respectively, and pairwise comparisons are performed between the positive and negative items to learn the preference degree for each user. With the powerful capability of neural networks, PDLR can capture rich interactions between each user and items as well as the intricate relations between items. As a result, PDLR can minimize the ranking loss and achieve significant improvement in ranking accuracy. Experimental results over four real-world datasets also demonstrate the superiority of PDLR over state-of-the-art recommender approaches in terms of Rec@N, Prec@N, AUC and NDCG@N.
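The pairwise comparison between positive and negative items can be sketched as follows. The abstract does not state which loss PDLR uses, so the logistic pairwise loss here is an assumption, as are the function name and the dictionary of per-item scores standing in for the network's output.

```python
import math

def pairwise_logistic_loss(scores, positives):
    """Average logistic loss over all (positive, negative) item pairs of one user.

    scores:    dict mapping item id -> predicted preference score
    positives: set of item ids the user interacted with; every other
               scored item is treated as a negative (an assumption)
    """
    pos = [s for item, s in scores.items() if item in positives]
    neg = [s for item, s in scores.items() if item not in positives]
    if not pos or not neg:
        return 0.0  # no pair to compare
    total = sum(math.log1p(math.exp(sn - sp)) for sp in pos for sn in neg)
    return total / (len(pos) * len(neg))
```

The loss shrinks as positive items are scored above negative ones, and equals log 2 when the two groups are scored identically.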


2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Hai He ◽  
Haibo Yang

Language and vision are the two most essential parts of human intelligence for interpreting the real world around us. How to make connections between language and vision is the key point in current research. Multimodality methods like visual semantic embedding, which unify images and corresponding texts into the same feature space, have been widely studied recently. Inspired by the recent development of text data augmentation and a simple but powerful technique called EDA (easy data augmentation), we expand the information in the given data using EDA to improve the performance of models. In this paper, we take advantage of the text data augmentation technique and word embedding initialization for multimodality retrieval. We utilize EDA for text data augmentation, apply word embedding initialization to a text encoder based on recurrent neural networks, and minimize the gap between the two spaces with a triplet ranking loss with hard negative mining. On two Flickr-based datasets, we achieve the same recall with only 60% of the training dataset as normal training with the full available data. Experimental results show the improvement of our proposed model: on all datasets in this paper (Flickr8k, Flickr30k, and MS-COCO), our model performs better on image annotation and image retrieval tasks. The experiments also demonstrate that text data augmentation is more suitable for smaller datasets, while word embedding initialization is suitable for larger ones.
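A triplet ranking loss with hard negative mining is commonly computed over a batch similarity matrix, keeping only the hardest negative for each matching pair. The sketch below assumes that form; the abstract does not give the exact formulation, and the function name, margin, and averaging over the batch are illustrative choices.

```python
def hard_negative_triplet_loss(sim, margin=0.2):
    """Triplet ranking loss using the hardest in-batch negative.

    sim[i][j] is the similarity between image i and caption j, and
    sim[i][i] is the score of the matching pair. Needs at least two pairs.
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Hardest wrong caption for image i, and hardest wrong image for caption i.
        hardest_caption = max(sim[i][j] for j in range(n) if j != i)
        hardest_image = max(sim[j][i] for j in range(n) if j != i)
        total += max(0.0, margin - pos + hardest_caption)
        total += max(0.0, margin - pos + hardest_image)
    return total / n
```

Only the most confusing non-matching caption and image contribute for each pair, so well-separated batches yield zero loss.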


2021 ◽  
Vol 71 ◽  
pp. 121-142
Author(s):  
Aleksandra Burashnikova ◽  
Yury Maximov ◽  
Marianne Clausel ◽  
Charlotte Laclau ◽  
Franck Iutzeler ◽  
...  

In this paper, we propose a theoretically supported sequential strategy for training a large-scale Recommender System (RS) over implicit feedback, mainly in the form of clicks. The proposed approach consists in minimizing pairwise ranking loss over blocks of consecutive items constituted by a sequence of non-clicked items followed by a clicked one for each user. We present two variants of this strategy where model parameters are updated using either the momentum method or a gradient-based approach. To prevent updating the parameters for an abnormally high number of clicks over some targeted items (mainly due to bots), we introduce an upper and a lower threshold on the number of updates for each user. These thresholds are estimated over the distribution of the number of blocks in the training set. They affect the decision of RS by shifting the distribution of items that are shown to the users. Furthermore, we provide a convergence analysis of both algorithms and demonstrate their practical efficiency over six large-scale collections with respect to various ranking measures and computational time.
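The block structure described in this abstract can be sketched as below: a user's interaction sequence is cut into blocks of consecutive non-clicked items closed by a clicked one, and a pairwise loss is evaluated per block. The function names, the logistic form of the loss, and the handling of edge cases (clicks with no preceding non-clicked items, trailing non-clicked items) are assumptions; the update thresholds and optimizer variants from the paper are not shown.

```python
import math

def split_into_blocks(interactions):
    """Cut an ordered list of (item, clicked) pairs into blocks.

    Each block is (non_clicked_items, clicked_item). Clicks with no
    preceding non-clicked items and trailing non-clicked items are
    skipped here (an assumption of this sketch).
    """
    blocks, negatives = [], []
    for item, clicked in interactions:
        if clicked:
            if negatives:
                blocks.append((negatives, item))
            negatives = []
        else:
            negatives.append(item)
    return blocks

def block_loss(block, score):
    """Average pairwise logistic loss of the clicked item against the
    non-clicked items of one block; `score` maps item -> model score."""
    negatives, clicked = block
    return sum(math.log1p(math.exp(score[n] - score[clicked]))
               for n in negatives) / len(negatives)
```

A model update would then be triggered once per block, which is where per-user limits on the number of updates could be enforced.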

