Multi-interest Diversification for End-to-end Sequential Recommendation

2022 ◽  
Vol 40 (1) ◽  
pp. 1-30
Author(s):  
Wanyu Chen ◽  
Pengjie Ren ◽  
Fei Cai ◽  
Fei Sun ◽  
Maarten De Rijke

Sequential recommenders capture dynamic aspects of users’ interests by modeling sequential behavior. Previous studies on sequential recommendation mostly aim to identify users’ main recent interests to optimize recommendation accuracy; they often neglect the fact that users display multiple interests over extended periods of time, which could be used to improve the diversity of lists of recommended items. Existing work on diversified recommendation typically assumes that users’ preferences are static and depends on post-processing the candidate list of recommended items. However, those conditions do not hold for sequential recommendation. We tackle sequential recommendation as a list generation process and propose a unified approach that takes accuracy as well as diversity into consideration, called multi-interest, diversified, sequential recommendation. In particular, an implicit interest mining module is first used to mine users’ multiple interests, which are reflected in users’ sequential behavior. Then an interest-aware, diversity-promoting decoder is designed to produce recommendations that cover those interests. For training, we introduce an interest-aware, diversity-promoting loss function that supervises the model to learn to recommend accurate as well as diversified items. We conduct comprehensive experiments on four public datasets, and the results show that our proposal outperforms state-of-the-art methods regarding diversity while producing comparable or better accuracy for sequential recommendation.

Information ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 32
Author(s):  
Gang Sun ◽  
Hancheng Yu ◽  
Xiangtao Jiang ◽  
Mingkui Feng

Edge detection is one of the fundamental computer vision tasks. Recent methods for edge detection based on a convolutional neural network (CNN) typically employ the weighted cross-entropy loss. Their predicted edges are thick and need post-processing before the optimal dataset scale (ODS) F-measure can be calculated for evaluation. To achieve end-to-end training, we propose a non-maximum suppression (NMS) layer to obtain sharp boundaries without the need for post-processing. Because the ODS F-measure can be calculated directly from these sharp boundaries, we propose an ODS F-measure loss function to train the network. In addition, we propose an adaptive multi-level feature pyramid network (AFPN) to better fuse different levels of features. Furthermore, to enrich the multi-scale features learned by AFPN, we introduce a pyramid context module (PCM) that uses dilated convolution to extract multi-scale features. Experimental results indicate that the proposed AFPN achieves state-of-the-art performance on the BSDS500 dataset (ODS F-score of 0.837) and the NYUDv2 dataset (ODS F-score of 0.780).
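As a minimal sketch of the evaluation quantity mentioned above, the F-measure is the harmonic mean of precision and recall, and the ODS variant picks the single threshold that scores best over the whole dataset. The function names and the input layout (per-threshold lists of per-image match counts) are assumptions made for illustration; the paper's NMS layer and its trainable loss are not reproduced here.

```python
def f_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall; 0.0 when nothing matched."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def ods_f_measure(counts_by_threshold):
    """Best dataset-level F-measure over a set of fixed thresholds.

    counts_by_threshold maps a threshold to a list of per-image
    (true_pos, false_pos, false_neg) tuples (hypothetical layout).
    Counts are summed over all images before the F-measure is taken.
    """
    best = 0.0
    for per_image in counts_by_threshold.values():
        tp = sum(c[0] for c in per_image)
        fp = sum(c[1] for c in per_image)
        fn = sum(c[2] for c in per_image)
        best = max(best, f_measure(tp, fp, fn))
    return best
```

The key point is that one threshold is shared by every image, which is what distinguishes the dataset-scale score from a per-image optimum.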


Electronics ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 275
Author(s):  
Ziyun Jiao ◽  
Fuji Ren

Generative adversarial networks (GANs) were first proposed in 2014 and have been widely used in computer vision, for example for image generation. However, GANs for text generation have made slow progress. One reason is that the discriminator’s guidance for the generator is too weak: the generator receives only a “true or false” probability in return. Compared with the current loss function, the Wasserstein distance can provide more information to the generator, but RelGAN does not work well with the Wasserstein distance in experiments. In this paper, we propose an improved neural network based on RelGAN and Wasserstein loss, named WRGAN. Unlike RelGAN, we modify the discriminator network structure to use 1D convolutions with multiple different kernel sizes. Correspondingly, we also replace the network’s loss function with a gradient-penalty Wasserstein loss. Our experiments on multiple public datasets show that WRGAN outperforms most existing state-of-the-art methods and improves Bilingual Evaluation Understudy (BLEU) scores.
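For reference, the commonly used gradient-penalty form of the Wasserstein critic loss (the WGAN-GP formulation) is written out below. Whether WRGAN uses exactly this form, and which penalty weight it chooses, is not stated in the abstract, so the notation is an assumption: $P_r$ is the real data distribution, $P_g$ the generator distribution, $\hat{x}$ a random interpolation between a real and a generated sample, and $\lambda$ the penalty weight.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[ D(\tilde{x}) \right]
    - \mathbb{E}_{x \sim P_r}\left[ D(x) \right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \left[ \left( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \right)^2 \right]
```

The first two terms estimate the Wasserstein distance between generated and real samples; the last term pushes the critic's gradient norm toward 1 so that the estimate stays meaningful.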


2021 ◽  
Vol 16 (1) ◽  
pp. 1-23
Author(s):  
Min-Ling Zhang ◽  
Jun-Peng Fang ◽  
Yi-Bo Wang

In multi-label classification, the task is to induce predictive models that can assign a set of relevant labels to an unseen instance. The strategy of label-specific features has been widely employed in learning from multi-label examples, where the classification model for predicting the relevancy of each class label is induced based on its tailored features rather than the original features. Existing approaches work by generating a group of tailored features for each class label independently, so label correlations are not fully considered in the label-specific feature generation process. In this article, we extend the existing strategy by proposing a simple yet effective approach based on BiLabel-specific features. Specifically, a group of tailored features is generated for a pair of class labels with heuristic prototype selection and embedding. Thereafter, predictions of classifiers induced by BiLabel-specific features are ensembled to determine the relevancy of each class label for an unseen instance. To thoroughly evaluate the BiLabel-specific features strategy, extensive experiments are conducted over a total of 35 benchmark datasets. Comparative studies against state-of-the-art label-specific features techniques clearly validate the superiority of utilizing BiLabel-specific features to yield stronger generalization performance for multi-label classification.


Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4233
Author(s):  
Bogdan Mocanu ◽  
Ruxandra Tapu ◽  
Titus Zaharia

Emotion is a form of high-level paralinguistic information that is intrinsically conveyed by human speech. Automatic speech emotion recognition is an essential challenge for various applications, including mental disease diagnosis, audio surveillance, human behavior understanding, e-learning, and human–machine/robot interaction. In this paper, we introduce a novel speech emotion recognition method based on the Squeeze and Excitation ResNet (SE-ResNet) model and fed with spectrogram inputs. To overcome the limitations of state-of-the-art techniques, which fail to provide a robust feature representation at the utterance level, the CNN architecture is extended with a trainable discriminative GhostVLAD clustering layer that aggregates the audio features into a compact, single-utterance vector representation. In addition, an end-to-end neural embedding approach is introduced, based on an emotionally constrained triplet loss function. The loss function integrates the relations between the various emotional patterns and thus improves the latent space data representation. The proposed methodology achieves 83.35% and 64.92% global accuracy rates on the publicly available RAVDESS and CREMA-D datasets, respectively. Compared with the results provided by human observers, the gains in global accuracy exceed 24%. Finally, the objective comparative evaluation with state-of-the-art techniques demonstrates accuracy gains of more than 3%.
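To make the loss family concrete, here is a minimal sketch of the plain triplet loss on embedding vectors. The abstract does not specify how the "emotionally constrained" variant modifies it, so this shows only the standard form; the function names, the Euclidean distance, and the margin value are assumptions.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The loss is zero once the negative is at least `margin` farther
    from the anchor than the positive is.
    """
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)
```

In an emotion recognition setting, the anchor and positive would be embeddings of utterances with the same emotion label and the negative an utterance with a different one.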


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 808
Author(s):  
Mattia Pesenti ◽  
Alberto Antonietti ◽  
Marta Gandolla ◽  
Alessandra Pedrocchi

While research interest in exoskeletons has been rising in recent decades, the lack of standards for their rigorous evaluation potentially limits their adoption in the industrial field. In this context, exoskeletons for worker support aim to reduce the physical effort required of humans, with dramatic social and economic impact. Indeed, exoskeletons can reduce the occurrence and severity of work-related musculoskeletal disorders, which often cause absence from work and an eventual productivity loss. This urgent and multifaceted issue is starting to be acknowledged by researchers. This article provides a systematic review of the state of the art for functional performance evaluation of low-back exoskeletons for industrial workers. We report the state-of-the-art evaluation criteria and metrics used for this purpose, highlighting the lack of a standard for this practice. Very few studies carried out a rigorous evaluation of the assistance provided by the device. To address this gap as well, the article ends with a proposed framework for the functional validation of low-back exoskeletons for industry, with the aim of paving the way for the definition of rigorous industrial standards.


2021 ◽  
Vol 11 (15) ◽  
pp. 7046
Author(s):  
Jorge Francisco Ciprián-Sánchez ◽  
Gilberto Ochoa-Ruiz ◽  
Lucile Rossi ◽  
Frédéric Morandini

Wildfires stand as one of the most relevant natural disasters worldwide, all the more so due to the effect of climate change and its impact on various societal and environmental levels. A significant amount of research has been done to address this issue, deploying a wide variety of technologies and following a multi-disciplinary approach. Notably, computer vision has played a fundamental role. It can be used to extract and combine information from several imaging modalities for fire detection, characterization, and wildfire spread forecasting. In recent years, work on Deep Learning (DL)-based fire segmentation has shown very promising results. However, it is currently unclear whether the architecture of a model, its loss function, or the image type employed (visible, infrared, or fused) has the most impact on the fire segmentation results. In the present work, we evaluate different combinations of state-of-the-art (SOTA) DL architectures, loss functions, and types of images to identify the parameters most relevant to improving the segmentation results. We benchmark them to identify the top-performing ones and compare them to traditional fire segmentation techniques. Finally, we evaluate whether adding attention modules to the best-performing architecture can further improve the segmentation results. To the best of our knowledge, this is the first work that evaluates the impact of the architecture, loss function, and image type on the performance of DL-based wildfire segmentation models.


Author(s):  
Di Wu ◽  
Xiao-Yuan Jing ◽  
Haowen Chen ◽  
Xiaohui Kong ◽  
Jifeng Xuan

An Application Programming Interface (API) tutorial is an important API learning resource. To help developers learn APIs, an API tutorial is often split into a number of consecutive units that describe the same topic (i.e., tutorial fragments). We regard a tutorial fragment explaining an API as a relevant fragment of the API. Automatically recommending relevant tutorial fragments can help developers learn how to use an API. However, existing approaches often recommend relevant fragments in a supervised or unsupervised manner, which requires much manual annotation effort or yields inaccurate recommendations. Furthermore, these approaches only allow developers to input exact API names. In practice, developers often do not know which APIs to use, so they are more likely to describe API-related questions in natural language. In this paper, we propose a novel approach, called Tutorial Fragment Recommendation (TuFraRec), to effectively recommend relevant tutorial fragments for API-related natural language questions without much manual annotation effort. For an API tutorial, we split it into fragments and extract APIs from each fragment to build API-fragment pairs. Given a question, TuFraRec first generates several clarification APIs that are related to the question. We use the clarification APIs and the API-fragment pairs to construct candidate API-fragment pairs. Then, we design a semi-supervised metric learning (SML)-based model to find relevant API-fragment pairs from the candidate list; the model works well with a few labeled API-fragment pairs and a large number of unlabeled API-fragment pairs. In this way, the manual effort for labeling the relevance of API-fragment pairs can be reduced. Finally, we sort and recommend relevant API-fragment pairs based on the recommendation strategy. We evaluate TuFraRec on 200 API-related natural language questions and two public tutorial datasets (Java and Android). The results demonstrate that, on average, TuFraRec improves NDCG@5 by 0.06 and 0.09 and improves Mean Reciprocal Rank (MRR) by 0.07 and 0.09 on the two tutorial datasets compared with the state-of-the-art approach.
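The two ranking metrics reported above can be sketched as follows. This is a generic illustration, not the paper's evaluation script: the function names are hypothetical, each input list holds relevance scores in ranked order, and the ideal ordering for NDCG is taken from the same list, which assumes every relevant item appears somewhere in it.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked relevance scores."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG@k divided by the DCG@k of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


def mean_reciprocal_rank(ranked_lists):
    """Average of 1/rank of the first relevant item in each ranked list."""
    if not ranked_lists:
        return 0.0
    total = 0.0
    for relevances in ranked_lists:
        for rank, rel in enumerate(relevances, start=1):
            if rel > 0:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_lists)
```

Both metrics lie in [0, 1], so an absolute gain of 0.06 to 0.09 is a shift on that scale rather than a percentage improvement.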


2011 ◽  
Author(s):  
David Fornaro

Finite Element Analysis (FEA) is a mature technology that has been in use for several decades as a tool to optimize structures for a wide variety of applications. Its application to composite structures is not new; however, the technology for modeling and analyzing the behavior of composite structures continues to evolve on several fronts. This paper provides a review of the current state of the art with regard to composites FEA, with a particular emphasis on applications to yacht structures. Topics covered are divided into three categories: pre-processing, post-processing, and non-linear solutions. Pre-processing topics include meshing, ply properties, laminate definitions, element orientations, global ply tracking, and load case development. Post-processing topics include principal stresses, failure indices, and strength ratios. Non-linear solution topics include progressive ply failure. Examples are included to highlight the application of advanced finite element analysis methodologies to the optimization of composite yacht structures.


2020 ◽  
Vol 17 (3) ◽  
pp. 849-865
Author(s):  
Zhongqin Bi ◽  
Shuming Dou ◽  
Zhe Liu ◽  
Yongbin Li

Neural network methods have been trained to satisfactorily learn user/product representations from textual reviews. A representation can be considered as a multiaspect attention weight vector. However, in several existing methods, it is assumed that the user representation remains unchanged even when the user interacts with products having diverse characteristics, which leads to inaccurate recommendations. To overcome this limitation, this paper proposes a novel model to capture the varying attention of a user for different products by using a multilayer attention framework. First, two individual hierarchical attention networks are used to encode the users and products to learn the user preferences and product characteristics from review texts. Then, we design an attention network to reflect the adaptive change in the user preferences for each aspect of the targeted product in terms of the rating and review. The results of experiments performed on three public datasets demonstrate that the proposed model notably outperforms the other state-of-the-art baselines, thereby validating the effectiveness of the proposed approach.


Author(s):  
Andrew Cropper ◽  
Sebastijan Dumančic

A major challenge in inductive logic programming (ILP) is learning large programs. We argue that a key limitation of existing systems is that they use entailment to guide the hypothesis search. This approach is limited because entailment is a binary decision: a hypothesis either entails an example or does not, and there is no intermediate position. To address this limitation, we go beyond entailment and use 'example-dependent' loss functions to guide the search, where a hypothesis can partially cover an example. We implement our idea in Brute, a new ILP system which uses best-first search, guided by an example-dependent loss function, to incrementally build programs. Our experiments on three diverse program synthesis domains (robot planning, string transformations, and ASCII art) show that Brute can substantially outperform existing ILP systems, both in terms of predictive accuracies and learning times, and can learn programs 20 times larger than state-of-the-art systems.
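The idea of loss-guided best-first search can be illustrated with a toy string-transformation example. This is not Brute and does not learn logic programs: the primitives, the character-mismatch loss, and all names below are hypothetical, chosen only to show how a graded loss, rather than a yes/no check, can order the search frontier.

```python
import heapq
from itertools import count

# Hypothetical primitive operations that a program may chain together.
PRIMITIVES = {
    "upper": str.upper,
    "strip": str.strip,
    "drop_first": lambda s: s[1:],
    "reverse": lambda s: s[::-1],
}


def run(program, text):
    """Apply each named primitive in order."""
    for name in program:
        text = PRIMITIVES[name](text)
    return text


def example_loss(output, target):
    """Graded loss: mismatched positions plus the length difference."""
    mismatches = sum(a != b for a, b in zip(output, target))
    return mismatches + abs(len(output) - len(target))


def total_loss(program, examples):
    return sum(example_loss(run(program, x), y) for x, y in examples)


def best_first_search(examples, max_len=4):
    """Always expand the lowest-loss program; stop when the loss is zero."""
    tie = count()  # tie-breaker so the heap never compares programs
    frontier = [(total_loss((), examples), next(tie), ())]
    seen = {()}
    while frontier:
        loss, _, program = heapq.heappop(frontier)
        if loss == 0:
            return program
        if len(program) >= max_len:
            continue
        for name in PRIMITIVES:
            child = program + (name,)
            if child not in seen:
                seen.add(child)
                heapq.heappush(
                    frontier, (total_loss(child, examples), next(tie), child))
    return None


examples = [(" ab ", "AB"), (" cd", "CD")]
program = best_first_search(examples)
```

A binary test would score every incomplete program identically, whereas the graded loss ranks a program that strips whitespace above one that does nothing, so the search extends the more promising partial program first.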

