Learning to balance the coherence and diversity of response generation in generation-based chatbots

Generating responses that are both coherent and diverse is a challenging task for generation-based chatbots, and improving both properties at the same time within a single response generation model is harder still. In this article, we propose a method that improves the coherence and diversity of dialog generation by changing the model to use gamma sampling and adding an attention mechanism to the knowledge-guided conditional variational autoencoder. The experimental results demonstrate that the proposed method significantly improves both the coherence and the diversity of the knowledge-guided conditional variational autoencoder for response generation in generation-based chatbots.


A Global-Local Blur Disentangling Network for Dynamic Scene Deblurring

Applied Sciences ◽ 10.3390/app11052174 ◽ 2021 ◽ Vol 11 (5) ◽ pp. 2174

Author(s): Xiaoguang Li ◽ Feifan Yang ◽ Jianglu Huang ◽ Li Zhuo

Keyword(s): Local Features ◽ Attention Mechanism ◽ Experimental Results ◽ Dynamic Scene ◽ Feature Maps ◽ Training Scheme ◽ Real Scene ◽ Global And Local

Images captured in a real scene usually suffer from complex non-uniform degradation, which includes both global and local blurs. It is difficult to handle such complex blur variations with a single unified processing model. We propose a global-local blur disentangling network, which extracts global and local blur features via two branches. A phased training scheme is designed to disentangle the global and local blur features; that is, each branch is trained on its own task-specific dataset. A branch attention mechanism is introduced to dynamically fuse global and local features. Complex blurry images are used to train the attention module and the reconstruction module. The visualized feature maps of the different branches indicate that our dual-branch network can decouple the global and local blur features efficiently. Experimental results show that the proposed dual-branch blur disentangling network improves both the subjective and objective deblurring results for real captured images.
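
One way to picture the dynamic fusion step is a softmax-weighted sum of the two branch outputs. The sketch below is a minimal illustration under assumptions, not the authors' network: the names `softmax` and `fuse_branches`, the scalar branch scores, and the flat feature vectors are all hypothetical, and the abstract does not say how the attention weights are actually computed.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_branches(global_feat, local_feat, global_score, local_score):
    """Blend two equally sized feature vectors with attention weights.

    The scores stand in for whatever the attention module would predict
    (an assumption made for this sketch).
    """
    w_global, w_local = softmax([global_score, local_score])
    return [w_global * g + w_local * l for g, l in zip(global_feat, local_feat)]

# Equal scores give equal weights, so the result is the element-wise mean.
fused = fuse_branches([1.0, 2.0], [3.0, 4.0], global_score=0.0, local_score=0.0)
print(fused)  # [2.0, 3.0]
```

Raising one score relative to the other shifts the fused vector toward that branch, which is the sense in which the fusion is dynamic in this toy version.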


Differentiated Attentive Representation Learning for Sentence Classification

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/644 ◽ 2018 ◽ Cited By ~ 5

Author(s): Qianrong Zhou ◽ Xiaojie Wang ◽ Xuan Dong

Keyword(s): Representation Learning ◽ Learning Model ◽ Attention Mechanism ◽ Experimental Results ◽ Sentence Classification ◽ Synthetic Datasets

Attention-based models have been shown to be effective in learning representations for sentence classification. They are typically equipped with a multi-hop attention mechanism. However, existing multi-hop models still tend to pay too much attention to the most frequently noticed words, which might not be important for classifying the current sentence. There is also no explicit, effective way to shift the attention away from a wrong part of the sentence. In this paper, we alleviate this problem by proposing a differentiated attentive learning model. It is composed of two branches of attention subnets and an example discriminator. An explicit signal carrying the loss information of the first attention subnet is passed on to the second one to drive them to learn different attentive preferences. The example discriminator then selects the suitable attention subnet for sentence classification. Experimental results on real and synthetic datasets demonstrate the effectiveness of our model.


Person Reidentification Model Based on Multiattention Modules and Multiscale Residuals

Complexity ◽ 10.1155/2021/6673461 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-10

Author(s): Yongyi Li ◽ Shiqi Wang ◽ Shuang Dong ◽ Xueling Lv ◽ Changzhi Lv ◽ ...

Keyword(s): Local Features ◽ Attention Mechanism ◽ Experimental Results ◽ Original Network ◽ Fine Grained ◽ Backbone Network ◽ Model Based ◽ Local Branch ◽ Feature Expression ◽ Global And Local

Person reidentification based on attention mechanisms has attracted considerable research interest. Although an attention module can improve the representation ability and reidentification accuracy of a Re-ID model to a certain extent, its benefit depends on how the attention module is coupled with the original network. In this paper, a person reidentification model that combines multiple attentions and multiscale residuals is proposed. The model introduces a combined attention fusion module and a multiscale residual fusion module into the backbone network ResNet-50 to enhance the feature flow between residual blocks and better fuse multiscale features. Furthermore, a global branch and a local branch are designed and applied to enhance the channel aggregation and position perception ability of the network by utilizing the dual ensemble attention module, while fine-grained feature expression is obtained through multiproportion blocking and reorganization. Thus, the global and local features are enhanced. The experimental results on the Market-1501 and DukeMTMC-reID datasets show that the metrics of the presented model, especially Rank-1 accuracy, reach 96.20% and 89.59%, respectively, which can be considered progress in Re-ID.
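
For readers unfamiliar with the Rank-1 metric quoted above, the sketch below shows the basic computation: a query counts as a hit when its nearest gallery feature carries the same identity. It is a simplified illustration with toy data, not the evaluation code behind the reported numbers; the function name `rank1_accuracy` and the Euclidean distance are assumptions, and the camera-based filtering used in standard Re-ID protocols is omitted.

```python
import math

def rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids):
    """Fraction of queries whose nearest gallery item shares their identity."""
    hits = 0
    for q_feat, q_id in zip(query_feats, query_ids):
        # Euclidean distance from this query to every gallery feature.
        distances = [math.dist(q_feat, g_feat) for g_feat in gallery_feats]
        nearest = distances.index(min(distances))
        hits += gallery_ids[nearest] == q_id
    return hits / len(query_ids)

# Toy 2-D features; real systems use high-dimensional embeddings.
gallery_feats = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
gallery_ids = ["a", "b", "c"]
query_feats = [[0.1, 0.0], [4.0, 4.5], [0.9, 1.2], [0.4, 0.4]]
query_ids = ["a", "c", "b", "b"]

# The last query is closest to identity "a" but labeled "b", so it misses.
print(rank1_accuracy(query_feats, query_ids, gallery_feats, gallery_ids))  # 0.75
```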


A Novel Vehicle Destination Prediction Model With Expandable Features Using Attention Mechanism and Variational Autoencoder

IEEE Transactions on Intelligent Transportation Systems ◽ 10.1109/tits.2021.3137168 ◽ 2022 ◽ pp. 1-10

Author(s): Xiangyang Wu ◽ Weite Zhu ◽ Zhen Liu ◽ Zhen Zhang

Keyword(s): Prediction Model ◽ Attention Mechanism ◽ Variational Autoencoder


A short text conversation generation model combining BERT and context attention mechanism

International Journal of Computational Science and Engineering ◽ 10.1504/ijcse.2020.10032809 ◽ 2020 ◽ Vol 23 (2) ◽ pp. 136

Author(s): Jie Cao ◽ Huan Zhao ◽ Jian Lu

Keyword(s): Attention Mechanism ◽ Generation Model ◽ Short Text ◽ Model Combining


Earlier Attention? Aspect-Aware LSTM for Aspect-Based Sentiment Analysis

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2019/738 ◽ 2019 ◽ Cited By ~ 3

Author(s): Bowen Xing ◽ Lejian Liao ◽ Dandan Song ◽ Jingang Wang ◽ Fuzheng Zhang ◽ ...

Keyword(s): Sentiment Analysis ◽ Irrelevant Information ◽ Attention Mechanism ◽ Experimental Results ◽ Context Modeling ◽ Fine Grained ◽ Related Information ◽ Modeling Process ◽ Novel Variant

Aspect-based sentiment analysis (ABSA) aims to predict fine-grained sentiments of comments with respect to given aspect terms or categories. Previous ABSA methods have recognized and verified the importance of the aspect. Most existing LSTM-based models take the aspect into account via the attention mechanism, where the attention weights are calculated after the context is modeled in the form of contextual vectors. However, in the context modeling process, classic LSTM cells may already have discarded aspect-related information and retained aspect-irrelevant information, so there is room to generate more effective context representations. This paper proposes a novel variant of LSTM, termed aspect-aware LSTM (AA-LSTM), which incorporates aspect information into LSTM cells in the context modeling stage before the attention mechanism. Therefore, our AA-LSTM can dynamically produce aspect-aware contextual representations. We experiment with several representative LSTM-based models by replacing the classic LSTM cells with the AA-LSTM cells. Experimental results on SemEval-2014 datasets demonstrate the effectiveness of AA-LSTM.
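
To make the idea of injecting aspect information into a cell concrete, the sketch below computes a single sigmoid gate that sees the aspect vector alongside the usual input and previous hidden state. This is a hypothetical simplification, not the AA-LSTM equations, which the abstract does not give; the function `aspect_aware_gate` and the weight vectors `w_x`, `w_h`, and `w_a` are assumed names with toy values.

```python
import math

def dot(u, v):
    """Inner product of two equally sized lists."""
    return sum(a * b for a, b in zip(u, v))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def aspect_aware_gate(x, h_prev, aspect, w_x, w_h, w_a, bias=0.0):
    """Scalar gate in (0, 1) conditioned on input, hidden state, and aspect.

    A classic LSTM gate uses only x and h_prev; the extra w_a term is the
    assumed way of letting the aspect influence what is kept.
    """
    return sigmoid(dot(w_x, x) + dot(w_h, h_prev) + dot(w_a, aspect) + bias)

x = [0.5, -0.2]        # current token representation (toy values)
h_prev = [0.1, 0.3]    # previous hidden state (toy values)
w_x, w_h, w_a = [1.0, 1.0], [0.5, 0.5], [2.0, 0.0]

# With these toy weights the gate opens wider for the first aspect vector.
gate_first = aspect_aware_gate(x, h_prev, [1.0, 0.0], w_x, w_h, w_a)
gate_second = aspect_aware_gate(x, h_prev, [0.0, 1.0], w_x, w_h, w_a)
print(gate_first > gate_second)  # True
```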


Commonsense Knowledge Aware Conversation Generation with Graph Attention

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/643 ◽ 2018 ◽ Cited By ~ 35

Author(s): Hao Zhou ◽ Tom Young ◽ Minlie Huang ◽ Haizhou Zhao ◽ Jingfang Xu ◽ ...

Keyword(s): Language Processing ◽ Large Scale ◽ Semantic Information ◽ Attention Mechanism ◽ Generation Model ◽ Dynamic Graph ◽ Commonsense Knowledge ◽ Word Generation ◽ Proposed Model ◽ Knowledge Graphs

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.


Weakly Supervised Disentanglement by Pairwise Similarities

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i04.5754 ◽ 2020 ◽ Vol 34 (04) ◽ pp. 3495-3502 ◽ Cited By ~ 1

Author(s): Junxiang Chen ◽ Kayhan Batmanghelich

Keyword(s): Real World ◽ Latent Variables ◽ Generative Models ◽ Experimental Results ◽ New Method ◽ Weak Supervision ◽ Real World Problem ◽ Variational Autoencoder ◽ Weakly Supervised

Recently, research on unsupervised disentanglement learning with deep generative models has gained substantial popularity. However, without introducing supervision, there is no guarantee that the factors of interest can be successfully recovered (Locatello et al. 2018). Motivated by a real-world problem, we propose a setting where the user introduces weak supervision by providing similarities between instances based on a factor to be disentangled. The similarity is provided as either a binary (yes/no) or real-valued label describing whether a pair of instances are similar or not. We propose a new method for weakly supervised disentanglement of latent variables within the framework of the Variational Autoencoder. Experimental results demonstrate that utilizing weak supervision improves the performance of the disentanglement method substantially.
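
As an illustration of how a binary similarity label could supervise one latent dimension, the sketch below turns the distance between two latent values into a similarity probability and scores it with binary cross-entropy. This is an assumed toy formulation, not the likelihood used in the paper; the functions `similarity_probability` and `pairwise_loss` and the exponential distance mapping are hypothetical.

```python
import math

def similarity_probability(z_a, z_b):
    """Map the distance between two latent values to a probability in (0, 1]."""
    return math.exp(-abs(z_a - z_b))

def pairwise_loss(z_a, z_b, similar, eps=1e-9):
    """Binary cross-entropy between predicted similarity and the yes/no label."""
    # Clamp so that log() never receives 0.
    p = min(max(similarity_probability(z_a, z_b), eps), 1.0 - eps)
    return -math.log(p) if similar else -math.log(1.0 - p)

# A pair labeled "similar" is penalized less when its latent values are close.
close_pair = pairwise_loss(0.10, 0.15, similar=True)
far_pair = pairwise_loss(0.10, 2.00, similar=True)
print(close_pair < far_pair)  # True
```

Minimizing such a term over labeled pairs would pull similar instances together along the chosen latent dimension and push dissimilar ones apart, which is the intuition behind using pairwise labels as weak supervision.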


Non-I.I.D. Multi-Instance Learning for Predicting Instance and Bag Labels with Variational Auto-Encoder

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/465 ◽ 2021

Author(s): Weijia Zhang

Keyword(s): Medical Imaging ◽ Supervised Learning ◽ Real World ◽ State Of The Art ◽ Experimental Results ◽ Weakly Supervised Learning ◽ Variational Autoencoder ◽ Label Prediction ◽ Weakly Supervised ◽ Better Than

Multi-instance learning is a type of weakly supervised learning. It deals with tasks where the data is a set of bags and each bag is a set of instances. Only the bag labels are observed, whereas the labels for the instances are unknown. An important advantage of multi-instance learning is that by representing objects as a bag of instances, it is able to preserve the inherent dependencies among parts of the objects. Unfortunately, most existing algorithms assume all instances to be independent and identically distributed, an assumption that does not hold in real-world scenarios since the instances within a bag are rarely independent. In this work, we propose the Multi-Instance Variational Autoencoder (MIVAE) algorithm, which explicitly models the dependencies among the instances for predicting both bag labels and instance labels. Experimental results on several multi-instance benchmarks and end-to-end medical imaging datasets demonstrate that MIVAE performs better than state-of-the-art algorithms for both instance label and bag label prediction tasks.
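
The bag-and-instance data layout described above can be sketched as follows. The `Bag` class and the helper `bag_label_from_instances` are hypothetical names, and the rule that a bag is positive when any instance is positive is the classical multi-instance assumption, used here only for illustration; the abstract does not state which labeling rule MIVAE relies on.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Bag:
    instances: List[List[float]]  # feature vectors, one per instance
    label: int                    # observed bag label (0 or 1)

def bag_label_from_instances(instance_labels):
    """Classical MIL assumption: a bag is positive if any instance is positive."""
    return int(any(instance_labels))

# Hidden instance labels, used here only to derive the observed bag labels.
hidden = [[0, 0, 1], [0, 0]]
features = [[[0.2], [0.1], [0.9]], [[0.3], [0.2]]]

# A learner would receive only the Bag objects, never the hidden labels.
bags = [Bag(f, bag_label_from_instances(h)) for f, h in zip(features, hidden)]
print([b.label for b in bags])  # [1, 0]
```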


Learning Dynamic Factors to Improve the Accuracy of Bus Arrival Time Prediction via a Recurrent Neural Network

Future Internet ◽ 10.3390/fi11120247 ◽ 2019 ◽ Vol 11 (12) ◽ pp. 247

Author(s): Xin Zhou ◽ Peixin Dong ◽ Jianping Xing ◽ Peijia Sun

Keyword(s): Neural Network ◽ Recurrent Neural Network ◽ Public Transportation ◽ Prediction Accuracy ◽ Arrival Time ◽ Attention Mechanism ◽ Experimental Results ◽ Support Vector ◽ Data Set ◽ Dynamic Factors

Accurate prediction of bus arrival times is a challenging problem in the public transportation field. Previous studies have shown that more heterogeneous measurements improve prediction accuracy. So what other factors should be added to the prediction model? Traditional prediction methods mainly use the arrival time and the distance between stations, but do not make full use of dynamic factors such as passenger number, dwell time, and bus driving efficiency. We propose a novel approach that takes full advantage of dynamic factors. Our approach is based on a Recurrent Neural Network (RNN). The experimental results indicate that a variety of prediction algorithms (such as Support Vector Machine, Kalman filter, Multilayer Perceptron, and RNN) perform significantly better after using dynamic factors. Further, we introduce an RNN with an attention mechanism to adaptively select the most relevant input factors. Experiments demonstrate that, when the input factors are heterogeneous, the prediction accuracy of the RNN with an attention mechanism is better than that of the RNN without one. The experimental results show the superior performance of our approach on the data set provided by Jinan Public Transportation Corporation.
