Get The Point of My Utterance! Learning Towards Effective Responses with Multi-Head Attention Mechanism

Author(s):  
Chongyang Tao ◽  
Shen Gao ◽  
Mingyue Shang ◽  
Wei Wu ◽  
Dongyan Zhao ◽  
...  

The attention mechanism has become a popular and widely used component in sequence-to-sequence models. However, the neural generative dialogue systems of previous research always generate universal responses, and the attention distribution learned by the model always attends to the same semantic aspect. To solve this problem, we propose a novel Multi-Head Attention Mechanism (MHAM) for generative dialog systems, which aims at capturing multiple semantic aspects of the user utterance. Further, a regularizer is formulated to force different attention heads to concentrate on certain aspects. The proposed mechanism leads to more informative, diverse, and relevant generated responses. Experimental results show that our proposed model outperforms several strong baselines.
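The abstract does not give the form of the regularizer, so the following is only a minimal sketch of one common way to discourage attention heads from overlapping: an orthogonality-style penalty, the squared Frobenius norm of A·Aᵀ − I, where A holds one attention distribution per head. The penalty choice and the function names `softmax` and `head_overlap_penalty` are assumptions made for illustration, not the authors' formulation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def head_overlap_penalty(attn):
    """Squared Frobenius norm of (A A^T - I) for a heads x tokens matrix A.

    Assumed penalty, not taken from the paper. Off-diagonal entries of
    A A^T grow when two heads attend to the same tokens, so minimizing
    this value pushes heads toward different parts of the utterance.
    The diagonal terms push each head toward a peaked distribution.
    """
    n_heads = len(attn)
    penalty = 0.0
    for i in range(n_heads):
        for j in range(n_heads):
            dot = sum(a * b for a, b in zip(attn[i], attn[j]))
            target = 1.0 if i == j else 0.0
            penalty += (dot - target) ** 2
    return penalty


# Toy example: two heads over a three-token utterance.
same_focus = [softmax([2.0, 0.0, 0.0]), softmax([2.0, 0.0, 0.0])]
different_focus = [softmax([2.0, 0.0, 0.0]), softmax([0.0, 0.0, 2.0])]
```

In this toy setting, `head_overlap_penalty(same_focus)` is larger than `head_overlap_penalty(different_focus)`, which is the behavior a diversity regularizer of this kind is meant to reward.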

2020 ◽  
Vol 21 (S13) ◽  
Author(s):  
Jian Wang ◽  
Mengying Li ◽  
Qishuai Diao ◽  
Hongfei Lin ◽  
Zhihao Yang ◽  
...  

Abstract

Background: Biomedical document triage is the foundation of biomedical information extraction, which is important to precision medicine. Recently, some neural network-based methods have been proposed to classify biomedical documents automatically. In the biomedical domain, documents are often very long and often contain very complicated sentences. However, current methods still find it difficult to capture important features across sentences.

Results: In this paper, we propose a hierarchical attention-based capsule model for biomedical document triage. The proposed model employs a hierarchical attention mechanism and capsule networks to capture valuable features across sentences and construct a final latent feature representation for a document. We evaluated our model on three public corpora.

Conclusions: Experimental results showed that both the hierarchical attention mechanism and capsule networks are helpful in the biomedical document triage task. Our method proved highly competitive with or superior to other state-of-the-art methods.


2018 ◽  
Vol 2018 ◽  
pp. 1-11
Author(s):  
A-Yeong Kim ◽  
Hyun-Je Song ◽  
Seong-Bae Park

Dialog state tracking in a spoken dialog system is the task of tracking the flow of a dialog and accurately identifying what a user wants from each utterance. Since the success of a dialog is influenced by the ability of the system to catch the requirements of the user, accurate state tracking is important for spoken dialog systems. This paper proposes a two-step neural dialog state tracker composed of an informativeness classifier and a neural tracker. The informativeness classifier, which is implemented by a CNN, first filters out noninformative utterances in a dialog. Then, the neural tracker estimates dialog states from the remaining informative utterances. The tracker adopts the attention mechanism and the hierarchical softmax for better performance and fast training. To evaluate the proposed model, we conduct experiments on dialog state tracking in human-human task-oriented dialogs with the standard DSTC4 data set. Our experimental results demonstrate the effectiveness of the proposed model by showing that it outperforms neural trackers without the informativeness classifier, the attention mechanism, or the hierarchical softmax.


Author(s):  
Xiaoyang Zheng ◽  
Zeyu Ye ◽  
Jinliang Wu

The gearbox is a key part of modern industrial machinery, and many fault diagnosis methods have been developed for it. However, traditional fault diagnosis methods depend on prior knowledge. This paper proposes an end-to-end method based on a convolutional neural network (CNN), a bidirectional gated recurrent unit (BiGRU), and an attention mechanism. Among these, the BiGRU not only makes good use of the time-sequence information in the signal but also needs fewer computing resources than networks of the same type because of its low amount of calculation. To verify the effectiveness and generalization performance of the proposed method, experiments are carried out on two datasets, and the accuracy is calculated by ten-fold cross-validation. The experimental results show that the proposed model has higher accuracy than existing fault diagnosis methods.
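As a rough illustration of how accuracy can be estimated by ten-fold cross-validation, the sketch below splits the sample indices into ten folds and averages the per-fold accuracy. It is a generic, standard-library-only procedure; the helper names `k_fold_indices` and `cross_validated_accuracy`, and the `fit_predict` callback standing in for any classifier, are hypothetical and not from the paper.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Assumes n_samples >= k so that no fold is empty.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


def cross_validated_accuracy(samples, labels, fit_predict, k=10, seed=0):
    """Average test accuracy over k folds.

    fit_predict(train_x, train_y, test_x) is a hypothetical callback that
    trains a model on the training split and returns predictions for test_x.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_indices(len(samples), k, seed):
        predictions = fit_predict(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test_idx))
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k
```

Each sample is used for testing exactly once, so the averaged accuracy reflects performance on data the model did not see during the corresponding training run.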


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Yuanyuan Cai ◽  
Min Zuo ◽  
Qingchuan Zhang ◽  
Haitao Xiong ◽  
Ke Li

Along with the development of social media on the internet, dialogue systems are becoming more and more intelligent to meet users’ needs for communication, emotion, and social intercourse. Previous studies usually use sequence-to-sequence learning with recurrent neural networks for response generation. However, recurrent learning models suffer heavily from the problem of long-distance dependencies in sequences. Moreover, some models neglect crucial information in the dialogue contexts, which leads to uninformative and inflexible responses. To address these issues, we present a bichannel transformer with context encoding (BCTCE) for document-driven conversation. This conversational generator consists of a context encoder, an utterance encoder, and a decoder with an attention mechanism. The encoders aim to learn distributed representations of the input texts. A multihop attention mechanism is used in BCTCE to capture the interaction between documents and dialogues. We evaluate the proposed BCTCE by both automatic evaluation and human judgment. The experimental results on the CMU_DoG dataset indicate that the proposed model yields significant improvements over the state-of-the-art baselines on most of the evaluation metrics, and that the responses generated by BCTCE are more informative and more relevant to the dialogues than those of the baselines.


2020 ◽  
Vol 29 (15) ◽  
pp. 2050250
Author(s):  
Xiongfei Liu ◽  
Bengao Li ◽  
Xin Chen ◽  
Haiyan Zhang ◽  
Shu Zhan

This paper proposes a novel method for person image generation with an arbitrary target pose. Given a person image and an arbitrary target pose, our proposed model can synthesize images of the same person in different poses. Generative Adversarial Networks (GANs) form the major part of the proposed model. Unlike traditional GANs, we add an attention mechanism to the generator in order to generate realistic-looking images. We also use content reconstruction with a pretrained VGG16 network to keep the content consistent between generated images and target images. Furthermore, we test our model on the DeepFashion and Market-1501 datasets. The experimental results show that the proposed network performs favorably against state-of-the-art methods.


2020 ◽  
Vol 2020 (14) ◽  
pp. 305-1-305-6
Author(s):  
Tianyu Li ◽  
Camilo G. Aguilar ◽  
Ronald F. Agyei ◽  
Imad A. Hanhan ◽  
Michael D. Sangid ◽  
...  

In this paper, we extend our previous 2D connected-tube marked point process (MPP) model to a 3D connected-tube MPP model for fiber detection. In the 3D case, a tube is represented by a cylinder model with two spherical areas at its ends. The spherical area is used to define connection priors that encourage connection of tubes that belong to the same fiber. Since each long fiber can be fitted by a series of connected short tubes, the proposed model is capable of detecting curved long tubes. We present experimental results on fiber-reinforced composite material images to show the performance of our method.


2021 ◽  
Vol 11 (5) ◽  
pp. 2174
Author(s):  
Xiaoguang Li ◽  
Feifan Yang ◽  
Jianglu Huang ◽  
Li Zhuo

Images captured in a real scene usually suffer from complex non-uniform degradation, which includes both global and local blurs. It is difficult to handle the complex blur variations with a unified processing model. We propose a global-local blur disentangling network, which can effectively extract global and local blur features via two branches. A phased training scheme is designed to disentangle the global and local blur features; that is, each branch is trained with its own task-specific dataset. A branch attention mechanism is introduced to dynamically fuse global and local features. Complex blurry images are used to train the attention module and the reconstruction module. The visualized feature maps of the different branches indicate that our dual-branch network can decouple the global and local blur features efficiently. Experimental results show that the proposed dual-branch blur disentangling network can improve both the subjective and objective deblurring effects for real captured images.
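The idea of dynamically fusing two branches with attention can be sketched as a softmax-weighted sum of the global and local feature vectors. This is only an illustrative simplification: in the paper the attention module is learned, whereas here the two branch scores are simply passed in, and the name `branch_attention_fuse` is hypothetical.

```python
import math


def branch_attention_fuse(global_feat, local_feat, global_score, local_score):
    """Fuse two equal-length feature vectors with softmax branch weights.

    The scores stand in for the output of a learned attention module
    (an assumption of this sketch). Returns the fused vector and the
    (global, local) weights, which sum to 1.
    """
    m = max(global_score, local_score)  # subtract the max for stability
    exp_g = math.exp(global_score - m)
    exp_l = math.exp(local_score - m)
    w_global = exp_g / (exp_g + exp_l)
    w_local = exp_l / (exp_g + exp_l)
    fused = [w_global * g + w_local * l for g, l in zip(global_feat, local_feat)]
    return fused, (w_global, w_local)
```

With equal scores the two branches contribute equally; as one score grows, the fused features move toward that branch, which is what allows the weighting to adapt to how much global or local blur an image contains.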


2011 ◽  
Vol 1 ◽  
pp. 375-380
Author(s):  
Shu Ai Wan ◽  
Kai Fang Yang ◽  
Hai Yong Zhou

This paper addresses the important issue of multimedia quality evaluation, given the unimodal qualities of audio and video. First, the quality integration model recommended in G.1070 is evaluated using experimental results. Theoretical analyses and empirical observations suggest that the constant coefficients used in the G.1070 model should actually be piecewise adjusted for different levels of audio and visual quality. A piecewise function is then proposed to perform multimedia quality integration under different levels of audio and visual quality. The performance gain observed in the experimental results substantiates the effectiveness of the proposed model.
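To make the notion of piecewise-adjusted coefficients concrete, the sketch below selects a coefficient set according to whether the audio and video quality scores fall above or below a threshold, then applies a generic bilinear integration. Everything here is assumed for illustration: the bilinear form, the threshold, the four regions, and the placeholder coefficients are not the G.1070 formula or the values proposed in the paper.

```python
def bilinear_quality(audio_q, video_q, coeffs):
    """Generic bilinear integration: c0 + ca*A + cv*V + cav*A*V (assumed form)."""
    c0, c_a, c_v, c_av = coeffs
    return c0 + c_a * audio_q + c_v * video_q + c_av * audio_q * video_q


def piecewise_quality(audio_q, video_q, table, threshold=3.0):
    """Pick coefficients by quality region, then integrate.

    The region key is (audio_is_high, video_is_high); the threshold is a
    hypothetical cut-off on a MOS-like scale.
    """
    region = (audio_q >= threshold, video_q >= threshold)
    return bilinear_quality(audio_q, video_q, table[region])


# Placeholder coefficients only; real values would have to be fitted to data.
PLACEHOLDER_TABLE = {
    (False, False): (0.0, 0.5, 0.5, 0.0),
    (False, True): (0.0, 0.6, 0.4, 0.0),
    (True, False): (0.0, 0.4, 0.6, 0.0),
    (True, True): (0.0, 0.5, 0.5, 0.0),
}
```

With constant coefficients, every region would share one tuple; the piecewise variant lets the relative weight of audio and video change with their quality levels.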


Author(s):  
Qianrong Zhou ◽  
Xiaojie Wang ◽  
Xuan Dong

Attention-based models have been shown to be effective in learning representations for sentence classification. They are typically equipped with a multi-hop attention mechanism. However, existing multi-hop models still suffer from the problem of paying too much attention to the most frequently noticed words, which might not be important for classifying the current sentence. There is also no explicit, effective way to help the attention shift away from a wrong part of the sentence. In this paper, we alleviate this problem by proposing a differentiated attentive learning model. It is composed of two branches of attention subnets and an example discriminator. An explicit signal carrying the loss information of the first attention subnet is passed on to the second one to drive them to learn different attentive preferences. The example discriminator then selects the suitable attention subnet for sentence classification. Experimental results on real and synthetic datasets demonstrate the effectiveness of our model.


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yongyi Li ◽  
Shiqi Wang ◽  
Shuang Dong ◽  
Xueling Lv ◽  
Changzhi Lv ◽  
...  

At present, person reidentification based on attention mechanisms has attracted the interest of many scholars. Although an attention module can improve the representation ability and reidentification accuracy of a Re-ID model to a certain extent, this depends on the coupling of the attention module and the original network. In this paper, a person reidentification model that combines multiple attentions and multiscale residuals is proposed. The model introduces a combined attention fusion module and a multiscale residual fusion module into the ResNet-50 backbone network to enhance the feature flow between residual blocks and better fuse multiscale features. Furthermore, a global branch and a local branch are designed and applied to enhance the channel aggregation and position perception ability of the network by utilizing the dual ensemble attention module, and a fine-grained feature expression is obtained by using multiproportion block and reorganization. Thus, the global and local features are enhanced. The experimental results on the Market-1501 and DukeMTMC-reID datasets show that the indexes of the presented model, especially Rank-1 accuracy, reach 96.20% and 89.59%, respectively, which can be considered progress in Re-ID.

