A Multi-level Attention Model for Text Matching

Author(s):  
Qiang Sun ◽  
Yue Wu
2022 ◽  
Vol 22 (3) ◽  
pp. 1-21
Author(s):  
Prayag Tiwari ◽  
Amit Kumar Jaiswal ◽  
Sahil Garg ◽  
Ilsun You

Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. A self-attention model takes only one sentence as input with no extra information; for example, one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted in either a symmetrical or an asymmetrical scope. For instance, paraphrase detection is a symmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of the self-attention mechanism and propose an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features, while updating the rest of the components. We evaluate our model on two benchmark datasets covering the tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with far fewer parameters.
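
The sketch below illustrates, in general terms, what an inter-sequence attention layer combining the three feature types named above might look like. It is not the authors' released code; the module and parameter names (InterSequenceAttention, hidden_dim, the GRU used for contextual features) are illustrative assumptions, and it assumes PyTorch.

```python
# Minimal sketch of inter-sequence attention fusing three feature types:
# global pointwise features, preceding attentive features, and contextual features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class InterSequenceAttention(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.pointwise = nn.Linear(hidden_dim, hidden_dim)   # global pointwise features
        self.contextual = nn.GRU(hidden_dim, hidden_dim,
                                 batch_first=True)            # contextual features
        self.fuse = nn.Linear(3 * hidden_dim, hidden_dim)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # a: (batch, len_a, hidden_dim)  premise / question
        # b: (batch, len_b, hidden_dim)  hypothesis / candidate answer
        # Cross (inter-sequence) attention: each token of `a` attends over `b`.
        scores = torch.bmm(a, b.transpose(1, 2))              # (batch, len_a, len_b)
        attentive = torch.bmm(F.softmax(scores, dim=-1), b)   # preceding attentive features
        pointwise = torch.relu(self.pointwise(a))              # global pointwise features
        contextual, _ = self.contextual(a)                     # contextual features
        # Fuse the three components into an updated representation of `a`.
        return torch.relu(self.fuse(torch.cat([pointwise, attentive, contextual], dim=-1)))

if __name__ == "__main__":
    layer = InterSequenceAttention(hidden_dim=64)
    premise = torch.randn(2, 10, 64)
    hypothesis = torch.randn(2, 7, 64)
    print(layer(premise, hypothesis).shape)  # torch.Size([2, 10, 64])
```

For an asymmetrical task such as question-answer matching, the two inputs would be passed in a fixed order; for a symmetrical task such as paraphrase detection, the layer could be applied in both directions and the outputs combined.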


Author(s):  
Yifang Yin ◽  
Meng-Jiun Chiou ◽  
Zhenguang Liu ◽  
Harsh Shrivastava ◽  
Rajiv Ratn Shah ◽  
...  

2021 ◽  
pp. 107454
Author(s):  
Shaoru Guo ◽  
Yong Guan ◽  
Ru Li ◽  
Xiaoli Li ◽  
Hongye Tan
Keyword(s):  

Author(s):  
Shaobo Min ◽  
Xuejin Chen ◽  
Zheng-Jun Zha ◽  
Feng Wu ◽  
Yongdong Zhang

Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation. Although many semi-supervised methods have been proposed to provide extra training data, automatically generated labels are usually too noisy to retrain models effectively. In this paper, we propose a Two-Stream Mutual Attention Network (TSMAN) that weakens the influence of back-propagated gradients caused by incorrect labels, thereby rendering the network robust to unclean data. The proposed TSMAN consists of two sub-networks that are connected by three types of attention models at different layers. The target of each attention model is to indicate potentially incorrect gradients in a certain layer for both sub-networks by analyzing the features they infer from the same input. To achieve this, the attention models are designed based on a propagation analysis of noisy gradients at different layers. This allows the attention models to effectively discover incorrect labels and weaken their influence during the parameter-updating process. By exchanging multi-level features within the two-stream architecture, the effect of noisy labels in each sub-network is reduced by decreasing the noisy gradients. Furthermore, a hierarchical distillation scheme is developed to provide reliable pseudo labels for unlabeled data, which further boosts the performance of TSMAN. Experiments on both the HVSMR 2016 and BRATS 2015 benchmarks demonstrate that our semi-supervised learning framework surpasses the state-of-the-art fully-supervised results.
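
As a rough illustration of the core idea, the sketch below shows one way a mutual-attention gate between two streams could down-weight the per-pixel loss where the streams' features disagree, thereby suppressing the back-propagated gradients that noisy labels would otherwise produce. This is a minimal sketch, not the TSMAN implementation; the names (MutualAttentionGate, gated_segmentation_loss) and the single-gate design are assumptions, and it uses PyTorch.

```python
# Minimal sketch of a mutual-attention gate that scales per-pixel losses by the
# agreement between the two sub-networks' features on the same input.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MutualAttentionGate(nn.Module):
    """Estimates per-pixel reliability from the agreement of two streams' features."""
    def __init__(self, channels: int):
        super().__init__()
        self.score = nn.Conv2d(2 * channels, 1, kernel_size=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor) -> torch.Tensor:
        # feat_a, feat_b: (batch, channels, H, W) features from the two sub-networks
        weight = torch.sigmoid(self.score(torch.cat([feat_a, feat_b], dim=1)))
        return weight  # (batch, 1, H, W), near 0 where the label is likely noisy

def gated_segmentation_loss(logits, labels, weight):
    # Per-pixel cross-entropy scaled by the attention weight: pixels flagged as
    # unreliable contribute smaller gradients during the parameter update.
    per_pixel = F.cross_entropy(logits, labels, reduction="none")  # (batch, H, W)
    return (weight.squeeze(1) * per_pixel).mean()

if __name__ == "__main__":
    gate = MutualAttentionGate(channels=16)
    fa, fb = torch.randn(1, 16, 32, 32), torch.randn(1, 16, 32, 32)
    w = gate(fa, fb)
    logits = torch.randn(1, 4, 32, 32)             # 4 segmentation classes
    labels = torch.randint(0, 4, (1, 32, 32))
    print(gated_segmentation_loss(logits, labels, w).item())
```

In the paper's design, such attention models are placed at several layers of both sub-networks rather than at a single point, so the gradient suppression acts on multi-level features.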


2018 ◽  
Vol 19 (11) ◽  
pp. 3475-3485 ◽  
Author(s):  
Heng Fan ◽  
Xue Mei ◽  
Danil Prokhorov ◽  
Haibin Ling

2019 ◽  
Vol 127 ◽  
pp. 156-164 ◽  
Author(s):  
Yichao Yan ◽  
Bingbing Ni ◽  
Jinxian Liu ◽  
Xiaokang Yang
Keyword(s):  

2021 ◽  
Author(s):  
Jinsheng Ji ◽  
Yiyou Guo ◽  
Zhen Yang ◽  
Tao Zhang ◽  
Xiankai Lu
