Self-attention mechanisms have recently been embraced for a broad range of text-matching applications. Self-attention model takes only one sentence as an input with no extra information, i.e., one can utilize the final hidden state or pooling. However, text-matching problems can be interpreted either in symmetrical or asymmetrical scopes. For instance, paraphrase detection is an asymmetrical task, while textual entailment classification and question-answer matching are considered asymmetrical tasks. In this article, we leverage attractive properties of self-attention mechanism and proposes an attention-based network that incorporates three key components for inter-sequence attention: global pointwise features, preceding attentive features, and contextual features while updating the rest of the components. Our model follows evaluation on two benchmark datasets cover tasks of textual entailment and question-answer matching. The proposed efficient Self-attention-driven Network for Text Matching outperforms the state of the art on the Stanford Natural Language Inference and WikiQA datasets with much fewer parameters.
Image captioning refers to the process of generating a textual description that describes objects and activities present in a given image. It connects two fields of artificial intelligence, computer vision, and natural language processing. Computer vision and natural language processing deal with image understanding and language modeling, respectively. In the existing literature, most of the works have been carried out for image captioning in the English language. This article presents a novel method for image captioning in the Hindi language using encoder–decoder based deep learning architecture with efficient channel attention. The key contribution of this work is the deployment of an efficient channel attention mechanism with bahdanau attention and a gated recurrent unit for developing an image captioning model in the Hindi language. Color images usually consist of three channels, namely red, green, and blue. The channel attention mechanism focuses on an image’s important channel while performing the convolution, which is basically to assign higher importance to specific channels over others. The channel attention mechanism has been shown to have great potential for improving the efficiency of deep convolution neural networks (CNNs). The proposed encoder–decoder architecture utilizes the recently introduced ECA-NET CNN to integrate the channel attention mechanism. Hindi is the fourth most spoken language globally, widely spoken in India and South Asia; it is India’s official language. By translating the well-known MSCOCO dataset from English to Hindi, a dataset for image captioning in Hindi is manually created. The efficiency of the proposed method is compared with other baselines in terms of Bilingual Evaluation Understudy (BLEU) scores, and the results obtained illustrate that the method proposed outperforms other baselines. The proposed method has attained improvements of 0.59%, 2.51%, 4.38%, and 3.30% in terms of BLEU-1, BLEU-2, BLEU-3, and BLEU-4 scores, respectively, with respect to the state-of-the-art. Qualities of the generated captions are further assessed manually in terms of adequacy and fluency to illustrate the proposed method’s efficacy.
Recommender algorithms combining knowledge graph and graph convolutional network are becoming more and more popular recently. Specifically, attributes describing the items to be recommended are often used as additional information. These attributes along with items are highly interconnected, intrinsically forming a Knowledge Graph (KG). These algorithms use KGs as an auxiliary data source to alleviate the negative impact of data sparsity. However, these graph convolutional network based algorithms do not distinguish the importance of different neighbors of entities in the KG, and according to Pareto’s principle, the important neighbors only account for a small proportion. These traditional algorithms can not fully mine the useful information in the KG.
To fully release the power of KGs for building recommender systems, we propose in this article KRAN, a Knowledge Refining Attention Network, which can subtly capture the characteristics of the KG and thus boost recommendation performance. We first introduce a traditional attention mechanism into the KG processing, making the knowledge extraction more targeted, and then propose a refining mechanism to improve the traditional attention mechanism to extract the knowledge in the KG more effectively. More precisely, KRAN is designed to use our proposed knowledge-refining attention mechanism to aggregate and obtain the representations of the entities (both attributes and items) in the KG. Our knowledge-refining attention mechanism first measures the relevance between an entity and it’s neighbors in the KG by attention coefficients, and then further refines the attention coefficients using a “richer-get-richer” principle, in order to focus on highly relevant neighbors while eliminating less relevant neighbors for noise reduction. In addition, for the item cold start problem, we propose KRAN-CD, a variant of KRAN, which further incorporates pre-trained KG embeddings to handle cold start items. Experiments show that KRAN and KRAN-CD consistently outperform state-of-the-art baselines across different settings.