Attention-Based Graph Convolutional Network for Zero-Shot Learning with Pre-Training

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Xuefei Wu ◽  
Mingjiang Liu ◽  
Bo Xin ◽  
Zhangqing Zhu ◽  
Gang Wang

Zero-shot learning (ZSL) is a powerful and promising learning paradigm for classifying instances that were not seen during training. Although graph convolutional networks (GCNs) have recently shown great potential for ZSL tasks, these models cannot adjust the fixed connection weights between nodes in the knowledge graph, so all neighbor nodes contribute equally to classifying the central node. In this study, we apply an attention mechanism that adjusts the connection weights adaptively, so that the model learns the information most important for classifying unseen target nodes. First, we propose an attention graph convolutional network for zero-shot learning (AGCNZ) by directly integrating the attention mechanism into the GCN. Then, to prevent the dilution of knowledge from distant nodes, we apply the dense graph propagation (DGP) model to ZSL tasks and propose an attention dense graph propagation model for zero-shot learning (ADGPZ). Finally, we propose a modified loss function with a relaxation factor to further improve the performance of the learned classifier. Experimental results under different pre-training settings verify the effectiveness of the proposed attention-based models for ZSL.
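
The contrast between equal neighbor contributions and attention-adjusted weights can be illustrated with a minimal sketch. The abstract does not state how attention scores are computed, so the dot-product scoring used here is an assumption, and the names `attention_aggregate` and `uniform_aggregate` are hypothetical. This is not the AGCNZ or ADGPZ implementation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def uniform_aggregate(features, neighbors, node):
    """Average neighbor features: every neighbor contributes equally."""
    nbrs = neighbors[node]
    dim = len(features[node])
    return [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]


def attention_aggregate(features, neighbors, node):
    """Weight each neighbor by softmax attention before summing.

    Assumption: scores are dot products between the central node and
    each neighbor; the paper may use a different scoring function.
    """
    nbrs = neighbors[node]
    scores = [dot(features[node], features[j]) for j in nbrs]
    weights = softmax(scores)
    dim = len(features[node])
    out = [sum(w * features[j][d] for w, j in zip(weights, nbrs))
           for d in range(dim)]
    return weights, out


# Toy graph: node 0 has a similar neighbor (1) and a dissimilar one (2).
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0]}
neighbors = {0: [1, 2]}

weights, attended = attention_aggregate(features, neighbors, 0)
averaged = uniform_aggregate(features, neighbors, 0)
```

In this toy graph the similar neighbor receives the larger weight, while the uniform average treats both neighbors identically.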

2021 ◽  
Vol 11 (16) ◽  
pp. 7734
Author(s):  
Ningyi Mao ◽  
Wenti Huang ◽  
Hai Zhong

Distantly supervised relation extraction is the most popular technique for identifying the semantic relation between two entities. Most prior models focus only on the supervision information present in training sentences. Beyond the training sentences, external lexical resources and knowledge graphs often contain other relevant prior knowledge, yet relation extraction models usually ignore such readily available information. Moreover, previous works only apply a selective attention mechanism over sentences to alleviate the impact of noise; they do not consider the implicit interaction between sentences and relation facts. In this paper, (1) a knowledge-guided graph convolutional network based on a word-level attention mechanism is proposed to encode the sentences. It captures the key words and cue phrases and generates expressive sentence-level features by attending to relation indicators obtained from the external lexical resource. (2) A knowledge-guided sentence selector is proposed, which uses the semantic and structural information of triples from the knowledge graph as sentence-level knowledge attention to distinguish the importance of each individual sentence. Experimental results on two widely used datasets, NYT-FB and GDS, show that our approach efficiently uses the prior knowledge from the external lexical resource and knowledge graph to enhance the performance of distantly supervised relation extraction.


Author(s):  
Teng Jiang ◽  
Liang Gong ◽  
Yupu Yang

The attention-based encoder–decoder framework has greatly improved image caption generation. The attention mechanism plays a transitional role by transforming static image features into sequential captions. To generate reasonable captions, it is important to detect the spatial characteristics of images. In this paper, we propose a spatial relational attention approach that considers spatial positions and attributes. Image features are first weighted by the attention mechanism and then concatenated with contextual features to form a spatial–visual tensor. A fully convolutional network extracts features from this tensor to produce visual concepts for the decoder network. The fully convolutional layers preserve the spatial topology of images. Experiments conducted on three benchmark datasets, namely Flickr8k, Flickr30k and MSCOCO, demonstrate the effectiveness of our proposed approach. Captions generated by the spatial relational attention method precisely capture the spatial relations of objects.


Author(s):  
Junyu Gao ◽  
Tianzhu Zhang ◽  
Changsheng Xu

Recently, with the ever-growing number of action categories, zero-shot action recognition (ZSAR) has been achieved by automatically mining the underlying concepts (e.g., actions, attributes) in videos. However, most existing methods only exploit the visual cues of these concepts and ignore external knowledge for modeling explicit relationships between them. In fact, humans have a remarkable ability to transfer knowledge learned from familiar classes to recognize unfamiliar classes. To narrow the knowledge gap between existing methods and humans, we propose an end-to-end ZSAR framework based on a structured knowledge graph, which jointly models action-attribute, action-action, and attribute-attribute relationships. To effectively leverage the knowledge graph, we design a novel Two-Stream Graph Convolutional Network (TS-GCN) consisting of a classifier branch and an instance branch. Specifically, the classifier branch takes the semantic-embedding vectors of all the concepts as input and generates the classifiers for action categories. The instance branch maps the attribute embeddings and scores of each video instance into an attribute-feature space. Finally, the generated classifiers are evaluated on the attribute features of each video, and a classification loss is adopted for optimizing the whole network. In addition, a self-attention module is used to model the temporal information of videos. Extensive experimental results on three realistic action benchmarks (Olympic Sports, HMDB51 and UCF101) demonstrate the favorable performance of our proposed framework.


2021 ◽  
Vol 18 (6) ◽  
pp. 9669-9684
Author(s):  
Xing Hu ◽  
Minghui Yao ◽  
Dawei Zhang

This paper proposes an end-to-end road crack segmentation model based on an attention mechanism and a deep FCN with generative adversarial learning. We create a segmentation network by introducing a visual attention mechanism and a residual module into a fully convolutional network (FCN) to capture richer local features and more global semantic features, and thus obtain a better segmentation result. In addition, we use an adversarial network consisting of convolutional layers as a discrimination network. The main contributions of this work are as follows: 1) We introduce a CNN model as a discrimination network to realize adversarial learning that guides the training of the segmentation network. Training proceeds in a min-max way: the discrimination network is trained by maximizing the loss function, while the segmentation network is trained using only the gradient passed back by the discrimination network and aims to minimize the loss function, until an optimal segmentation network is obtained; 2) We add the residual module and the visual attention mechanism to U-Net, which makes the segmentation results more robust, refined and smooth; 3) Extensive experiments are conducted on three public road crack datasets to evaluate the performance of our proposed model. Qualitative and quantitative comparisons show that the proposed method outperforms or is comparable to the state-of-the-art methods in both F1 score and precision. In particular, the mIoU of the proposed method is about 3% to 17% higher than that of U-Net on the three public datasets.
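
The metrics named in this abstract (precision, F1 score and mIoU) can be computed for a binary crack mask as in the following minimal sketch. It assumes two classes (crack = 1, background = 0) and flattened masks; the abstract does not say how the authors define the classes for mIoU, and the function names are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count true/false positives and negatives for binary labels."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def segmentation_metrics(pred, truth):
    """Precision, recall, F1 for the crack class, and two-class mIoU."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    iou_crack = safe_div(tp, tp + fp + fn)
    # Assumption: background counts as a second class in the mean.
    iou_background = safe_div(tn, tn + fp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "miou": (iou_crack + iou_background) / 2,
    }


# Flattened toy masks: one hit, one false alarm, one missed crack pixel.
pred = [1, 1, 0, 0, 0, 0]
truth = [1, 0, 1, 0, 0, 0]
metrics = segmentation_metrics(pred, truth)
```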


2022 ◽  
Vol 16 (2) ◽  
pp. 1-20
Author(s):  
Zhenyu Zhang ◽  
Lei Zhang ◽  
Dingqi Yang ◽  
Liu Yang

Recommender algorithms that combine knowledge graphs with graph convolutional networks have recently become increasingly popular. Specifically, attributes describing the items to be recommended are often used as additional information. These attributes, along with the items, are highly interconnected and intrinsically form a Knowledge Graph (KG). These algorithms use KGs as an auxiliary data source to alleviate the negative impact of data sparsity. However, these graph convolutional network based algorithms do not distinguish the importance of different neighbors of entities in the KG, and according to Pareto’s principle, the important neighbors account for only a small proportion. As a result, these traditional algorithms cannot fully mine the useful information in the KG. To fully release the power of KGs for building recommender systems, we propose in this article KRAN, a Knowledge Refining Attention Network, which captures the characteristics of the KG and thus boosts recommendation performance. We first introduce a traditional attention mechanism into the KG processing, making the knowledge extraction more targeted, and then propose a refining mechanism that improves the traditional attention mechanism to extract the knowledge in the KG more effectively. More precisely, KRAN uses our proposed knowledge-refining attention mechanism to aggregate and obtain the representations of the entities (both attributes and items) in the KG. Our knowledge-refining attention mechanism first measures the relevance between an entity and its neighbors in the KG by attention coefficients, and then further refines the attention coefficients using a “richer-get-richer” principle, in order to focus on highly relevant neighbors while eliminating less relevant neighbors for noise reduction. In addition, for the item cold start problem, we propose KRAN-CD, a variant of KRAN, which further incorporates pre-trained KG embeddings to handle cold start items. Experiments show that KRAN and KRAN-CD consistently outperform state-of-the-art baselines across different settings.
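
One way to picture a “richer-get-richer” refinement of attention coefficients is sketched below. The abstract does not give the refining rule, so the sharpening exponent, the pruning threshold and the function name `refine_attention` are all assumptions made for illustration. This is not the KRAN mechanism itself.

```python
def refine_attention(coeffs, power=2.0, keep_threshold=0.15):
    """Sharpen attention coefficients and prune weak neighbors.

    Assumed scheme (not taken from the paper):
      1. raise each coefficient to `power` and renormalize, so large
         coefficients grow relative to small ones;
      2. zero out coefficients below `keep_threshold`;
      3. renormalize the survivors so they sum to 1.
    """
    sharpened = [c ** power for c in coeffs]
    total = sum(sharpened)
    sharpened = [s / total for s in sharpened]

    kept = [s if s >= keep_threshold else 0.0 for s in sharpened]
    kept_total = sum(kept)
    if kept_total == 0.0:
        # Threshold removed everything: fall back to the sharpened weights.
        return sharpened
    return [k / kept_total for k in kept]


# Three neighbors with decreasing relevance to the central entity.
original = [0.5, 0.3, 0.2]
refined = refine_attention(original)
```

With these toy inputs the most relevant neighbor gains weight and the least relevant one is removed, which mirrors the focus-and-eliminate behavior the abstract describes.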


2019 ◽  
Vol 11 (2) ◽  
pp. 159
Author(s):  
Bei Fang ◽  
Ying Li ◽  
Haokui Zhang ◽  
Jonathan Chan

Hyperspectral image (HSI) data, which is typically presented in 3-D format, offers an opportunity for 3-D networks to extract spectral and spatial features simultaneously. In this paper, we propose a novel end-to-end 3-D dense convolutional network with a spectral-wise attention mechanism (MSDN-SA) for HSI classification. The proposed MSDN-SA exploits 3-D dilated convolutions to simultaneously capture the spectral and spatial features at different scales, and densely connects all 3-D feature maps with each other. In addition, a spectral-wise attention mechanism is introduced to enhance the distinguishability of spectral features, which improves the classification performance of the trained models. Experimental results on three HSI datasets demonstrate that our MSDN-SA achieves competitive performance for HSI classification.
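
How dilation lets a fixed-size kernel cover features at different scales can be seen in a one-dimensional sketch along a single spectral axis. The network in the abstract uses 3-D dilated convolutions; the 1-D reduction, the toy values and the name `dilated_conv1d` are simplifications assumed here for illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (no padding) 1-D dilated cross-correlation.

    Kernel taps are spaced `dilation` samples apart, so the receptive
    field grows to (len(kernel) - 1) * dilation + 1 without adding taps.
    As in most deep learning libraries, the kernel is not flipped.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]


# Toy spectral profile of one pixel across five bands.
spectrum = [1, 2, 3, 4, 5]
kernel = [1, 1]

fine_scale = dilated_conv1d(spectrum, kernel, dilation=1)    # adjacent bands
coarse_scale = dilated_conv1d(spectrum, kernel, dilation=2)  # every other band
```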


Author(s):  
Fan Xiong ◽  
Jianliang Gao

The graph convolutional network (GCN) is a promising approach that has recently been used for knowledge graph alignment. In this paper, we propose a new method for entity alignment in cross-lingual knowledge graphs. In this method, we design an attribute embedding scheme for GCN training. Furthermore, the GCN model uses the attribute embedding and the structure embedding to abstract graph features simultaneously. Our preliminary experiments show that the proposed method outperforms the state-of-the-art GCN-based method.


2021 ◽  
Author(s):  
Xing Wei ◽  
Jiangjiang Liu

Knowledge Graph (KG) based recommendation methods are effective at dealing with cold start problems and sparse data. The Knowledge Graph Convolutional Network (KGCN) is an end-to-end framework that has been shown to capture latent item-entity features by mining their associated attributes on the KG. In KGCN, the aggregator plays a key role in extracting information from the high-order structure. In this work, we propose the Knowledge Graph Processor (KGP) for pre-processing data and building the corresponding knowledge graphs. A knowledge graph for the Yelp Open dataset was constructed with KGP. In addition, we investigate the impact of various aggregators with three nonlinear functions on KGCN using the Yelp Open dataset KG.


2021 ◽  
Vol 11 (4) ◽  
pp. 1528
Author(s):  
Jie Liu ◽  
Peiyu Liu ◽  
Zhenfang Zhu ◽  
Xiaowen Li ◽  
Guangtao Xu

Aspect-based sentiment classification aims to determine the sentiment associated with a particular aspect. Many sophisticated approaches, such as attention mechanisms and Graph Convolutional Networks, have been widely used to address this challenge. However, most previous methods have not adequately analyzed the role of words and long-distance dependencies, and the interaction between context and aspect terms is not well realized, which greatly limits the effectiveness of the model. In this paper, we propose an effective and novel method using an attention mechanism and a graph convolutional network (ATGCN). First, we use multi-head attention and point-wise convolution transformation to obtain the hidden state. Second, we introduce position coding into the model and use Graph Convolutional Networks to obtain syntactic information and long-distance dependencies. Finally, the interaction between context and aspect terms is further realized by bidirectional attention. Experiments on three benchmarking collections indicate the effectiveness of ATGCN.


2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing training gradients and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model, using an encoder based on a 3D convolutional network, ResNet50, a Temporal Convolutional Network (TCN), and a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. It can partly eliminate the gradient disappearance and insufficient performance of RNNs (LSTM, GRU), which yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than with the state-of-the-art method, and that accuracy improves by 2.4% on the GRID dataset.
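
For readers unfamiliar with CTC, the step that turns per-frame predictions into an output sequence can be sketched with greedy (best-path) decoding: merge consecutive repeats, then drop the blank symbol. The abstract does not say which decoding strategy the authors use, so greedy decoding, the blank index 0 and the toy label values are assumptions for illustration.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame label sequence under the CTC rules.

    Consecutive identical labels are merged into one, and blank labels
    are removed. A blank between two identical labels keeps them as
    separate outputs, which is how CTC represents doubled symbols.
    """
    decoded = []
    previous = None
    for label in frame_labels:
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Per-frame argmax labels for eight video frames (0 is the blank).
frames = [0, 1, 1, 0, 1, 2, 2, 0]
sequence = ctc_greedy_decode(frames)
```

Here the two runs of label 1 are separated by a blank, so both survive, while the repeated label 2 collapses to a single output.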

