Multistep Flow Prediction on Car-Sharing Systems: A Multi-Graph Convolutional Neural Network with Attention Mechanism

Multistep flow prediction is an essential task for car-sharing systems. An accurate flow prediction model can help system operators pre-allocate cars to meet user demand. However, this task is challenging due to the complex spatial and temporal relations among stations. Existing works considered only temporal relations (e.g., using LSTM) or only spatial relations (e.g., using CNN), each independently. In this paper, we propose an attention to multi-graph convolutional sequence-to-sequence model (AMGC-Seq2Seq), a novel deep learning model for multistep flow prediction. The proposed model uses the encoder–decoder architecture: in the encoder part, spatial and temporal relations are encoded simultaneously, and the encoded information is then passed to the decoder to generate multistep outputs. In this work, specific multiple graphs are constructed to reflect spatial relations from different aspects, and we model them using the proposed multi-graph convolution. An attention mechanism is also used to capture the important relations from previous information. Experiments on a large-scale real-world car-sharing dataset demonstrate the effectiveness of our approach over state-of-the-art methods.
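
The abstract does not spell out how the multiple graphs are combined, so the following is only a minimal sketch of one common form of multi-graph convolution: each graph's row-normalized adjacency matrix aggregates station features, a per-graph weight matrix projects them, and the per-graph results are summed before a ReLU. The summation, the normalization, and the two example graphs (`adj_near`, `adj_flow`) are illustrative assumptions, not the authors' definition.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_normalize(adj):
    """Scale each row to sum to 1 so a station averages over its neighbours."""
    normalized = []
    for row in adj:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


def multi_graph_conv(adjs, feats, weights):
    """Sum norm(A_g) @ X @ W_g over all graphs g, then apply ReLU.

    Summing the graphs is an assumption made for this sketch.
    """
    n, d_out = len(feats), len(weights[0][0])
    out = [[0.0] * d_out for _ in range(n)]
    for adj, w in zip(adjs, weights):
        h = matmul(matmul(row_normalize(adj), feats), w)
        for i in range(n):
            for j in range(d_out):
                out[i][j] += h[i][j]
    return [[max(0.0, v) for v in row] for row in out]


# Three stations, one feature each (e.g. the current outflow).
feats = [[1.0], [2.0], [3.0]]
# Hypothetical graphs: geographic proximity and historical flow similarity.
adj_near = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
adj_flow = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
# One 1x1 weight matrix per graph; in a real model these would be learned.
weights = [[[1.0]], [[1.0]]]

out = multi_graph_conv([adj_near, adj_flow], feats, weights)
```

Each station's output mixes information from its neighbours in every graph, which is how several views of "closeness" can inform a single prediction.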

Translating with Bilingual Topic Knowledge for Neural Machine Translation

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33017257 ◽

2019 ◽

Vol 33 ◽

pp. 7257-7264

Author(s):

Xiangpeng Wei ◽

Yue Hu ◽

Luxi Xing ◽

Yipeng Wang ◽

Li Gao

Keyword(s):

Machine Translation ◽

State Of The Art ◽

Attention Mechanism ◽

Target Domain ◽

Neural Machine Translation ◽

Topic Knowledge ◽

Proposed Model ◽

Decoder Architecture ◽

Source Sentence ◽

Hidden States

The dominant neural machine translation (NMT) models that are based on the encoder-decoder architecture have recently achieved state-of-the-art performance. Traditionally, NMT models depend only on the representations learned during training for mapping a source sentence into the target domain. However, the learned representations often suffer from implicit and inadequately informed properties. In this paper, we propose a novel bilingual topic enhanced NMT (BLT-NMT) model to improve translation performance by incorporating bilingual topic knowledge into NMT. Specifically, the bilingual topic knowledge is incorporated into the hidden states of both encoder and decoder, as well as the attention mechanism. With this new setting, the proposed BLT-NMT has access to the background knowledge implied in bilingual topics, which is beyond the sequential context, and enables the attention mechanism to attend to topic-level attentions for generating accurate target words during translation. Experimental results show that the proposed model consistently outperforms the traditional RNNsearch and the previous topic-informed NMT on Chinese-English and English-German translation tasks. We also introduce the bilingual topic knowledge into the newly emerged Transformer base model on English-German translation and achieve a notable improvement.

Multi-Turn Chatbot Based on Query-Context Attentions and Dual Wasserstein Generative Adversarial Networks

Applied Sciences ◽

10.3390/app9183908 ◽

2019 ◽

Vol 9 (18) ◽

pp. 3908 ◽

Cited By ~ 3

Author(s):

Jintae Kim ◽

Shinhyeok Oh ◽

Oh-Woog Kwon ◽

Harksoo Kim

Keyword(s):

Performance Measures ◽

State Of The Art ◽

Attention Mechanism ◽

Generative Adversarial Networks ◽

Training Method ◽

Adversarial Networks ◽

Proposed Model ◽

Previous State ◽

Vector Representations

To generate proper responses to user queries, multi-turn chatbot models should selectively consider dialogue histories. However, previous chatbot models have simply concatenated or averaged vector representations of all previous utterances without considering contextual importance. To mitigate this problem, we propose a multi-turn chatbot model in which previous utterances participate in response generation using different weights. The proposed model calculates the contextual importance of previous utterances by using an attention mechanism. In addition, we propose a training method that uses two types of Wasserstein generative adversarial networks to improve the quality of responses. In experiments with the DailyDialog dataset, the proposed model outperformed the previous state-of-the-art models based on various performance measures.
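
To illustrate the contrast the abstract draws between averaging previous utterances and weighting them by contextual importance, here is a minimal sketch using dot-product attention. The dot-product scoring and the toy two-dimensional vectors are assumptions made for illustration; the abstract does not specify the scoring function the authors use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, utterances):
    """Weight each previous utterance by its dot-product match with the query."""
    scores = [sum(q * u for q, u in zip(query, utt)) for utt in utterances]
    weights = softmax(scores)
    context = [
        sum(w * utt[k] for w, utt in zip(weights, utterances))
        for k in range(len(query))
    ]
    return weights, context


def average(utterances):
    """Baseline: every previous utterance contributes equally."""
    n = len(utterances)
    return [sum(utt[k] for utt in utterances) / n for k in range(len(utterances[0]))]


# Toy vectors: the first utterance is aligned with the query, the second is not.
query = [1.0, 0.0]
history = [[1.0, 0.0], [0.0, 1.0]]

weights, context = attend(query, history)
baseline = average(history)
```

With attention, the utterance that matches the current query dominates the context vector, whereas the averaged baseline treats both utterances identically.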

Relational Graph Neural Network with Hierarchical Attention for Knowledge Graph Completion

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6508 ◽

2020 ◽

Vol 34 (05) ◽

pp. 9612-9619

Author(s):

Zhao Zhang ◽

Fuzhen Zhuang ◽

Hengshu Zhu ◽

Zhiping Shi ◽

Hui Xiong ◽

...

Keyword(s):

Neural Network ◽

Missing Values ◽

State Of The Art ◽

Attention Mechanism ◽

Knowledge Graph ◽

Incomplete Knowledge ◽

Neighborhood Information ◽

Local Neighborhood ◽

Proposed Model ◽

Knowledge Graphs

The rapid proliferation of knowledge graphs (KGs) has changed the paradigm for various AI-related applications. Despite their large sizes, modern KGs are far from complete and comprehensive. This has motivated research in knowledge graph completion (KGC), which aims to infer missing values in incomplete knowledge triples. However, most existing KGC models treat the triples in KGs independently without leveraging the inherent and valuable information from the local neighborhood surrounding an entity. To this end, we propose a Relational Graph neural network with Hierarchical ATtention (RGHAT) for the KGC task. The proposed model is equipped with a two-level attention mechanism: (i) the first level is the relation-level attention, which is inspired by the intuition that different relations have different weights for indicating an entity; (ii) the second level is the entity-level attention, which enables our model to highlight the importance of different neighboring entities under the same relation. The hierarchical attention mechanism lets our model use the neighborhood information of an entity more effectively. Finally, we extensively validate the superiority of RGHAT against various state-of-the-art baselines.
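
The two-level idea can be sketched as follows: a first softmax distributes weight over the relations around an entity, a second softmax distributes weight over the neighbors under each relation, and a neighbor's final weight is the product of the two. The dot-product scores and all vectors and relation names below are hypothetical stand-ins; the abstract does not give RGHAT's actual scoring functions, which are learned.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def hierarchical_attention(entity, rel_vecs, neighbors):
    """Return a final weight per neighbor, grouped by relation.

    neighbors maps a relation name to the list of neighbor entity vectors
    reached through that relation. Dot-product scoring is an assumption.
    """
    relations = list(neighbors)
    # Level 1: how informative is each relation for this entity?
    alpha = softmax([dot(entity, rel_vecs[r]) for r in relations])
    weights = {}
    for a, r in zip(alpha, relations):
        # Level 2: which neighbors matter most under this relation?
        beta = softmax([dot(entity, n) for n in neighbors[r]])
        weights[r] = [a * b for b in beta]
    return weights


entity = [1.0, 0.0]
rel_vecs = {"born_in": [1.0, 0.0], "works_at": [0.0, 1.0]}
neighbors = {
    "born_in": [[1.0, 0.0]],
    "works_at": [[0.5, 0.5], [0.0, 1.0]],
}

weights = hierarchical_attention(entity, rel_vecs, neighbors)
```

Because each level is a proper distribution, the final weights over all neighbors still sum to one, so they can be used directly to aggregate neighbor embeddings.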

Visual-Semantic Graph Reasoning for Pedestrian Attribute Recognition

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018634 ◽

2019 ◽

Vol 33 ◽

pp. 8634-8641 ◽

Cited By ~ 4

Author(s):

Qiaozhe Li ◽

Xin Zhao ◽

Ran He ◽

Kaiqi Huang

Keyword(s):

Large Scale ◽

State Of The Art ◽

Relational Learning ◽

Spatial Relations ◽

Semantic Relations ◽

Prediction Problem ◽

Convolutional Network ◽

Semantic Graph ◽

Spatial Graph ◽

Attribute Recognition

Pedestrian attribute recognition in surveillance is a challenging task due to poor image quality, significant appearance variations, and the diverse spatial distribution of different attributes. This paper treats pedestrian attribute recognition as a sequential attribute prediction problem and proposes a novel visual-semantic graph reasoning framework to address this problem. Our framework contains a spatial graph and a directed semantic graph. By performing reasoning using the Graph Convolutional Network (GCN), one graph captures spatial relations between regions and the other learns potential semantic relations between attributes. An end-to-end architecture is presented to perform mutual embedding between these two graphs so that each guides the relational learning of the other. We verify the proposed framework on three large-scale pedestrian attribute datasets: PETA, RAP, and PA100k. Experiments show the superiority of the proposed method over state-of-the-art methods and the effectiveness of our joint GCN structures for sequential attribute prediction.

Commonsense Knowledge Aware Conversation Generation with Graph Attention

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/643 ◽

2018 ◽

Cited By ~ 35

Author(s):

Hao Zhou ◽

Tom Young ◽

Minlie Huang ◽

Haizhou Zhao ◽

Jingfang Xu ◽

...

Keyword(s):

Language Processing ◽

Large Scale ◽

Semantic Information ◽

Attention Mechanism ◽

Generation Model ◽

Dynamic Graph ◽

Commonsense Knowledge ◽

Word Generation ◽

Proposed Model ◽

Knowledge Graphs

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.

A Deep Learning Framework to Predict Routability for FPGA Circuit Placement

ACM Transactions on Reconfigurable Technology and Systems ◽

10.1145/3465373 ◽

2021 ◽

Vol 14 (3) ◽

pp. 1-28

Author(s):

Abeer Al-Hyari ◽

Hannah Szentimrey ◽

Ahmed Shamli ◽

Timothy Martin ◽

Gary Gréwal ◽

...

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Parameter Tuning ◽

Learning Framework ◽

Proposed Model ◽

Field Programmable ◽

Circuit Placement ◽

Detailed Placement ◽

Place And Route ◽

Deep Learning Model

The ability to accurately and efficiently estimate the routability of a circuit based on its placement is one of the most challenging tasks in the Field Programmable Gate Array (FPGA) flow. In this article, we present a novel deep learning framework based on a Convolutional Neural Network (CNN) model for predicting the routability of a placement. Since the performance of the CNN model depends strongly on the hyper-parameters selected for the model, we perform exhaustive parameter tuning that significantly improves the model’s performance while avoiding overfitting. We also incorporate the deep learning model into a state-of-the-art placement tool and show how the model can be used to (1) avoid costly, but futile, place-and-route iterations, and (2) improve the placer’s ability to produce routable placements for hard-to-route circuits using feedback based on routability estimates generated by the proposed model. The model is trained and evaluated using over 26K placement images derived from 372 benchmarks supplied by Xilinx Inc. We also explore several opportunities to further improve the reliability of the predictions made by the proposed DLRoute technique by splitting the model into two separate deep learning models for (a) global and (b) detailed placement during the optimization process. Experimental results show that the proposed framework achieves a routability prediction accuracy of 97% while exhibiting runtimes of only a few milliseconds.

Sequence to sequence learning with attention mechanism for short-term passenger flow prediction in large-scale metro system

Transportation Research Part C Emerging Technologies ◽

10.1016/j.trc.2019.08.005 ◽

2019 ◽

Vol 107 ◽

pp. 287-300 ◽

Cited By ~ 17

Author(s):

Siyu Hao ◽

Der-Horng Lee ◽

De Zhao

Keyword(s):

Sequence Learning ◽

Large Scale ◽

Attention Mechanism ◽

Short Term ◽

Passenger Flow ◽

Flow Prediction ◽

Metro System

Biomedical document triage using a hierarchical attention-based capsule network

BMC Bioinformatics ◽

10.1186/s12859-020-03673-5 ◽

2020 ◽

Vol 21 (S13) ◽

Author(s):

Jian Wang ◽

Mengying Li ◽

Qishuai Diao ◽

Hongfei Lin ◽

Zhihao Yang ◽

...

Keyword(s):

Neural Networks ◽

Information Extraction ◽

Precision Medicine ◽

State Of The Art ◽

Attention Mechanism ◽

Feature Representation ◽

Experimental Results ◽

Biomedical Domain ◽

Proposed Model ◽

Document Triage

Background: Biomedical document triage is the foundation of biomedical information extraction, which is important to precision medicine. Recently, some neural networks-based methods have been proposed to classify biomedical documents automatically. In the biomedical domain, documents are often very long and often contain very complicated sentences. However, the current methods still find it difficult to capture important features across sentences. Results: In this paper, we propose a hierarchical attention-based capsule model for biomedical document triage. The proposed model effectively employs hierarchical attention mechanism and capsule networks to capture valuable features across sentences and construct a final latent feature representation for a document. We evaluated our model on three public corpora. Conclusions: Experimental results showed that both hierarchical attention mechanism and capsule networks are helpful in biomedical document triage task. Our method proved itself highly competitive or superior compared with other state-of-the-art methods.

Multi-Task Deep Learning Model with an Attention Mechanism for Ship Accident Sentence Prediction

Applied Sciences ◽

10.3390/app12010233 ◽

2021 ◽

Vol 12 (1) ◽

pp. 233

Author(s):

Ho-Min Park ◽

Jae-Hoon Kim

Keyword(s):

Deep Learning ◽

Learning Model ◽

Attention Mechanism ◽

Maritime Safety ◽

Proposed Model ◽

Deep Learning Model

The number of ship accidents occurring in Korean waters has been steadily increasing year by year. The Korea Maritime Safety Tribunal (KMST) has published verdicts to ensure that the relevant personnel can share judgments on these accidents. As of 2020, there have been 3156 ship accidents; thus, it is difficult for the relevant personnel to study these various accidents by only reading the verdicts. Therefore, in this study, we propose a multi-task deep learning model with an attention mechanism for predicting the sentencing of ship accidents. The tasks are accident types, applied articles, and the sentencing of ship accidents. The proposed model was tested on verdicts published by the KMST between 2010 and 2019. Through experiments, we show that the proposed model can improve the performance of sentence prediction and can help the relevant personnel study these accidents.

Encoder-Decoder Architecture for Ultrasound IMC Segmentation and cIMT Measurement

Sensors ◽

10.3390/s21206839 ◽

2021 ◽

Vol 21 (20) ◽

pp. 6839

Author(s):

Aisha Al-Mohannadi ◽

Somaya Al-Maadeed ◽

Omar Elharrouss ◽

Kishor Kumar Sadasivuni

Keyword(s):

Deep Learning ◽

Early Diagnosis ◽

Large Scale ◽

Intima Media Thickness ◽

Semantic Segmentation ◽

Learning Techniques ◽

Proposed Model ◽

Decoder Architecture ◽

Huge Impact ◽

Good Learning

Cardiovascular diseases (CVDs) have a huge impact on the number of deaths in the world. Thus, common carotid artery (CCA) segmentation and intima-media thickness (IMT) measurements have been widely used to perform early diagnosis of CVDs by analyzing IMT features. Computer vision algorithms are not widely applied to CCA images for this type of diagnosis, due to the complexity of the task and the lack of suitable datasets. The advancement of deep learning techniques has made accurate early diagnosis from images possible. In this paper, a deep-learning-based approach is proposed to apply semantic segmentation for the intima-media complex (IMC) and to calculate the cIMT measurement. In order to overcome the lack of large-scale datasets, an encoder-decoder-based model is proposed using multi-image inputs that can help achieve good learning for the model using different features. The obtained results were evaluated using different image segmentation metrics, which demonstrate the effectiveness of the proposed architecture. In addition, the IMT is computed, and the experiment showed that the proposed model is robust and fully automated compared to the state-of-the-art work.
