Passenger Mobility Prediction via Representation Learning for Dynamic Directed and Weighted Graphs

In recent years, ride-hailing services have become increasingly prevalent, as they provide great convenience for passengers. As a fundamental problem, the timely prediction of passenger demand in different regions is vital for effective traffic flow control and route planning. Because both spatial and temporal patterns are indispensable for passenger demand prediction, relevant research has evolved from pure time series to graph-structured data for modeling historical passenger demand, where a snapshot graph is constructed for each time slot by connecting region nodes via different relational edges (origin-destination relationship, geographical distance, etc.). Consequently, the spatiotemporal passenger demand records naturally carry dynamic patterns in the constructed graphs, where the edges also encode important information about the direction and volume (i.e., weight) of passenger demand between two connected regions. Representation learning for such dynamic, directed, and weighted (DDW) graphs is therefore the key to solving the prediction problem. However, existing graph-based solutions fail to simultaneously consider those three crucial aspects of DDW graphs, leading to limited expressiveness when learning graph representations for passenger demand prediction. Therefore, we propose a novel spatiotemporal graph attention network, namely Gallat (Graph prediction with all attention), as a solution. In Gallat, by comprehensively incorporating those three intrinsic properties of DDW graphs, we build three attention layers to fully capture the spatiotemporal dependencies among different regions across all historical time slots. Moreover, the model employs a subtask to conduct pretraining so that it can obtain accurate results more quickly. We evaluate the proposed model on real-world datasets, and our experimental results demonstrate that Gallat outperforms the state-of-the-art approaches.
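The snapshot-graph construction described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the record layout `(timestamp, origin, destination)`, the function name `build_snapshots`, and the fixed `slot_length` are assumptions made for illustration. Each time slot gets its own directed graph, and an edge weight counts the trips from one region to another in that slot.

```python
from collections import defaultdict

def build_snapshots(trips, slot_length):
    """Group trips into per-slot directed, weighted edge maps.

    trips: iterable of (timestamp, origin, destination) tuples (assumed layout).
    Returns {slot_index: {(origin, destination): trip_count}}.
    """
    snapshots = defaultdict(lambda: defaultdict(int))
    for timestamp, origin, destination in trips:
        slot = timestamp // slot_length  # index of the time slot
        snapshots[slot][(origin, destination)] += 1  # direction and weight
    return {slot: dict(edges) for slot, edges in snapshots.items()}

# Hypothetical trip records: two A->B trips and one B->A trip in slot 0.
trips = [(0, "A", "B"), (5, "A", "B"), (7, "B", "A"), (12, "A", "C")]
snapshots = build_snapshots(trips, slot_length=10)
```

Because the edge key is an ordered pair, the A->B and B->A demands stay separate, which is the "directed" part; the counts are the "weighted" part; and the sequence of slots is the "dynamic" part.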

STG2Seq: Spatial-Temporal Graph to Sequence Model for Multi-step Passenger Demand Forecasting

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/274 ◽

2019 ◽

Cited By ~ 5

Author(s):

Lei Bai ◽

Lina Yao ◽

Salil S. Kanhere ◽

Xianzhi Wang ◽

Quan Z. Sheng

Keyword(s):

State Of The Art ◽

Demand Forecasting ◽

Short Term ◽

Demand Prediction ◽

On Demand ◽

Passenger Demand ◽

Temporal Correlations ◽

Output Module ◽

Real World Datasets

Multi-step passenger demand forecasting is a crucial task in on-demand vehicle sharing services. However, predicting passenger demand is generally challenging due to the nonlinear and dynamic spatial-temporal dependencies. In this work, we propose to model multi-step citywide passenger demand prediction based on a graph and use a hierarchical graph convolutional structure to capture both spatial and temporal correlations simultaneously. Our model consists of three parts: 1) a long-term encoder to encode historical passenger demands; 2) a short-term encoder to derive the next-step prediction for generating multi-step prediction; 3) an attention-based output module to model the dynamic temporal and channel-wise information. Experiments on three real-world datasets show that our model consistently outperforms many baseline methods and state-of-the-art models.

Auto-weighted concept factorization for joint feature map and data representation learning

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200298 ◽

2021 ◽

pp. 1-13

Author(s):

Yikai Zhang ◽

Yong Peng ◽

Hongyu Bian ◽

Yuan Ge ◽

Feiwei Qin ◽

...

Keyword(s):

Objective Function ◽

Optimization Procedure ◽

Feature Space ◽

Representation Learning ◽

Data Representation ◽

Data Sets ◽

Reconstruction Process ◽

Factorization Model ◽

Efficient Data ◽

Concept Factorization

Concept factorization (CF) is an effective matrix factorization model that has been widely used in many applications. In CF, linear combinations of data points serve as the dictionary, so CF can be performed both in the original feature space and in a reproducing kernel Hilbert space (RKHS). Conventional CF treats each dimension of the feature vector equally during the data reconstruction process, which may violate the common-sense expectation that different features have different discriminative abilities and therefore contribute differently to pattern recognition. In this paper, we introduce an auto-weighting variable into the conventional CF objective function to adaptively learn the contributions of different features, and propose a new model termed Auto-Weighted Concept Factorization (AWCF). In AWCF, on the one hand, feature importance can be quantitatively measured by the auto-weighting variable, in which features with better discriminative abilities are assigned larger weights; on the other hand, we can obtain a more efficient data representation that depicts the semantic information of the data. The detailed optimization procedure for the AWCF objective function is derived, and its complexity and convergence are analyzed. Experiments are conducted on both synthetic and representative benchmark data sets, and the clustering results demonstrate the effectiveness of AWCF in comparison with related models.
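As background for the abstract above, here is a minimal sketch of conventional (unweighted) concept factorization, not of AWCF itself. It assumes a nonnegative data matrix X with features as rows and data points as columns, approximates X by X W V^T, and uses the standard multiplicative updates with the linear kernel K = X^T X. The helper names and the toy matrix are hypothetical.

```python
import random

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def reconstruction_error(X, W, V):
    """Squared Frobenius error ||X - X W V^T||^2."""
    R = matmul(matmul(X, W), transpose(V))
    return sum((x - r) ** 2 for xr, rr in zip(X, R) for x, r in zip(xr, rr))

def concept_factorization(X, k, iters=100, seed=0, eps=1e-12):
    """Return W, V (points x k) and the error after each iteration."""
    rng = random.Random(seed)
    n = len(X[0])                     # number of data points (columns of X)
    K = matmul(transpose(X), X)       # linear kernel between data points
    W = [[rng.random() for _ in range(k)] for _ in range(n)]
    V = [[rng.random() for _ in range(k)] for _ in range(n)]
    errors = [reconstruction_error(X, W, V)]
    for _ in range(iters):
        # W <- W * (K V) / (K W V^T V), element-wise
        num = matmul(K, V)
        den = matmul(matmul(K, W), matmul(transpose(V), V))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, ar, br)]
             for wr, ar, br in zip(W, num, den)]
        # V <- V * (K W) / (V W^T K W), element-wise
        KW = matmul(K, W)
        den = matmul(V, matmul(transpose(W), KW))
        V = [[v * a / (b + eps) for v, a, b in zip(vr, ar, br)]
             for vr, ar, br in zip(V, KW, den)]
        errors.append(reconstruction_error(X, W, V))
    return W, V, errors

# Toy data: 3 features x 4 points, with two visible groups of points.
X = [[1.0, 0.9, 0.0, 0.1],
     [0.8, 1.0, 0.1, 0.0],
     [0.0, 0.1, 1.0, 0.9]]
W, V, errors = concept_factorization(X, k=2)
```

Every feature row contributes equally to `reconstruction_error` here; per the abstract, AWCF replaces this equal treatment with learned feature weights, whose exact form is not given in this listing.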

Relation Representation Learning Via Signed Graph Mutual Information Maximization for Trust Prediction

Symmetry ◽

10.3390/sym13010115 ◽

2021 ◽

Vol 13 (1) ◽

pp. 115

Author(s):

Yongjun Jing ◽

Hao Wang ◽

Kun Shao ◽

Xing Huo

Keyword(s):

Mutual Information ◽

Representation Learning ◽

Signed Graph ◽

Trust Prediction ◽

Trust Networks ◽

Trust Relation ◽

Information Maximization ◽

Real World Datasets ◽

Negative Links ◽

Mutual Information Maximization

Trust prediction is essential to enhancing reliability and reducing the risk from unreliable nodes, especially for online applications in open network environments. A key requirement in trust prediction is to measure the relation between the interacting entities accurately. However, most existing methods for inferring the trust relation between interacting entities rely on modeling the similarity between nodes on a graph, and ignore semantic relations and the influence of negative links (e.g., distrust relations). In this paper, we propose a relation representation learning method based on signed graph mutual information maximization (called SGMIM). In SGMIM, we incorporate a translation model and positive point-wise mutual information to enhance the relation representations, and adopt mutual information maximization to align the entity and relation semantic spaces. Moreover, we develop a sign prediction model for making accurate trust predictions. We conduct link sign prediction in trust networks based on the learned relation representations. Extensive experimental results on four real-world datasets for the trust prediction task show that SGMIM significantly outperforms state-of-the-art baseline methods.
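The positive point-wise mutual information (PPMI) mentioned in the abstract can be sketched generically as follows. This is the textbook definition, PPMI(x, y) = max(0, log(p(x, y) / (p(x) p(y)))), computed from co-occurrence counts; how SGMIM builds its counts and uses the scores is not stated here, so the input format and the toy counts are assumptions.

```python
import math
from collections import defaultdict

def ppmi(pair_counts):
    """Compute PPMI for each (x, y) pair from raw co-occurrence counts."""
    total = sum(pair_counts.values())
    left = defaultdict(float)    # marginal count of x
    right = defaultdict(float)   # marginal count of y
    for (x, y), count in pair_counts.items():
        left[x] += count
        right[y] += count
    scores = {}
    for (x, y), count in pair_counts.items():
        if count == 0:
            scores[(x, y)] = 0.0  # log(0) is undefined; PPMI clips to 0
            continue
        pmi = math.log(count * total / (left[x] * right[y]))
        scores[(x, y)] = max(0.0, pmi)  # keep only positive associations
    return scores

# Hypothetical co-occurrence counts between two node types.
counts = {("a", "x"): 3, ("a", "y"): 1, ("b", "x"): 1, ("b", "y"): 3}
scores = ppmi(counts)
```

Pairs that co-occur more often than chance (such as `("a", "x")`) get a positive score, while pairs that co-occur less often than chance are clipped to zero.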

Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks for Fake News Detection

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3451215 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1-23

Author(s):

Shengsheng Qian ◽

Jun Hu ◽

Quan Fang ◽

Changsheng Xu

Keyword(s):

Social Media ◽

Visual Information ◽

Representation Learning ◽

Fake News ◽

Unified Framework ◽

Model Learning ◽

Convolutional Network ◽

Textual Information ◽

Convolutional Networks ◽

Real World Datasets

In this article, we focus on the fake news detection task and aim to automatically identify fake news among the vast number of social media posts. To date, many approaches have been proposed to detect fake news, including traditional learning methods and deep learning-based models. However, three challenges remain: (i) how to represent social media posts effectively, since post content is varied and highly complicated; (ii) how to design a data-driven method that increases the flexibility of the model to deal with samples in different contexts and news backgrounds; and (iii) how to fully utilize the additional auxiliary information (the background knowledge and multi-modal information) of posts for better representation learning. To tackle these challenges, we propose novel Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks (KMAGCN) that capture semantic representations by jointly modeling textual information, knowledge concepts, and visual information in a unified framework for fake news detection. We model posts as graphs and use a knowledge-aware multi-modal adaptive graph learning principle for effective feature learning. Compared with existing methods, the proposed KMAGCN addresses the challenges from three aspects: (1) it models posts as graphs to capture non-consecutive and long-range semantic relations; (2) it proposes a novel adaptive graph convolutional network to handle the variability of graph data; and (3) it leverages textual information, knowledge concepts, and visual information jointly for model learning. We have conducted extensive experiments on three public real-world datasets, and the superior results demonstrate the effectiveness of KMAGCN compared with other state-of-the-art algorithms.

Demand prediction model for regional railway services considering spatial effects between stations

Libro de Actas CIT2016. XII Congreso de Ingeniería del Transporte ◽

10.4995/cit2016.2016.4053 ◽

2016 ◽

Author(s):

Rubén Cordera Piñera ◽

Roberto Sañudo ◽

Luigi Dell'Olio ◽

Ángel Ibeas

Keyword(s):

Environmental Sustainability ◽

Gravity Models ◽

Spatial Effects ◽

The European Union ◽

Distribution Models ◽

Railway Stations ◽

Ticket Sales ◽

Demand Prediction ◽

Passenger Demand ◽

Railway Infrastructure

The railways are a priority transport mode for the European Union given their safety record and environmental sustainability. It is therefore important to have quantitative models available that allow passenger demand for rail travel to be simulated for planning purposes and for evaluating different policies. The aim of this article is to specify and estimate trip distribution models between railway stations by considering the most influential demand variables. Two types of models were estimated: Poisson regression and gravity. The input data were the ticket sales on a regional line in Cantabria (Spain), which were provided by the Spanish railway infrastructure administrator (ADIF – RAM). The models also considered the possible existence of spatial effects between train stations. The results show that the models fit the available data well, especially the gravity models constrained by origins and destinations. Furthermore, the gravity models that considered the existence of spatial effects between stations had a significantly better fit than the Poisson models and the gravity models that did not consider this phenomenon. The proposed models have therefore been shown to be good support tools for decision making in the field of railway planning. DOI: http://dx.doi.org/10.4995/CIT2016.2016.4053
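A gravity model constrained by origins and destinations, as named in the abstract, can be sketched with iterative balancing factors. This is a generic textbook formulation, not the specification estimated in the article: the exponential deterrence function exp(-beta * cost), the value of `beta`, and the toy station data are all assumptions, and the spatial-effect terms studied by the authors are omitted.

```python
import math

def doubly_constrained_gravity(origins, destinations, cost, beta=0.1, iters=100):
    """Estimate trips T[i][j] = a_i * O_i * b_j * D_j * exp(-beta * c_ij).

    origins[i] is the total departing from station i, destinations[j] the total
    arriving at station j; both lists must have the same sum.
    """
    n, m = len(origins), len(destinations)
    f = [[math.exp(-beta * cost[i][j]) for j in range(m)] for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(iters):
        # Balance rows so each origin total is matched, then columns.
        a = [1.0 / sum(b[j] * destinations[j] * f[i][j] for j in range(m))
             for i in range(n)]
        b = [1.0 / sum(a[i] * origins[i] * f[i][j] for i in range(n))
             for j in range(m)]
    return [[a[i] * origins[i] * b[j] * destinations[j] * f[i][j]
             for j in range(m)] for i in range(n)]

# Hypothetical two-station example with travel costs between stations.
origins = [100.0, 200.0]
destinations = [120.0, 180.0]
cost = [[1.0, 4.0], [3.0, 2.0]]
trips = doubly_constrained_gravity(origins, destinations, cost)
row_sums = [sum(row) for row in trips]
col_sums = [sum(col) for col in zip(*trips)]
```

After balancing, `row_sums` reproduce the origin totals and `col_sums` the destination totals, which is what "constrained by origins and destinations" means in practice.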

Exact and Approximate Algorithms for Computing Betweenness Centrality in Directed Graphs

Fundamenta Informaticae ◽

10.3233/fi-2021-2071 ◽

2021 ◽

Vol 182 (3) ◽

pp. 219-242

Author(s):

Mostafa Haghir Chehreghani ◽

Albert Bifet ◽

Talel Abdessalem

Keyword(s):

Betweenness Centrality ◽

Directed Graphs ◽

Randomized Algorithm ◽

Exact Algorithm ◽

Directed Network ◽

Weighted Graphs ◽

Single Vertex ◽

Small Constant ◽

Positive Weights ◽

Real World Datasets

Graphs (networks) are an important tool to model data in different domains. Real-world graphs are usually directed, where the edges have a direction and are not symmetric. Betweenness centrality is an important index widely used to analyze networks. In this paper, given a directed network G and a vertex r ∈ V(G), we first propose an exact algorithm to compute the betweenness score of r. Our algorithm pre-computes a set ℛ𝒱(r), which is used to prune a huge amount of computation that does not contribute to the betweenness score of r. The time complexity of our algorithm depends on |ℛ𝒱(r)|: it is Θ(|ℛ𝒱(r)| · |E(G)|) for unweighted graphs and Θ(|ℛ𝒱(r)| · |E(G)| + |ℛ𝒱(r)| · |V(G)| log |V(G)|) for weighted graphs with positive weights. |ℛ𝒱(r)| is bounded from above by |V(G)| – 1, and in most cases it is a small constant. Then, for the cases where ℛ𝒱(r) is large, we present a simple randomized algorithm that samples from ℛ𝒱(r) and performs computations for only the sampled elements. We show that this algorithm provides an (ɛ, δ)-approximation to the betweenness score of r. Finally, we perform extensive experiments over several real-world datasets from different domains, for several randomly chosen vertices as well as for the vertices with the highest betweenness scores. Our experiments reveal that for estimating the betweenness score of a single vertex, our algorithm significantly outperforms the most efficient existing randomized algorithms in terms of both running time and accuracy. Our experiments also reveal that our algorithm improves on existing algorithms when one is interested in computing betweenness values of the vertices in a set whose cardinality is very small.
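For orientation, the quantity being computed can be sketched with a baseline that is not the paper's algorithm: a Brandes-style breadth-first search and dependency accumulation from every source, reporting only the score of one vertex r in a directed, unweighted graph. The pruning set ℛ𝒱(r) from the abstract is not reproduced; the sketch merely skips sources from which r is unreachable, since such sources cannot contribute. The function name and the adjacency-dict input format are assumptions.

```python
from collections import deque

def betweenness_of(graph, r):
    """Exact betweenness of vertex r; graph maps node -> list of successors."""
    nodes = set(graph) | {v for succ in graph.values() for v in succ}
    score = 0.0
    for s in nodes:
        if s == r:
            continue
        dist = {s: 0}      # BFS distance from s
        sigma = {s: 1}     # number of shortest s->v paths
        preds = {s: []}    # predecessors on shortest paths
        order = []         # vertices in non-decreasing distance
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph.get(u, []):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    sigma[v] = 0
                    preds[v] = []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        if r not in dist:
            continue  # r unreachable from s: no contribution
        # Accumulate dependencies from the farthest vertices back to s.
        delta = {v: 0.0 for v in order}
        for v in reversed(order):
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1.0 + delta[v])
        score += delta[r]
    return score

# Hypothetical diamond graph: two shortest a->d paths, one through b.
diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
score_b = betweenness_of(diamond, "b")
```

In the diamond graph, vertex b lies on one of the two shortest paths from a to d, so it collects half of that pair's dependency.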

Continual representation learning for evolving biomedical bipartite networks

Bioinformatics ◽

10.1093/bioinformatics/btab067 ◽

2021 ◽

Author(s):

Kishlay Jha ◽

Guangxu Xun ◽

Aidong Zhang

Keyword(s):

Network Structure ◽

Learning Strategy ◽

Structure Learning ◽

Fundamental Problem ◽

Representation Learning ◽

Research Area ◽

Bipartite Network ◽

Bipartite Networks ◽

Straightforward Application ◽

Low Dimensional

Motivation: Many real-world biomedical interactions such as ‘gene-disease’, ‘disease-symptom’ and ‘drug-target’ are modeled as a bipartite network structure. Learning meaningful representations for such networks is a fundamental problem in the research area of Network Representation Learning (NRL). NRL approaches aim to translate the network structure into low-dimensional vector representations that are useful to a variety of biomedical applications. Despite significant advances, the existing approaches still have certain limitations. First, a majority of these approaches do not model the unique topological properties of bipartite networks. Consequently, their straightforward application to bipartite graphs yields unsatisfactory results. Second, the existing approaches typically learn representations from static networks. This is limiting for biomedical bipartite networks, which evolve at a rapid pace and thus necessitate the development of approaches that can update the representations in an online fashion. Results: In this research, we propose a novel representation learning approach that accurately preserves the intricate bipartite structure and efficiently updates the node representations. Specifically, we design a customized autoencoder that captures the proximity relationship between nodes participating in the bipartite bicliques (2 × 2 sub-graph), while preserving both the global and local structures. Moreover, the proposed structure-preserving technique is carefully interleaved with the central tenets of continual machine learning to design an incremental learning strategy that updates the node representations in an online manner. Taken together, the proposed approach produces meaningful representations with high fidelity and computational efficiency. Extensive experiments conducted on several biomedical bipartite networks validate the effectiveness and rationality of the proposed approach.

On the Training/Test Distributions Gap: A Data Representation Learning Framework

Dataset Shift in Machine Learning ◽

10.7551/mitpress/9780262170055.003.0005 ◽

2008 ◽

pp. 73-84

Author(s):

Shai Ben-David

Keyword(s):

Representation Learning ◽

Data Representation ◽

Learning Framework

Exponential Family Graph Embeddings

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5737 ◽

2020 ◽

Vol 34 (04) ◽

pp. 3357-3364

Author(s):

Abdulkadir Celikkanat ◽

Fragkiskos D. Malliaros

Keyword(s):

Random Walk ◽

Exponential Family ◽

Representation Learning ◽

Learning Problems ◽

Interaction Patterns ◽

Network Representation ◽

Learning Tasks ◽

Learning Techniques ◽

Real World Datasets ◽

Low Dimensional

Representing networks in a low-dimensional latent space is a crucial task with many interesting applications in graph learning problems, such as link prediction and node classification. A widely applied network representation learning paradigm combines random walks for sampling context nodes with the traditional Skip-Gram model to capture center-context node relationships. In this paper, we focus on exponential family distributions to capture rich interaction patterns between nodes in random walk sequences. We introduce the generic exponential family graph embedding model, which generalizes random walk-based network representation learning techniques to exponential family conditional distributions. We study three particular instances of this model, analyzing their properties and showing their relationship to existing unsupervised learning models. Our experimental evaluation on real-world datasets demonstrates that the proposed techniques outperform well-known baseline methods in two downstream machine learning tasks.
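The random-walk and Skip-Gram paradigm that this abstract builds on can be sketched up to the point where training pairs are produced. This is a generic illustration, not the authors' model: it uses uniform random walks and a symmetric window, and the function names, the `window` parameter, and the toy graph are assumptions. The exponential family conditional distributions proposed in the paper would be fitted to pairs like these and are not shown.

```python
import random

def random_walk(graph, start, length, rng):
    """Uniform random walk of at most `length` nodes; stops at dead ends."""
    walk = [start]
    while len(walk) < length:
        neighbors = graph.get(walk[-1], [])
        if not neighbors:
            break
        walk.append(rng.choice(neighbors))
    return walk

def context_pairs(walk, window):
    """Return (center, context) pairs for nodes within `window` steps."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs

# Hypothetical undirected triangle with a pendant node, as adjacency lists.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
rng = random.Random(0)  # fixed seed so the walk is repeatable
walk = random_walk(graph, "a", length=6, rng=rng)
pairs = context_pairs(walk, window=2)
```

A Skip-Gram style model would then learn node vectors so that each center node predicts its sampled context nodes.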
