On spectral embedding performance and elucidating network structure in stochastic blockmodel graphs

Joshua Cape; Minh Tang; Carey E. Priebe

doi:10.1017/nws.2019.23

Network Science ◽

10.1017/nws.2019.23 ◽

2019 ◽

Vol 7 (3) ◽

pp. 269-291 ◽

Cited By ~ 2

Author(s):

Joshua Cape ◽

Minh Tang ◽

Carey E. Priebe

Keyword(s):

Network Structure ◽

Graph Laplacian ◽

Comprehensive Treatment ◽

Sparse Graphs ◽

Graph Representations ◽

Spectral Embedding ◽

Stochastic Blockmodel ◽

Embedding Performance ◽

Low Dimensional ◽

Valued Graph

Abstract: Statistical inference on graphs often proceeds via spectral methods involving low-dimensional embeddings of matrix-valued graph representations such as the graph Laplacian or adjacency matrix. In this paper, we analyze the asymptotic information-theoretic relative performance of Laplacian spectral embedding and adjacency spectral embedding for block assignment recovery in stochastic blockmodel graphs by way of Chernoff information. We investigate the relationship between spectral embedding performance and underlying network structure (e.g., homogeneity, affinity, core-periphery, and (un)balancedness) via a comprehensive treatment of the two-block stochastic blockmodel and the class of K-blockmodels exhibiting homogeneous balanced affinity structure. Our findings support the claim that, for a particular notion of sparsity, loosely speaking, “Laplacian spectral embedding favors relatively sparse graphs, whereas adjacency spectral embedding favors not-too-sparse graphs.” We also provide evidence in support of the claim that “adjacency spectral embedding favors core-periphery network structure.”
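To make the two embeddings named in this abstract concrete, the following is a minimal sketch rather than the authors' code. It assumes the common convention in which adjacency spectral embedding scales the top-d eigenvectors (by eigenvalue magnitude) of the adjacency matrix A by the square roots of the absolute eigenvalues, and Laplacian spectral embedding does the same for the normalized matrix D^{-1/2} A D^{-1/2}. The block probabilities, graph size, and all function names (`sample_sbm`, `spectral_embed`, `ase`, `lse`, `sign_accuracy`) are illustrative assumptions, and the sign-based clustering is a shortcut that only suits the two-block affinity case.

```python
import numpy as np


def sample_sbm(n_per_block, B, rng):
    """Sample a symmetric, hollow adjacency matrix from a stochastic blockmodel."""
    B = np.asarray(B, dtype=float)
    labels = np.repeat(np.arange(B.shape[0]), n_per_block)
    P = B[labels][:, labels]                       # edge probability for each node pair
    upper = np.triu(rng.random(P.shape) < P, k=1)  # independent edges above the diagonal
    A = (upper | upper.T).astype(float)            # symmetrize; diagonal stays zero
    return A, labels


def spectral_embed(M, d):
    """Top-d eigenvectors of symmetric M (by |eigenvalue|), scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def ase(A, d):
    """Adjacency spectral embedding (assumed convention)."""
    return spectral_embed(A, d)


def lse(A, d):
    """Laplacian spectral embedding using D^{-1/2} A D^{-1/2} (assumed convention)."""
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5       # isolated nodes keep a zero scale
    L = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    return spectral_embed(L, d)


def sign_accuracy(X, labels):
    """Two-block recovery from the sign of the second coordinate, up to label swap."""
    guess = (X[:, 1] > 0).astype(int)
    acc = float(np.mean(guess == labels))
    return max(acc, 1.0 - acc)


# Illustrative two-block affinity model; these numbers are not taken from the paper.
rng = np.random.default_rng(0)
A, labels = sample_sbm(100, [[0.5, 0.1], [0.1, 0.5]], rng)
X_ase = ase(A, 2)
X_lse = lse(A, 2)
print("ASE accuracy:", sign_accuracy(X_ase, labels))
print("LSE accuracy:", sign_accuracy(X_lse, labels))
```

In practice the embedded points would be clustered with a method such as Gaussian mixture modeling or k-means; the sign rule is used here only to keep the sketch short.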

Predicting partially observed processes on temporal networks by Dynamics-Aware Node Embeddings (DyANE)

EPJ Data Science ◽

10.1140/epjds/s13688-021-00277-8 ◽

2021 ◽

Vol 10 (1) ◽

Author(s):

Koya Sato ◽

Mizuki Oka ◽

Alain Barrat ◽

Ciro Cattuto

Keyword(s):

Network Structure ◽

Machine Learning Algorithms ◽

Temporal Networks ◽

Dynamical Processes ◽

Network Nodes ◽

Partially Observed ◽

Low Dimensional ◽

Vector Representations ◽

Infectious Disease Dynamics ◽

Random Times

Abstract: Low-dimensional vector representations of network nodes have proven successful to feed graph data to machine learning algorithms and to improve performance across diverse tasks. Most of the embedding techniques, however, have been developed with the goal of achieving dense, low-dimensional encoding of network structure and patterns. Here, we present a node embedding technique aimed at providing low-dimensional feature vectors that are informative of dynamical processes occurring over temporal networks – rather than of the network structure itself – with the goal of enabling prediction tasks related to the evolution and outcome of these processes. We achieve this by using a lossless modified supra-adjacency representation of temporal networks and building on standard embedding techniques for static graphs based on random walks. We show that the resulting embedding vectors are useful for prediction tasks related to paradigmatic dynamical processes, namely epidemic spreading over empirical temporal networks. In particular, we illustrate the performance of our approach for the prediction of nodes’ epidemic states in single instances of a spreading process. We show how framing this task as a supervised multi-label classification task on the embedding vectors allows us to estimate the temporal evolution of the entire system from a partial sampling of nodes at random times, with potential impact for nowcasting infectious disease dynamics.

Continual representation learning for evolving biomedical bipartite networks

Bioinformatics ◽

10.1093/bioinformatics/btab067 ◽

2021 ◽

Author(s):

Kishlay Jha ◽

Guangxu Xun ◽

Aidong Zhang

Keyword(s):

Network Structure ◽

Learning Strategy ◽

Structure Learning ◽

Fundamental Problem ◽

Representation Learning ◽

Research Area ◽

Bipartite Network ◽

Bipartite Networks ◽

Straightforward Application ◽

Low Dimensional

Abstract

Motivation: Many real-world biomedical interactions such as ‘gene-disease’, ‘disease-symptom’ and ‘drug-target’ are modeled as a bipartite network structure. Learning meaningful representations for such networks is a fundamental problem in the research area of Network Representation Learning (NRL). NRL approaches aim to translate the network structure into low-dimensional vector representations that are useful to a variety of biomedical applications. Despite significant advances, the existing approaches still have certain limitations. First, a majority of these approaches do not model the unique topological properties of bipartite networks. Consequently, their straightforward application to the bipartite graphs yields unsatisfactory results. Second, the existing approaches typically learn representations from static networks. This is limiting for the biomedical bipartite networks that evolve at a rapid pace, and thus necessitate the development of approaches that can update the representations in an online fashion.

Results: In this research, we propose a novel representation learning approach that accurately preserves the intricate bipartite structure, and efficiently updates the node representations. Specifically, we design a customized autoencoder that captures the proximity relationship between nodes participating in the bipartite bicliques (2 × 2 sub-graph), while preserving both the global and local structures. Moreover, the proposed structure-preserving technique is carefully interleaved with the central tenets of continual machine learning to design an incremental learning strategy that updates the node representations in an online manner. Taken together, the proposed approach produces meaningful representations with high fidelity and computational efficiency. Extensive experiments conducted on several biomedical bipartite networks validate the effectiveness and rationality of the proposed approach.

A tractable latent variable model for nonlinear dimensionality reduction

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1916012117 ◽

2020 ◽

Vol 117 (27) ◽

pp. 15403-15408

Author(s):

Lawrence K. Saul

Keyword(s):

Latent Variables ◽

Latent Variable ◽

Linear Equations ◽

Three Dimensional ◽

Coarse Graining ◽

Graph Laplacian ◽

Latent Variable Model ◽

Nonlinear Dimensionality Reduction ◽

Variable Model ◽

Low Dimensional

We propose a latent variable model to discover faithful low-dimensional representations of high-dimensional data. The model computes a low-dimensional embedding that aims to preserve neighborhood relationships encoded by a sparse graph. The model both leverages and extends current leading approaches to this problem. Like t-distributed Stochastic Neighborhood Embedding, the model can produce two- and three-dimensional embeddings for visualization, but it can also learn higher-dimensional embeddings for other uses. Like LargeVis and Uniform Manifold Approximation and Projection, the model produces embeddings by balancing two goals—pulling nearby examples closer together and pushing distant examples further apart. Unlike these approaches, however, the latent variables in our model provide additional structure that can be exploited for learning. We derive an Expectation–Maximization procedure with closed-form updates that monotonically improve the model’s likelihood: In this procedure, embeddings are iteratively adapted by solving sparse, diagonally dominant systems of linear equations that arise from a discrete graph Laplacian. For large problems, we also develop an approximate coarse-graining procedure that avoids the need for negative sampling of nonadjacent nodes in the graph. We demonstrate the model’s effectiveness on datasets of images and text.

Robust subspace methods for outlier detection in genomic data circumvents the curse of dimensionality

Royal Society Open Science ◽

10.1098/rsos.190714 ◽

2020 ◽

Vol 7 (2) ◽

pp. 190714 ◽

Cited By ~ 1

Author(s):

Omar Shetta ◽

Mahesan Niranjan

Keyword(s):

Outlier Detection ◽

Genomic Data ◽

Graph Laplacian ◽

Curse Of Dimensionality ◽

Low Rank ◽

Learning Problems ◽

Low Rank Approximation ◽

Inference Problems ◽

Reduction Techniques ◽

Low Dimensional

The application of machine learning to inference problems in biology is dominated by supervised learning problems of regression and classification, and unsupervised learning problems of clustering and variants of low-dimensional projections for visualization. A class of problems that have not gained much attention is detecting outliers in datasets, arising from reasons such as gross experimental, reporting or labelling errors. These could also be small parts of a dataset that are functionally distinct from the majority of a population. Outlier data are often identified by considering the probability density of normal data and comparing data likelihoods against some threshold. This classical approach suffers from the curse of dimensionality, which is a serious problem with omics data which are often found in very high dimensions. We develop an outlier detection method based on structured low-rank approximation methods. The objective function includes a regularizer based on neighbourhood information captured in the graph Laplacian. Results on publicly available genomic data show that our method robustly detects outliers whereas a density-based method fails even at moderate dimensions. Moreover, we show that our method has better clustering and visualization performance on the recovered low-dimensional projection when compared with popular dimensionality reduction techniques.

Network-theoretic approach to sparsified discrete vortex dynamics

Journal of Fluid Mechanics ◽

10.1017/jfm.2015.97 ◽

2015 ◽

Vol 768 ◽

pp. 549-571 ◽

Cited By ~ 27

Author(s):

Aditya G. Nair ◽

Kunihiko Taira

Keyword(s):

Vortex Dynamics ◽

Point Vortex ◽

Spectral Graph Theory ◽

Theoretic Approach ◽

Dynamics Model ◽

Discrete Vortex ◽

Sparse Graphs ◽

Graph Representations ◽

Vortex Interactions ◽

Two Dimensional Flow

We examine discrete vortex dynamics in two-dimensional flow through a network-theoretic approach. The interaction of the vortices is represented with a graph, which allows the use of network-theoretic approaches to identify key vortex-to-vortex interactions. We employ sparsification techniques on these graph representations based on spectral theory to construct sparsified models and evaluate the dynamics of vortices in the sparsified set-up. Identification of vortex structures based on graph sparsification and sparse vortex dynamics is illustrated through an example of point-vortex clusters interacting amongst themselves. We also evaluate the performance of sparsification with increasing number of point vortices. The sparsified-dynamics model developed with spectral graph theory requires a reduced number of vortex-to-vortex interactions but agrees well with the full nonlinear dynamics. Furthermore, the sparsified model derived from the sparse graphs conserves the invariants of discrete vortex dynamics. We highlight the similarities and differences between the present sparsified-dynamics model and reduced-order models.

Topic Modeling on Document Networks with Adjacent-Encoder

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6152 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6737-6745

Author(s):

Ce Zhang ◽

Hady W. Lauw

Keyword(s):

Network Structure ◽

Real World ◽

Topic Modeling ◽

Topic Model ◽

Web Pages ◽

Low Dimensional ◽

Textual Content

Oftentimes documents are linked to one another in a network structure, e.g., academic papers cite other papers and Web pages link to other pages. In this paper we propose a holistic topic model to learn meaningful and unified low-dimensional representations for networked documents that seek to preserve both textual content and network structure. On the basis of reconstructing not only the input document but also its adjacent neighbors, we develop two neural encoder architectures. Adjacent-Encoder, or AdjEnc, induces competition among documents for topic propagation, and reconstruction among neighbors for semantic capture. Adjacent-Encoder-X, or AdjEnc-X, extends this to also encode the network structure in addition to document content. We evaluate our models on real-world document networks quantitatively and qualitatively, outperforming comparable baselines comprehensively.

Bayesian Network Structure Learning with Integer Programming: Polytopes, Facets and Complexity

Journal of Artificial Intelligence Research ◽

10.1613/jair.5203 ◽

2017 ◽

Vol 58 ◽

pp. 185-229 ◽

Cited By ~ 5

Author(s):

James Cussens ◽

Matti Järvisalo ◽

Janne H. Korhonen ◽

Mark Bartlett

Keyword(s):

Integer Programming ◽

Bayesian Network ◽

Graphical Models ◽

Network Structure ◽

Structure Learning ◽

Theoretical Perspective ◽

Separation Problem ◽

Bayesian Network Structure ◽

Bayesian Network Structure Learning ◽

Low Dimensional

The challenging task of learning structures of probabilistic graphical models is an important problem within modern AI research. Recent years have witnessed several major algorithmic advances in structure learning for Bayesian networks - arguably the most central class of graphical models - especially in what is known as the score-based setting. A successful generic approach to optimal Bayesian network structure learning (BNSL), based on integer programming (IP), is implemented in the GOBNILP system. Despite the recent algorithmic advances, current understanding of foundational aspects underlying the IP based approach to BNSL is still somewhat lacking. Understanding fundamental aspects of cutting planes and the related separation problem is important not only from a purely theoretical perspective, but also since it holds out the promise of further improving the efficiency of state-of-the-art approaches to solving BNSL exactly. In this paper, we make several theoretical contributions towards these goals: (i) we study the computational complexity of the separation problem, proving that the problem is NP-hard; (ii) we formalise and analyse the relationship between three key polytopes underlying the IP-based approach to BNSL; (iii) we study the facets of the three polytopes both from the theoretical and practical perspective, providing, via exhaustive computation, a complete enumeration of facets for low-dimensional family-variable polytopes; and, furthermore, (iv) we establish a tight connection of the BNSL problem to the acyclic subgraph problem.

User Profile Preserving Social Network Embedding

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/472 ◽

2017 ◽

Cited By ~ 29

Author(s):

Daokun Zhang ◽

Jie Yin ◽

Xingquan Zhu ◽

Chengqi Zhang

Keyword(s):

Social Networks ◽

Social Network ◽

Network Structure ◽

Dimensional Space ◽

User Profile ◽

Network Embedding ◽

Network Nodes ◽

Profile Information ◽

Low Dimensional ◽

Performance Gains

This paper addresses social network embedding, which aims to embed social network nodes, including user profile information, into a latent low-dimensional space. Most of the existing works on network embedding only consider network structure, but ignore user-generated content that could be potentially helpful in learning a better joint network representation. Different from rich node content in citation networks, user profile information in social networks is useful but noisy, sparse, and incomplete. To properly utilize this information, we propose a new algorithm called User Profile Preserving Social Network Embedding (UPP-SNE), which incorporates user profile with network structure to jointly learn a vector representation of a social network. The theme of UPP-SNE is to embed user profile information via a nonlinear mapping into a consistent subspace, where network structure is seamlessly encoded to jointly learn informative node representations. Extensive experiments on four real-world social networks show that compared to state-of-the-art baselines, our method learns better social network representations and achieves substantial performance gains in node classification and clustering tasks.

Evaluating predictive performance of network biomarkers with network structures

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720014500255 ◽

2014 ◽

Vol 12 (05) ◽

pp. 1450025 ◽

Cited By ~ 3

Author(s):

Shang Gao ◽

Ibrahim Karakira ◽

Salim Afra ◽

Ghada Naji ◽

Reda Alhajj ◽

...

Keyword(s):

Gene Expression ◽

Network Structure ◽

Quantitative Measure ◽

Predictive Performance ◽

Graph Laplacian ◽

Weight Coefficient ◽

Cancer Hallmarks ◽

Breast Cancer Biomarkers ◽

Coefficient Vector ◽

Marker Group

Network is a powerful structure which reveals valuable characteristics of the underlying data. However, previous work on evaluating the predictive performance of network-based biomarkers does not take nodal connectedness into account. We argue that it is necessary to maximize the benefit from the network structure by employing appropriate techniques. To address this, we aim to learn a weight coefficient for each node in the network from the quantitative measure such as gene expression data. The weight coefficients are computed from an optimization problem which minimizes the total weighted difference between nodes in a network structure; this can be expressed in terms of graph Laplacian. After obtaining the coefficient vector for the network markers, we can then compute the corresponding network predictor. We demonstrate the effectiveness of the proposed method by conducting experiments using published breast cancer biomarkers with three patient cohorts. Network markers are first grouped based on GO terms related to cancer hallmarks. We compare the predictive performance of each network marker group across gene expression datasets. We also evaluate the network predictor against the average method for feature aggregation. The reported results show that the predictive performance of network markers is generally not consistent across patient cohorts.
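The abstract's remark that a total weighted difference between nodes "can be expressed in terms of graph Laplacian" rests on a standard identity: for a symmetric weight matrix W with Laplacian L = D - W, the sum over edges of w_ij (x_i - x_j)^2 equals x^T L x. The sketch below checks this on a toy graph. It assumes squared differences, which the abstract does not specify, and the weights, vector, and function names are made up for illustration, not taken from the paper.

```python
import numpy as np


def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def total_weighted_difference(W, x):
    """Sum over unordered node pairs of w_ij * (x_i - x_j)^2."""
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    diffs = x[:, None] - x[None, :]
    return 0.5 * float(np.sum(W * diffs ** 2))  # 0.5 because each pair appears twice


# Toy path graph 0 - 1 - 2 with edge weights 1 and 2 (illustrative values).
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])
x = np.array([1.0, 3.0, 0.0])                   # hypothetical per-node coefficients

pairwise = total_weighted_difference(W, x)      # 1*(1-3)^2 + 2*(3-0)^2 = 22
quadratic = float(x @ graph_laplacian(W) @ x)   # x^T L x
print(pairwise, quadratic)
```

Writing the penalty as a quadratic form is what lets an optimization over the coefficient vector be posed with the Laplacian matrix directly.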

Brain networks, dimensionality, and global signal averaging in resting-state fMRI: Hierarchical network structure results in low-dimensional spatiotemporal dynamics

NeuroImage ◽

10.1016/j.neuroimage.2019.116289 ◽

2020 ◽

Vol 205 ◽

pp. 116289 ◽

Cited By ~ 10

Author(s):

Stephen J. Gotts ◽

Adrian W. Gilmore ◽

Alex Martin

Keyword(s):

Network Structure ◽

Resting State ◽

Brain Networks ◽

Resting State Fmri ◽

Spatiotemporal Dynamics ◽

Signal Averaging ◽

Hierarchical Network ◽

Global Signal ◽

Low Dimensional
