Unfolding the multiscale structure of networks with dynamical Ollivier-Ricci curvature

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Adam Gosztolai ◽  
Alexis Arnaudon

Abstract Describing networks geometrically through low-dimensional latent metric spaces has helped design efficient learning algorithms, unveil network symmetries and study dynamical network processes. However, latent space embeddings are limited to specific classes of networks because incompatible metric spaces generally result in information loss. Here, we study arbitrary networks geometrically by defining a dynamic edge curvature measuring the similarity between pairs of dynamical network processes seeded at nearby nodes. We show that the evolution of the curvature distribution exhibits gaps at characteristic timescales indicating bottleneck-edges that limit information spreading. Importantly, curvature gaps are robust to large fluctuations in node degrees, encoding communities until the phase transition of detectability, where spectral and node-clustering methods fail. Using this insight, we derive geometric modularity to find multiscale communities based on deviations from constant network curvature in generative and real-world networks, significantly outperforming most previous methods. Our work suggests using network geometry for studying and controlling the structure of and information spreading on networks.
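The abstract does not state the formula for the dynamic edge curvature. One way to write such a quantity, assuming the classical Ollivier-Ricci form with the one-step random-walk measures replaced by diffusion measures at time τ, is sketched below. All symbols are assumptions introduced for illustration: W_1 is the 1-Wasserstein (optimal transport) distance, d_{ij} the shortest-path distance between nodes i and j, L a graph Laplacian, and δ_i a unit mass at node i.

```latex
\kappa_{ij}(\tau) \;=\; 1 - \frac{W_1\big(p_i(\tau),\, p_j(\tau)\big)}{d_{ij}},
\qquad
p_i(\tau) \;=\; \delta_i\, e^{-\tau L}
```

Under this reading, two diffusions that overlap quickly give a curvature close to 1, while diffusions that stay apart (as across a bottleneck edge) keep the curvature low for longer, which is consistent with the gaps in the curvature distribution described above.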

2021 ◽  
Author(s):  
Adam Gosztolai ◽  
Alexis Arnaudon

Abstract Defining the geometry of networks is typically associated with embedding in low-dimensional spaces such as manifolds. This approach has helped design efficient learning algorithms, unveil network symmetries and study dynamical network processes. However, the choice of embedding space is network-specific, and incompatible spaces can result in information loss. Here, we define a dynamic edge curvature for the study of arbitrary networks measuring the similarity between pairs of dynamical network processes seeded at nearby nodes. We show that the evolution of the curvature distribution exhibits gaps at characteristic timescales indicating bottleneck-edges that limit information spreading. Importantly, curvature gaps robustly encode communities until the phase transition of detectability, where spectral clustering methods fail. We use this insight to derive geometric modularity optimisation and demonstrate it on the European power grid and the C. elegans homeobox gene regulatory network finding previously unidentified communities on multiple scales. Our work suggests using network geometry for studying and controlling the structure of and information spreading on networks.


2021 ◽  
Author(s):  
Arnaud Mounier ◽  
Laure Raynaud ◽  
Lucie Rottner ◽  
Matthieu Plu

The use of ensemble prediction systems (EPS) is challenging because of the large amount of information they provide. EPS forecasts are often summarised by statistical quantities (i.e., quantile maps). Although such a mathematical representation is efficient for capturing the ensemble distribution, it lacks physical consistency, which raises issues for many applications of EPS in an operational context. In order to provide a physically consistent synthesis of the French convection-permitting AROME-EPS forecasts, we propose to automatically draw a few scenarios that are representative of the different possible outcomes. Each scenario is a reduced set of EPS members. The design of a scenario synthesis can be divided into two parts. A first step aims at extracting relevant features in each EPS member in order to reduce the problem dimensionality. Then, a clustering is done based on these features. The originality of our work is to leverage the capacities of deep learning for feature extraction. For that purpose, we use a convolutional autoencoder (CAE) to learn an optimal low-dimensional representation (also called latent space representation) of the input forecast field. In this work, the algorithm is developed to work on 1 h accumulated rainfall from AROME-EPS, with a focus on convective cases. The CAE is trained on around 150 000 forecasts and its performance is evaluated based on the quality of the reconstructed input fields from the latent space. To examine the reconstruction quality, an object-oriented approach is used. The CAE is also compared with the commonly used principal component analysis (PCA). In a second part, different clustering methods (k-means, HDBSCAN, …) are applied to EPS members in the latent space and evaluated using subjective and objective diagnostics.
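The two-step procedure described above (feature extraction, then clustering) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it uses the PCA baseline in place of the convolutional autoencoder, a plain k-means with a deterministic farthest-point initialisation chosen for reproducibility, and a synthetic 16-member ensemble of 8×8 fields. All function names and data here are hypothetical.

```python
import numpy as np

def pca_features(fields, n_components):
    """Project flattened fields onto their leading principal components."""
    X = fields.reshape(len(fields), -1).astype(float)
    X -= X.mean(axis=0)  # centre each grid point across members
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return X @ vt[:n_components].T

def farthest_point_init(points, k):
    """Deterministic initialisation: start at the first point, then
    repeatedly add the point farthest from all centres chosen so far."""
    centres = [points[0]]
    while len(centres) < k:
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centres], axis=0)
        centres.append(points[d.argmax()])
    return np.array(centres)

def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations; returns (labels, centres)."""
    centres = farthest_point_init(points, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centres = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centres[c]
            for c in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

# Synthetic ensemble: members 0-7 put rain in one corner, members 8-15 in the other.
rng = np.random.default_rng(0)
members = 0.1 * rng.random((16, 8, 8))
members[:8, 0:3, 0:3] += 10.0
members[8:, 5:8, 5:8] += 10.0

features = pca_features(members, n_components=2)  # step 1: reduce dimensionality
labels, _ = kmeans(features, k=2)                 # step 2: group members into scenarios
```

Each resulting label corresponds to one scenario, i.e. a reduced set of ensemble members. Swapping `pca_features` for a trained encoder would leave the clustering step unchanged.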


2021 ◽  
Vol 13 (2) ◽  
pp. 51
Author(s):  
Lili Sun ◽  
Xueyan Liu ◽  
Min Zhao ◽  
Bo Yang

The variational graph autoencoder, which can encode the structural and attribute information of a graph into low-dimensional representations, has become a powerful method for studying graph-structured data. However, most existing methods based on the variational (graph) autoencoder assume that the prior of the latent variables is the standard normal distribution, which encourages all nodes to gather around 0. This prevents the latent space from being fully utilized. Choosing a suitable prior without incorporating additional expert knowledge is therefore a challenge. Given this, we propose a novel noninformative prior-based interpretable variational graph autoencoder (NPIVGAE). Specifically, we exploit the noninformative prior as the prior distribution of latent variables. This prior enables the parameters of the posterior distribution to be learned almost entirely from the sample data. Furthermore, we regard each dimension of a latent variable as the probability that the node belongs to each block, thereby improving the interpretability of the model. The correlation within and between blocks is described by a block–block correlation matrix. We compare our model with state-of-the-art methods on three real datasets, verifying its effectiveness and superiority.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Stefano Recanatesi ◽  
Matthew Farrell ◽  
Guillaume Lajoie ◽  
Sophie Deneve ◽  
Mattia Rigotti ◽  
...  

Abstract Artificial neural networks have recently achieved many successes in solving sequential processing and planning tasks. Their success is often ascribed to the emergence of the task’s low-dimensional latent structure in the network activity – i.e., in the learned neural representations. Here, we investigate the hypothesis that a means for generating representations with easily accessed low-dimensional latent structure, possibly reflecting an underlying semantic organization, is through learning to predict observations about the world. Specifically, we ask whether and when network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables. Using a recurrent neural network model trained to predict a sequence of observations we show that network dynamics exhibit low-dimensional but nonlinearly transformed representations of sensory inputs that map the latent structure of the sensory environment. We quantify these results using nonlinear measures of intrinsic dimensionality and linear decodability of latent variables, and provide mathematical arguments for why such useful predictive representations emerge. We focus throughout on how our results can aid the analysis and interpretation of experimental data.


2021 ◽  
Vol 11 (3) ◽  
pp. 1013
Author(s):  
Zvezdan Lončarević ◽  
Rok Pahič ◽  
Aleš Ude ◽  
Andrej Gams

Autonomous robot learning in unstructured environments often faces the problem that the dimensionality of the search space is too large for practical applications. Dimensionality reduction techniques have been developed to address this problem and describe motor skills in low-dimensional latent spaces. Most of these techniques require the availability of a sufficiently large database of example task executions to compute the latent space. However, the generation of many example task executions on a real robot is tedious, and prone to errors and equipment failures. The main result of this paper is a new approach for efficient database gathering by performing a small number of task executions with a real robot and applying statistical generalization, e.g., Gaussian process regression, to generate more data. We have shown in our experiments that the data generated this way can be used for dimensionality reduction with autoencoder neural networks. The resulting latent spaces can be exploited to implement robot learning more efficiently. The proposed approach has been evaluated on the problem of robotic throwing at a target. Simulation and real-world results with a humanoid robot TALOS are provided. They confirm the effectiveness of generalization-based database acquisition and the efficiency of learning in a low-dimensional latent space.
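The idea of enlarging a small database through statistical generalization can be illustrated with a bare-bones Gaussian process regression. This is only a sketch under stated assumptions: a one-dimensional task (target distance mapped to a single release speed), a zero-mean GP with a squared-exponential kernel, fixed hyperparameters, and made-up data. None of these choices is taken from the paper.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    sq = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / length_scale**2)

def gp_posterior_mean(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean of a zero-mean GP at the query inputs."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_s = rbf_kernel(x_query, x_train)
    return K_s @ np.linalg.solve(K, y_train)

# A handful of (hypothetical) real executions: target distance -> release speed.
distances = np.array([1.0, 2.0, 3.0, 4.0])
speeds = np.array([3.2, 4.5, 5.5, 6.4])

# Generate a denser synthetic database by querying the GP between the real samples.
query = np.linspace(1.0, 4.0, 20)
synthetic_speeds = gp_posterior_mean(distances, speeds, query)
```

The pairs `(query, synthetic_speeds)` would then serve as additional training examples for a dimensionality reduction model such as an autoencoder.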


2021 ◽  
Vol 15 ◽  
pp. 174830262110249
Author(s):  
Cong-Zhe You ◽  
Zhen-Qiu Shu ◽  
Hong-Hui Fan

Subspace clustering of multi-view data has recently become a research hotspot in artificial intelligence and machine learning. The goal is to divide data samples from different sources into different groups. In this paper, we propose a new subspace clustering method for multi-view data, termed Non-negative Sparse Laplacian regularized Latent Multi-view Subspace Clustering (NSL2MSC). The proposed method learns the latent space representation of multi-view data samples and performs data reconstruction in the latent space. The algorithm can cluster data in the latent representation space and exploit the relationships between different views. However, traditional representation-based methods do not consider the non-linear geometry inside the data and may lose local similarity information between data points during learning. By using graph regularization, we capture not only the global low-dimensional structural features of the data but also its nonlinear geometric structure. The experimental results show that the proposed method is effective and performs better than most existing alternatives.


2020 ◽  
Author(s):  
Mohit Goyal ◽  
Guillermo Serrano ◽  
Ilan Shomorony ◽  
Mikel Hernaez ◽  
Idoia Ochoa

Abstract Single-cell RNA-seq is a powerful tool in the study of the cellular composition of different tissues and organisms. A key step in the analysis pipeline is the annotation of cell-types based on the expression of specific marker genes. Since manual annotation is labor-intensive and does not scale to large datasets, several methods for automated cell-type annotation have been proposed based on supervised learning. However, these methods generally require feature extraction and batch alignment prior to classification, and their performance may become unreliable in the presence of cell-types with very similar transcriptomic profiles, such as differentiating cells. We propose JIND, a framework for automated cell-type identification based on neural networks that directly learns a low-dimensional representation (latent code) in which cell-types can be reliably determined. To account for batch effects, JIND performs a novel asymmetric alignment in which the transcriptomic profile of unseen cells is mapped onto the previously learned latent space, hence avoiding the need of retraining the model whenever a new dataset becomes available. JIND also learns cell-type-specific confidence thresholds to identify and reject cells that cannot be reliably classified. We show on datasets with and without batch effects that JIND classifies cells more accurately than previously proposed methods while rejecting only a small proportion of cells. Moreover, JIND batch alignment is parallelizable, being more than five or six times faster than Seurat integration. Availability: https://github.com/mohit1997/JIND.
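Rejection with cell-type-specific confidence thresholds can be sketched as below. This illustrates the general idea only and is not JIND's implementation: the abstract does not describe how the thresholds are learned, so the function name, the threshold values and the cell-type labels are all hypothetical.

```python
def classify_with_rejection(probs, thresholds, reject_label="Unassigned"):
    """Return the most probable cell type, or reject_label when the
    probability falls below that type's own confidence threshold."""
    best = max(probs, key=probs.get)
    if probs[best] >= thresholds[best]:
        return best
    return reject_label

# Hypothetical per-type thresholds (in practice these would be learned).
thresholds = {"T cell": 0.6, "B cell": 0.8}

confident = classify_with_rejection({"T cell": 0.7, "B cell": 0.3}, thresholds)
ambiguous = classify_with_rejection({"T cell": 0.45, "B cell": 0.55}, thresholds)
```

Using a separate threshold per type lets a classifier be stricter for types that are easily confused with others, at the cost of rejecting more cells of those types.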


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Wanyi Li ◽  
Feifei Zhang ◽  
Qiang Chen ◽  
Qian Zhang

Estimating human transition motion is difficult without specialized software. Three-dimensional (3D) human motion animation is widely used in video games, movies, and other media, and producing such animation requires human transition motion. A method that can generate the transition motion would reduce production time and improve working efficiency. We therefore propose a new method, latent space optimization based on projection analysis (LSOPA), to estimate human transition motion. LSOPA works with Gaussian process dynamical models (GPDM): it builds an objective function to optimize the data in the low-dimensional (LD) space, and the optimized LD data are then used to generate the human transition motion. LSOPA enables the GPDM to learn the high-dimensional (HD) data in order to estimate the required transition motion. The performance of LSOPA is evaluated in experiments.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Yoshihiro Nagano ◽  
Ryo Karakida ◽  
Masato Okada

Abstract Deep neural networks are good at extracting low-dimensional subspaces (latent spaces) that represent the essential features inside a high-dimensional dataset. Deep generative models represented by variational autoencoders (VAEs) can generate and infer high-quality datasets, such as images. In particular, VAEs can eliminate the noise contained in an image by repeating the mapping between latent and data space. To clarify the mechanism of such denoising, we numerically analyzed how the activity pattern of trained networks changes in the latent space during inference. We considered the time development of the activity pattern for specific data as one trajectory in the latent space and investigated the collective behavior of these inference trajectories for many data points. Our study revealed that when a cluster structure exists in the dataset, the trajectory rapidly approaches the center of the cluster. This behavior was qualitatively consistent with the concept retrieval reported in associative memory models. Additionally, the larger the noise contained in the data, the closer the trajectory was to a more global cluster. It was demonstrated that by increasing the number of latent variables, the tendency to approach a cluster center can be enhanced, and the generalization ability of the VAE can be improved.


2015 ◽  
Vol 27 (9) ◽  
pp. 1825-1856 ◽  
Author(s):  
Karthik C. Lakshmanan ◽  
Patrick T. Sadtler ◽  
Elizabeth C. Tyler-Kabara ◽  
Aaron P. Batista ◽  
Byron M. Yu

Noisy, high-dimensional time series observations can often be described by a set of low-dimensional latent variables. Commonly used methods to extract these latent variables typically assume instantaneous relationships between the latent and observed variables. In many physical systems, changes in the latent variables manifest as changes in the observed variables after time delays. Techniques that do not account for these delays can recover a larger number of latent variables than are present in the system, thereby making the latent representation more difficult to interpret. In this work, we introduce a novel probabilistic technique, time-delay gaussian-process factor analysis (TD-GPFA), that performs dimensionality reduction in the presence of a different time delay between each pair of latent and observed variables. We demonstrate how using a gaussian process to model the evolution of each latent variable allows us to tractably learn these delays over a continuous domain. Additionally, we show how TD-GPFA combines temporal smoothing and dimensionality reduction into a common probabilistic framework. We present an expectation/conditional maximization either (ECME) algorithm to learn the model parameters. Our simulations demonstrate that when time delays are present, TD-GPFA is able to correctly identify these delays and recover the latent space. We then applied TD-GPFA to the activity of tens of neurons recorded simultaneously in the macaque motor cortex during a reaching task. TD-GPFA is able to better describe the neural activity using a more parsimonious latent space than GPFA, a method that has been used to interpret motor cortex data but does not account for time delays. More broadly, TD-GPFA can help to unravel the mechanisms underlying high-dimensional time series data by taking into account physical delays in the system.
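The claim that ignoring delays can inflate the apparent number of latent variables can be checked on a toy example. The sketch below is not TD-GPFA: it assumes a single noise-free sinusoidal latent, hypothetical loadings and delays, and uses the matrix rank of the observations as a stand-in for the number of latent dimensions that a delay-agnostic linear method would need.

```python
import numpy as np

t = np.linspace(0.0, 4.0 * np.pi, 400)
loadings = np.array([1.0, 0.5, -0.8, 1.5])  # hypothetical loading of each observed variable
delays = np.array([0.0, 0.4, 0.9, 1.3])     # hypothetical delay of each observed variable

# One latent x(t) = sin(t), observed as y_i(t) = c_i * x(t - D_i).
no_delay = np.stack([c * np.sin(t) for c in loadings])
with_delay = np.stack([c * np.sin(t - d) for c, d in zip(loadings, delays)])

# Rank of the observation matrix = number of instantaneous linear latents needed.
rank_no_delay = np.linalg.matrix_rank(no_delay, tol=1e-8)
rank_with_delay = np.linalg.matrix_rank(with_delay, tol=1e-8)
```

Without delays the observations span one dimension. With delays they span two, because sin(t − D) = sin(t)cos(D) − cos(t)sin(D), so an instantaneous model needs both a sine and a cosine latent to explain what is really a single delayed signal.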

