A persistent homological analysis of network data flow malfunctions

2017 ◽  
Vol 5 (6) ◽  
pp. 884-892 ◽  
Author(s):  
Nicholas A Scoville ◽  
Karthik Yegnesh

Abstract Persistent homology has recently emerged as a powerful technique in topological data analysis for analysing the emergence and disappearance of topological features throughout a filtered space, shown via persistence diagrams. In this article, we develop an application of ideas from the theory of persistent homology and persistence diagrams to the study of data flow malfunctions in networks with a certain hierarchical structure. In particular, we formulate an algorithmic construction of persistence diagrams that parameterize network data flow errors, thus enabling novel applications of statistical methods that are traditionally used to assess the stability of persistence diagrams corresponding to homological data to the study of data flow malfunctions. We conclude with an application to network packet delivery systems.

2021 ◽  
Author(s):  
Soham Mukherjee ◽  
Darren Wethington ◽  
Tamal K. Dey ◽  
Jayajit Das

Abstract Cytometry experiments yield high-dimensional point cloud data that is difficult to interpret manually. Boolean gating techniques coupled with comparisons of relative abundances of cellular subsets are the current standard for cytometry data analysis. However, this approach is unable to capture more subtle topological features hidden in data, especially if those features are further masked by data transforms, significant batch effects, or donor-to-donor variations in clinical data. We show that persistent homology, a mathematical structure that summarizes topological features, can effectively distinguish different sources of data, such as groups of healthy donors and patients. Analysis of publicly available cytometry data describing non-naïve CD8+ T cells in COVID-19 patients and healthy controls shows that systematic structural differences exist between single cell protein expressions in COVID-19 patients and healthy controls. Our method identifies proteins of interest by a decision-tree based classifier and passes them to a kernel-density estimator (KDE) for sampling points from the density distribution. We then compute persistence diagrams from these sampled points. The resulting persistence diagrams identify regions in cytometry datasets of varying density and identify protruded structures such as ‘elbows’. We compute Wasserstein distances between these persistence diagrams for random pairs of healthy controls and COVID-19 patients and find that systematic structural differences exist between COVID-19 patients and healthy controls in the expression data for T-bet, Eomes, and Ki-67. Further analysis shows that expression of T-bet and Eomes is significantly downregulated in COVID-19 patient non-naïve CD8+ T cells compared to healthy controls. This counter-intuitive finding may indicate that canonical effector CD8+ T cells are less prevalent in COVID-19 patients than in healthy controls. This method is applicable to any cytometry dataset for discovering novel insights through topological data analysis that may be difficult to ascertain otherwise with a standard gating strategy or in the presence of large batch effects.
Author summary: Identifying differences between cytometry data seen as a point cloud can be complicated by random variations in data collection and data sources. We apply persistent homology used in topological data analysis to describe the shape and structure of the data representing immune cells in healthy donors and COVID-19 patients. By looking at how the shape and structure differ between healthy donors and COVID-19 patients, we are able to definitively conclude how these groups differ despite random variations in the data. Furthermore, these results are novel in their ability to capture shape and structure of cytometry data, something not described by other analyses.


2021 ◽  
Vol 9 ◽  
Author(s):  
Peter Tsung-Wen Yen ◽  
Siew Ann Cheong

In recent years, persistent homology (PH) and topological data analysis (TDA) have gained increasing attention in the fields of shape recognition, image analysis, data analysis, machine learning, computer vision, computational biology, brain functional networks, financial networks, haze detection, etc. In this article, we focus on stock markets and demonstrate how TDA can be useful in this regard. We first explain signatures that can be detected using TDA, for three toy models of topological changes. We then show how to go beyond network concepts like nodes (0-simplex) and links (1-simplex), and the standard minimal spanning tree or planar maximally filtered graph picture of the cross correlations in stock markets, to work with faces (2-simplex) or any k-dimensional simplex in TDA. By scanning through a full range of correlation thresholds in a procedure called filtration, we were able to examine robust topological features (i.e. less susceptible to random noise) in higher dimensions. To demonstrate the advantages of TDA, we collected time-series data from the Straits Times Index and Taiwan Capitalization Weighted Stock Index (TAIEX), and then computed barcodes, persistence diagrams, persistent entropy, the bottleneck distance, Betti numbers, and Euler characteristic. We found that during the periods of market crashes, the homology groups become less persistent as we vary the characteristic correlation. For both markets, we found consistent signatures associated with market crashes in the Betti numbers, Euler characteristics, and persistent entropy, in agreement with our theoretical expectations.
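
Two of the summaries named in this abstract have short closed forms, sketched here for illustration. Persistent entropy is taken as the Shannon entropy of the normalized bar lengths of a barcode, and the Euler characteristic as the alternating sum of Betti numbers. The function names and the assumption that all bars are finite (birth, death) pairs are ours, not the article's.

```python
import math


def persistent_entropy(barcode):
    """Shannon entropy of normalized bar lengths.

    `barcode` is assumed to be a list of finite (birth, death) pairs;
    bars of zero length are ignored.
    """
    lengths = [death - birth for birth, death in barcode if death > birth]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths)


def euler_characteristic(betti_numbers):
    """Alternating sum beta_0 - beta_1 + beta_2 - ... of Betti numbers."""
    return sum((-1) ** k * b for k, b in enumerate(betti_numbers))
```

Under this definition, a barcode whose bars all have equal length attains the maximum entropy (the logarithm of the number of bars), while a barcode dominated by one long bar has entropy close to zero.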


2021 ◽  
Author(s):  
Seungho Choe

Persistent homology is a powerful tool in topological data analysis (TDA) to compute, study and efficiently encode multi-scale topological features, and it is being increasingly used in digital image classification. The topological features represent the number of connected components, cycles, and voids that describe the shape of data. Persistent homology extracts the birth and death of these topological features through a filtration process. The lifespan of these features can be represented using persistence diagrams (topological signatures). Cubical homology is a more efficient method for extracting topological features from a 2D image and uses a collection of cubes to compute the homology, which fits the digital image structure of grids. In this research, we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures. Additionally, we propose a score, which measures the significance of each of the sub-simplices in terms of persistence. Also, gray level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) are used as a supplementary method for extracting features. Machine learning techniques are then employed to classify images using the topological signatures. Among the eight tested algorithms with six published image datasets with varying pixel sizes, classes, and distributions, our experiments demonstrate that cubical homology-based machine learning with deep residual network (ResNet 1D) and Light Gradient Boosting Machine (lightGBM) shows promise with the extracted topological features.
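
To make the birth and death of features in a grid filtration concrete, the following is a minimal sketch of 0-dimensional sublevel-set persistence for a small grayscale image. It is not the algorithm proposed in this work. It assumes pixels are treated as vertices joined to their four axis-aligned neighbours, and it tracks only connected components with a union-find structure. The function name `sublevel_h0` is hypothetical.

```python
import math


def sublevel_h0(image):
    """0-dimensional persistence pairs of a 2D grid under the sublevel filtration.

    `image` is a non-empty list of equal-length rows of numbers.
    Returns (birth, death) pairs; the oldest component never dies.
    """
    rows, cols = len(image), len(image[0])
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels already added
    birth = {}   # birth value of each pixel (the root holds the component minimum)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            neighbour = (r + dr, c + dc)
            if neighbour not in parent:
                continue
            older, younger = find((r, c)), find(neighbour)
            if older == younger:
                continue
            if birth[older] > birth[younger]:
                older, younger = younger, older
            # Elder rule: the younger component dies when the two merge.
            if value > birth[younger]:
                pairs.append((birth[younger], value))
            parent[younger] = older
    pairs.append((order[0][0], math.inf))
    return pairs
```

For the one-row image `[[0, 5, 1]]`, the two local minima 0 and 1 start separate components that merge when the pixel of value 5 enters, so the component born at 1 dies at 5 and the one born at 0 persists.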


2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Ali Nabi Duman ◽  
Harun Pirim

Persistent homology, a topological data analysis (TDA) method, is applied to microarray data sets. Although a few papers refer to TDA methods in microarray analysis, persistent homology has not, to the best of our knowledge, previously been used to compare several weighted gene coexpression networks (WGCN). We calculate the persistent homology of weighted networks constructed from 38 Arabidopsis microarray data sets to test the relevance and the success of this approach in distinguishing the stress factors. We quantify multiscale topological features of each network using persistent homology and apply a hierarchical clustering algorithm to the distance matrix whose entries are the pairwise bottleneck distances between the networks. The immunoresponses to different stress factors are distinguishable by our method. The networks of similar immunoresponses are found to be close with respect to bottleneck distance, indicating similar topological features of the WGCNs. This computationally efficient technique for analyzing networks provides a quick test for advanced studies.
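
The distance matrix described in this abstract can be illustrated with a brute-force sketch of the bottleneck distance. It assumes each persistence diagram is a short list of finite (birth, death) pairs and tries every matching, so it is only practical for a handful of points. It is not the implementation used in the paper, and the function names are hypothetical.

```python
from itertools import permutations


def bottleneck_distance(diag_a, diag_b):
    """Brute-force bottleneck distance between two small persistence diagrams.

    Each diagram is padded with diagonal slots so that any point may be
    matched either to a point of the other diagram (L-infinity cost) or to
    the diagonal (half its persistence).
    """
    n, m = len(diag_a), len(diag_b)
    size = n + m  # n points + m diagonal slots on side A, and vice versa on B

    def cost(i, j):
        a_is_point, b_is_point = i < n, j < m
        if a_is_point and b_is_point:
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))
        if a_is_point:
            b1, d1 = diag_a[i]
            return (d1 - b1) / 2
        if b_is_point:
            b2, d2 = diag_b[j]
            return (d2 - b2) / 2
        return 0.0  # diagonal matched to diagonal

    return min(
        max((cost(i, j) for i, j in enumerate(perm)), default=0.0)
        for perm in permutations(range(size))
    )


def distance_matrix(diagrams):
    """Symmetric matrix of pairwise bottleneck distances."""
    k = len(diagrams)
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            matrix[i][j] = matrix[j][i] = bottleneck_distance(diagrams[i], diagrams[j])
    return matrix
```

A matrix built this way could then be passed to any hierarchical clustering routine that accepts precomputed distances. For diagrams of realistic size, a dedicated TDA library would be the sensible choice, since the brute-force search grows factorially.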


Author(s):  
Firas A. Khasawneh ◽  
Elizabeth Munch

This paper describes a new approach for ascertaining the stability of autonomous stochastic delay equations in their parameter space by examining their time series using topological data analysis. We use a nonlinear model that describes the tool oscillations due to self-excited vibrations in turning. The time series is generated using the Euler-Maruyama method and is then turned into a point cloud in a high-dimensional Euclidean space using delay embedding. The point cloud can then be analyzed using persistent homology. Specifically, in the deterministic case, the system has a stable fixed point, while the loss of stability is associated with a Hopf bifurcation whereby a limit cycle branches from the fixed point. Since periodicity in the signal translates into circularity in the point cloud, the persistence diagram associated with the periodic time series will have a high-persistence point. This can be used to determine a threshold criterion that can automatically classify the system behavior based on its time series. The results of this study show that the described approach can be used for analyzing datasets of delay dynamical systems generated from both numerical simulation and experimental data.
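
The step from time series to point cloud can be sketched as follows. This is a minimal delay embedding under our own assumptions: the function name, the `dimension` and `delay` parameters, and the sine-wave input are illustrative and are not taken from the paper.

```python
import math


def delay_embedding(series, dimension, delay):
    """Map a scalar time series to points (x[i], x[i+delay], ..., x[i+(dimension-1)*delay])."""
    n_points = len(series) - (dimension - 1) * delay
    if n_points <= 0:
        raise ValueError("series too short for this dimension and delay")
    return [
        tuple(series[i + k * delay] for k in range(dimension))
        for i in range(n_points)
    ]


# A sine wave with a period of 40 samples, embedded in 2D with a quarter-period delay.
signal = [math.sin(2 * math.pi * t / 40) for t in range(200)]
cloud = delay_embedding(signal, dimension=2, delay=10)
```

With a quarter-period delay, each point is (sin θ, cos θ), so the cloud lies on the unit circle. This is the circularity that a persistence computation on the cloud would register as a single long-lived one-dimensional feature.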


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257215
Author(s):  
Renata Turkeš ◽  
Jannes Nys ◽  
Tim Verdonck ◽  
Steven Latré

Topological data analysis is a recent and fast growing field that approaches the analysis of datasets using techniques from (algebraic) topology. Its main tool, persistent homology (PH), has seen a notable increase in applications in the last decade. Often cited as the most favourable property of PH and the main reason for practical success are the stability theorems that give theoretical results about noise robustness, since real data is typically contaminated with noise or measurement errors. However, little attention has been paid to what these stability theorems mean in practice. To gain some insight into this question, we evaluate the noise robustness of PH on the MNIST dataset of greyscale images. More precisely, we investigate to what extent PH changes under typical forms of image noise, and quantify the loss of performance in classifying the MNIST handwritten digits when noise is added to the data. The results show that the sensitivity to noise of PH is influenced by the choice of filtrations and persistence signatures (respectively the input and output of PH), and in particular, that PH features are often not robust to noise in a classification task.


2022 ◽  
Author(s):  
Matthew Bailey ◽  
Mark Wilson

One of the critical tools of persistent homology is the persistence diagram. We demonstrate the applicability of a persistence diagram showing the existence of topological features (here rings in a 2D network) generated over time instead of space as a tool to analyse trajectories of biological networks. We show how the time persistence diagram is useful in order to identify critical phenomena such as rupturing and to visualise important features in 2D biological networks; they are particularly useful to highlight patterns of damage and to identify if particular patterns are significant or ephemeral. Persistence diagrams are also used to analyse repair phenomena, and we explore how the measured properties of a dynamical phenomenon change according to the sampling frequency. This shows that the persistence diagrams are robust and still provide useful information even for data of low temporal resolution. Finally, we combine persistence diagrams across many trajectories to show how the technique highlights the existence of sharp transitions at critical points in the rupturing process.
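
One way to picture a persistence diagram over time is to record when each ring first appears in a trajectory and when it disappears. The sketch below assumes each frame is given as a set of ring identifiers that stay consistent between frames. That input format and the function name are hypothetical, and the authors' actual construction may differ.

```python
def time_persistence(frames):
    """Return (ring, birth_frame, death_frame) triples from per-frame ring sets.

    A ring is born at the first frame of a contiguous run in which it is
    present and dies at the first frame where it is absent again. Rings still
    present in the last frame get a death of None.
    """
    open_births = {}
    intervals = []
    for t, rings in enumerate(frames):
        current = set(rings)
        # Close intervals for rings that have just vanished.
        for ring in list(open_births):
            if ring not in current:
                intervals.append((ring, open_births.pop(ring), t))
        # Open intervals for rings that have just appeared.
        for ring in current:
            if ring not in open_births:
                open_births[ring] = t
    for ring, born in open_births.items():
        intervals.append((ring, born, None))
    return intervals
```

Plotting death against birth for these triples gives a diagram in which short-lived rings sit near the diagonal and long-lived ones sit far from it. Coarser sampling corresponds to keeping only every n-th frame before calling the function.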


2020 ◽  
Vol 12 (10) ◽  
pp. 3985
Author(s):  
Nur Fariha Syaqina Zulkepli ◽  
Mohd Salmi Md Noorani ◽  
Fatimah Abdul Razak ◽  
Munira Ismail ◽  
Mohd Almie Alias

Severe haze episodes have periodically occurred in Southeast Asia, with particularly adverse effects on Malaysia. A technique called cluster analysis was used to analyze these occurrences. Traditional cluster analysis, in particular, hierarchical agglomerative cluster analysis (HACA), was applied directly to data sets. The data sets may contain hidden patterns that can be explored. In this paper, this underlying information was captured via persistent homology, a topological data analysis (TDA) tool, which extracts topological features including components, holes, and cavities in the data sets. In particular, an improved version of HACA was proposed by combining HACA and persistent homology. Additionally, a comparative study between traditional HACA and improved HACA was done using particulate matter data, which was the major pollutant found during haze episodes by the Klang, Petaling Jaya, and Shah Alam air quality monitoring stations. The effectiveness of these two clustering approaches was evaluated based on their ability to cluster the months according to the haze condition. The results showed that clustering based on topological features via the improved HACA approach was able to correctly group the months with severe haze compared to clustering them without such features, and these results were consistent for all three locations.


2019 ◽  
Vol 3 (3) ◽  
pp. 695-706 ◽  
Author(s):  
Cameron T. Ellis ◽  
Michael Lesnick ◽  
Gregory Henselman-Petrusek ◽  
Bryn Keller ◽  
Jonathan D. Cohen

Recent fMRI research shows that perceptual and cognitive representations are instantiated in high-dimensional multivoxel patterns in the brain. However, the methods for detecting these representations are limited. Topological data analysis (TDA) is a new approach, based on the mathematical field of topology, that can detect unique types of geometric features in patterns of data. Several recent studies have successfully applied TDA to study various forms of neural data; however, to our knowledge, TDA has not been successfully applied to data from event-related fMRI designs. Event-related fMRI is very common but limited in terms of the number of events that can be run within a practical time frame and the effect size that can be expected. Here, we investigate whether persistent homology—a popular TDA tool that identifies topological features in data and quantifies their robustness—can identify known signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI data, to assess the plausibility of recovering a simple topological representation under a variety of conditions. Our results suggest that persistent homology can be used under certain circumstances to recover topological structure embedded in realistic fMRI data simulations.


2019 ◽  
Vol 43 (6) ◽  
pp. 1021-1029 ◽  
Author(s):  
S.V. Eremeev ◽  
D.E. Andrianov ◽  
V.S. Titov

The problem of automatically comparing spatial objects on maps of the same locality at different scales is considered in the article. It is proposed that this problem be solved using methods of topological data analysis. The input data of the algorithm are spatial objects that can be obtained from maps with different scales and subjected to deformations and distortions. Persistent homology allows us to identify the general structure of such objects in the form of topological features. The main topological features in the study are the connected components and holes in objects. The paper gives a mathematical description of the persistent homology method for representing spatial objects. A definition of a barcode for spatial data, which contains a description of the object in the form of topological features, is given. An algorithm for comparing feature barcodes was developed. It allows us to find the general structure of objects. The algorithm is based on the analysis of data from the barcode. An index of object similarity in terms of topological features is introduced. Results of applying the algorithm to compare maps of natural and municipal objects with different scales, generalization and deformation are shown. The experiments confirm the high quality of the proposed algorithm. The percentage of similarity in the comparison of natural objects, taking into account the scale and deformation, ranges from 85 to 92, and for municipal objects, after stretching and distortion of their parts, from 74 to 87. Advantages of the proposed approach over analogues for the comparison of objects with significant deformation at different scales and after distortion are demonstrated.

