Determining clinically relevant features in cytometry data using persistent homology

2021 ◽  
Author(s):  
Soham Mukherjee ◽  
Darren Wethington ◽  
Tamal K. Dey ◽  
Jayajit Das

Abstract Cytometry experiments yield high-dimensional point cloud data that is difficult to interpret manually. Boolean gating techniques coupled with comparisons of relative abundances of cellular subsets are the current standard for cytometry data analysis. However, this approach is unable to capture more subtle topological features hidden in data, especially if those features are further masked by data transforms, significant batch effects, or donor-to-donor variations in clinical data. We show that persistent homology, a mathematical structure that summarizes topological features, can effectively distinguish different sources of data, such as groups of healthy donors or patients. Analysis of publicly available cytometry data describing non-naïve CD8+ T cells in COVID-19 patients and healthy controls shows that systematic structural differences exist between single cell protein expressions in COVID-19 patients and healthy controls. Our method identifies proteins of interest with a decision-tree-based classifier and passes them to a kernel density estimator (KDE) for sampling points from the density distribution. We then compute persistence diagrams from these sampled points. The resulting persistence diagrams identify regions of varying density in cytometry datasets and identify protruded structures such as ‘elbows’. We compute Wasserstein distances between these persistence diagrams for random pairs of healthy controls and COVID-19 patients and find that systematic structural differences exist between COVID-19 patients and healthy controls in the expression data for T-bet, Eomes, and Ki-67. Further analysis shows that expression of T-bet and Eomes is significantly downregulated in COVID-19 patient non-naïve CD8+ T cells compared to healthy controls. This counter-intuitive finding may indicate that canonical effector CD8+ T cells are less prevalent in COVID-19 patients than in healthy controls.
This method is applicable to any cytometry dataset for discovering novel insights through topological data analysis which may be difficult to ascertain otherwise with a standard gating strategy or in the presence of large batch effects.
Author summary Identifying differences between cytometry data seen as a point cloud can be complicated by random variations in data collection and data sources. We apply persistent homology used in topological data analysis to describe the shape and structure of the data representing immune cells in healthy donors and COVID-19 patients. By looking at how the shape and structure differ between healthy donors and COVID-19 patients, we are able to definitively conclude how these groups differ despite random variations in the data. Furthermore, these results are novel in their ability to capture shape and structure of cytometry data, something not described by other analyses.
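As a rough illustration of what a persistence diagram records for a sampled point cloud, the sketch below computes only the dimension-0 part (connected components) of a Vietoris-Rips filtration using a union-find structure. The choice of filtration, the function name `h0_persistence`, and the toy points are assumptions made for illustration; the abstract does not state which complex or software the authors used, and the decision-tree and KDE steps are not reproduced here.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Dimension-0 persistence pairs (birth, death) of a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies when the growing
    distance threshold first merges it into another component, which happens
    exactly at the edge lengths of a minimum spanning tree (Kruskal's algorithm).
    """
    n = len(points)
    if n == 0:
        return []

    # All pairwise edges, sorted by Euclidean length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )

    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j       # two components merge: one dies
            pairs.append((0.0, length))

    pairs.append((0.0, math.inf))         # the last component never dies
    return pairs


# Hypothetical toy data: two nearby points and one distant point.
diagram = h0_persistence([(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)])
print(diagram)
```

Long-lived pairs correspond to well-separated clusters, while short-lived pairs reflect points that merge almost immediately. Higher-dimensional features such as loops need a full boundary-matrix reduction, which dedicated TDA libraries provide.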

2021 ◽  
Vol 9 ◽  
Author(s):  
Peter Tsung-Wen Yen ◽  
Siew Ann Cheong

In recent years, persistent homology (PH) and topological data analysis (TDA) have gained increasing attention in the fields of shape recognition, image analysis, data analysis, machine learning, computer vision, computational biology, brain functional networks, financial networks, haze detection, etc. In this article, we focus on stock markets and demonstrate how TDA can be useful in this regard. We first explain signatures that can be detected using TDA, for three toy models of topological changes. We then show how to go beyond network concepts like nodes (0-simplex) and links (1-simplex), and the standard minimal spanning tree or planar maximally filtered graph picture of the cross correlations in stock markets, to work with faces (2-simplex) or any k-dimensional simplex in TDA. By scanning through a full range of correlation thresholds in a procedure called filtration, we were able to examine robust topological features (i.e. those less susceptible to random noise) in higher dimensions. To demonstrate the advantages of TDA, we collected time-series data from the Straits Times Index and Taiwan Capitalization Weighted Stock Index (TAIEX), and then computed barcodes, persistence diagrams, persistent entropy, the bottleneck distance, Betti numbers, and Euler characteristic. We found that during periods of market crashes, the homology groups become less persistent as we vary the characteristic correlation. For both markets, we found consistent signatures associated with market crashes in the Betti numbers, Euler characteristics, and persistent entropy, in agreement with our theoretical expectations.
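The filtration over correlation thresholds described above can be sketched as follows. At each threshold, nodes whose correlation is at least that value are linked, every fully linked triple becomes a face, and two simple invariants are read off. The function name `euler_and_betti0`, the toy correlation matrix, and the decision to stop at 2-simplices are assumptions for illustration, not the authors' procedure.

```python
from itertools import combinations


def euler_and_betti0(corr, threshold):
    """Euler characteristic and Betti-0 of a thresholded correlation complex.

    Nodes are 0-simplices, pairs with correlation >= threshold are 1-simplices,
    and fully connected triples are 2-simplices. The complex is truncated at
    dimension 2 (an assumption of this sketch), so chi = V - E + F.
    """
    n = len(corr)
    edges = [(i, j) for i, j in combinations(range(n), 2)
             if corr[i][j] >= threshold]
    edge_set = set(edges)
    faces = [(a, b, c) for a, b, c in combinations(range(n), 3)
             if (a, b) in edge_set and (a, c) in edge_set and (b, c) in edge_set]

    # Betti-0 (number of connected components) via union-find.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            components -= 1

    chi = n - len(edges) + len(faces)
    return chi, components


# Hypothetical 4-asset correlation matrix.
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
for threshold in (0.95, 0.75, 0.5, 0.0):
    print(threshold, euler_and_betti0(corr, threshold))
```

Tracking how these numbers change as the threshold is lowered gives a curve over the filtration, which is the kind of summary the abstract reports for Betti numbers and the Euler characteristic.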


2017 ◽  
Vol 5 (6) ◽  
pp. 884-892 ◽  
Author(s):  
Nicholas A Scoville ◽  
Karthik Yegnesh

Abstract Persistent homology has recently emerged as a powerful technique in topological data analysis for analysing the emergence and disappearance of topological features throughout a filtered space, shown via persistence diagrams. In this article, we develop an application of ideas from the theory of persistent homology and persistence diagrams to the study of data flow malfunctions in networks with a certain hierarchical structure. In particular, we formulate an algorithmic construction of persistence diagrams that parameterize network data flow errors, thus enabling novel applications of statistical methods that are traditionally used to assess the stability of persistence diagrams corresponding to homological data to the study of data flow malfunctions. We conclude with an application to network packet delivery systems.


2019 ◽  
Vol 3 (3) ◽  
pp. 695-706 ◽  
Author(s):  
Cameron T. Ellis ◽  
Michael Lesnick ◽  
Gregory Henselman-Petrusek ◽  
Bryn Keller ◽  
Jonathan D. Cohen

Recent fMRI research shows that perceptual and cognitive representations are instantiated in high-dimensional multivoxel patterns in the brain. However, the methods for detecting these representations are limited. Topological data analysis (TDA) is a new approach, based on the mathematical field of topology, that can detect unique types of geometric features in patterns of data. Several recent studies have successfully applied TDA to study various forms of neural data; however, to our knowledge, TDA has not been successfully applied to data from event-related fMRI designs. Event-related fMRI is very common but limited in terms of the number of events that can be run within a practical time frame and the effect size that can be expected. Here, we investigate whether persistent homology—a popular TDA tool that identifies topological features in data and quantifies their robustness—can identify known signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI data, to assess the plausibility of recovering a simple topological representation under a variety of conditions. Our results suggest that persistent homology can be used under certain circumstances to recover topological structure embedded in realistic fMRI data simulations.


2000 ◽  
Vol 68 (12) ◽  
pp. 7144-7148 ◽  
Author(s):  
Steven M. Smith ◽  
Michèl R. Klein ◽  
Adam S. Malin ◽  
Jackson Sillah ◽  
Kris Huygen ◽  
...  

ABSTRACT Intracellular flow cytometry analysis of perforin production by CD8+ T cells showed levels were greatly reduced in tuberculosis (TB) patients compared to healthy controls. Reduced cytotoxic-T-lymphocyte activity was also obtained with CD8+ T cells from TB patients compared to healthy controls in The Gambia. A change in antigen recognition was noted between the two groups of donors: in addition to recognition of Ag85A and Ag85B, as seen in healthy donors, a prominent ESAT-6 response was found in TB patients.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 3722-3722
Author(s):  
Semjon Willier ◽  
Paula Rothaemel ◽  
Jonas Wilhelm ◽  
Dana Stenger ◽  
Theresa Kaeuferle ◽  
...  

Abstract Introduction: Acute leukemia is the most common malignancy in children and develops within the bone marrow. Consequently, bone marrow derived T cells of leukemia patients can be defined as tumor infiltrating lymphocytes (TILs). Dysfunctional TILs have been described in several other malignancies. However, in pediatric patients the interaction between leukemic blasts and TILs remains largely unknown. In order to understand the impact of leukemic blasts on bone marrow T cells, we profiled T cells in the bone marrow of pediatric leukemia patients by surface marker and transcriptome-wide analysis. Methods: First, artificial changes in marker expression due to cryopreservation and thawing were ruled out (n=5). Then, cryopreserved bone marrow samples from both pediatric patients with acute leukemia (n=77; BCP-ALL: 18, TCP-ALL: 23, AML: 36) and age-matched healthy bone marrow donors (HD, n=23) were identified in a local biobank. Multicolor flow cytometry was performed to quantify co-inhibitory markers on CD4 and CD8 T cells in primary (n=49) and relapse leukemia samples (n=28). As we could not detect surface CTLA4 expression on T cells, CTLA4 was stained intracellularly. Additionally, RNA-Seq on sorted bone marrow derived CD8 T cells (n=48; TCP-ALL: 12, AML: 20, HD: 16) was performed. Analysis of RNA-Seq data was based on Reads Per Kilobase Million (RPKM) normalization and False Discovery Rate (FDR, Benjamini-Hochberg) statistics. 172 differentially expressed genes were found when comparing bone marrow derived CD8 T cells from healthy donors (n=16) and leukemia patients (n=32) using the following criteria: RPKM>2 in both groups, fold change>2, and FDR<0.05. Results: The frequency of bone marrow T cells was reduced in patients with acute leukemia in comparison with healthy controls (5.9% vs. 24.4%, mean values, p<0.001). This reduction was more pronounced in BCP-ALL than in AML (0.9% vs. 8.4%, p<0.001). 
LAG3 and CTLA4 protein expression of T cells was increased in leukemia patients vs. healthy controls (LAG3: CD4: 2.6% vs. 0.7%, p<0.001; CD8: 8.6% vs. 2.2%, p<0.001; CTLA4: CD4: 7.3% vs. 3.8%, p=0.001; CD8: 1.2% vs. 0.3%, p<0.001). For CD8 T cells, those findings could be confirmed by RNA-Seq of sorted CD8 T cells (LAG3: 60.4 vs. 23.3 (RPKM), FDR=0.0044; CTLA4: 28.7 vs. 4.7 (RPKM), FDR=0.046). Equally, TIM3 on T cells showed higher expression in leukemia patients vs. healthy controls (CD4: 3.7% vs. 1.3%, p=0.002; CD8: 8.5% vs. 3.3%, p<0.001). However, the same analysis of RNA-Seq data on sorted CD8 T cells did not yield a significant difference (18.1 vs. 5.6 (RPKM), FDR=0.29). PD1 was the only surface marker found to be more highly expressed in relapse samples than in primary diagnosis samples, and more highly in primary diagnosis samples than in healthy controls (CD4: 42.3% vs. 28.9% vs. 19.8%, p<0.001; CD8: 45.2% vs. 33.3% vs. 26.5%, p=0.002). For CD8 T cells, RNA-Seq did not recapitulate this finding, as no significant difference in PD1 transcript abundance could be observed between leukemia patients and healthy donors (21.4 vs. 16.9 (RPKM), FDR=0.92). Finally, RNA-Seq on sorted CD8 T cells showed a pronounced overexpression of genes that are involved in the cytotoxic granule machinery in leukemia patients, indicating an increase of effector phenotype in those cells. In contrast, genes crucial for T cell function and memory formation were significantly downregulated in CD8 T cells from leukemia patients. Conclusion: By analyzing bone marrow samples from pediatric leukemia patients and healthy controls we confirm that bone marrow T cells of leukemia patients show signs of exhaustion compared to healthy individuals. Importantly, PD1 surface expression on T cells was identified as a marker that correlates with disease status (relapse > primary > healthy). 
A significant increase of exhaustion markers could be demonstrated both on protein and transcriptome level (LAG3, CTLA4) or on protein level only (TIM3, PD1). Moreover, we observed an increase of many elements of the cytotoxic granule machinery which is compatible with a loss of naïve/memory CD8 T cells. Additionally, genes essential for T cell memory formation were found to be downregulated in CD8 T cells from leukemia patients. These findings reflect an insufficient immune surveillance of pediatric leukemia by bone marrow T cells and may provide a rationale for future therapeutic interventions. Disclosures No relevant conflicts of interest to declare.
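The FDR values quoted in the abstract above are attributed to the Benjamini-Hochberg procedure. A minimal sketch of that adjustment is given here for readers unfamiliar with it. The function name `benjamini_hochberg` and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (FDR), returned in input order.

    For the p-value with ascending rank r among m tests, the raw adjusted
    value is p * m / r. A running minimum taken from the largest rank
    downwards enforces monotonicity of the adjusted values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.005]
fdr = benjamini_hochberg(pvals)
significant = [q < 0.05 for q in fdr]
print(fdr, significant)
```

A gene would then count as differentially expressed only if its adjusted value falls below the chosen cutoff (0.05 in the abstract) and it also meets the expression and fold-change criteria.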


2019 ◽  
Vol 43 (6) ◽  
pp. 1021-1029 ◽  
Author(s):  
S.V. Eremeev ◽  
D.E. Andrianov ◽  
V.S. Titov

The problem of automatically comparing spatial objects on maps of the same locality at different scales is considered in the article. It is proposed that this problem be solved using methods of topological data analysis. The initial data of the algorithm are spatial objects that can be obtained from maps with different scales and subjected to deformations and distortions. Persistent homology allows us to identify the general structure of such objects in the form of topological features. The main topological features in the study are the connected components and holes in objects. The paper gives a mathematical description of the persistent homology method for representing spatial objects. A definition of a barcode for spatial data, which contains a description of the object in the form of topological features, is given. An algorithm for comparing feature barcodes was developed. It allows us to find the general structure of objects. The algorithm is based on the analysis of data from the barcode. An index of object similarity in terms of topological features is introduced. Results of applying the algorithm to compare maps of natural and municipal objects with different scales, generalization and deformation are shown. The experiments confirm the high quality of the proposed algorithm. The percentage of similarity in the comparison of natural objects, while taking into account the scale and deformation, is in the range from 85 to 92, and for municipal objects, after stretching and distortion of their parts, was from 74 to 87. Advantages of the proposed approach over analogues for the comparison of objects with significant deformation at different scales and after distortion are demonstrated.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Lili Sun ◽  
Shengyi Zou ◽  
Sisi Ding ◽  
Xuan Du ◽  
Yu Shen ◽  
...  

Aims. Obesity is highly associated with type 2 diabetes mellitus (T2DM). The TIM3/galectin-9 pathway plays an important role in immune tolerance. Herein, we aimed to investigate the expression of TIM3 and galectin-9 in peripheral blood and to evaluate their clinical significance in patients with obesity and obesity-related T2DM. Methods. We performed flow cytometry on peripheral blood samples from healthy donors (HC), patients with simple obesity (OB), and patients with obesity comorbid T2DM (OD). The expression of TIM3 on CD3+, CD4+, and CD8+ T cells was determined. The level of galectin-9 in plasma was detected by ELISA. Results. We demonstrated the enhancement of TIM3 on CD3+, CD4+, and CD8+ T cells in the OB group when compared with healthy controls, while it was decreased significantly in the OD group. The TIM3+CD8+ T cells of the OB group were positively correlated with risk factors including BMI, body fat rate, and hipline. The concentration of galectin-9 of the OD group in plasma was significantly higher than that of healthy donors and the OB group. Moreover, the level of galectin-9 of the OD group was positively correlated with fasting insulin and C-peptide, which were two clinical features that represented pancreatic islet function in T2DM. Conclusions. Our results suggested that TIM3 and galectin-9 may be potential biomarkers related to the pathogenesis of obesity-related T2DM.


Author(s):  
Martin Cramer Pedersen ◽  
Vanessa Robins ◽  
Kell Mortensen ◽  
Jacob J. K. Kirkensgaard

Using methods from the field of topological data analysis, we investigate the self-assembly and emergence of three-dimensional quasi-crystalline structures in a single-component colloidal system. Combining molecular dynamics and persistent homology, we analyse the time evolution of persistence diagrams and particular local structural motifs. Our analysis reveals the formation and dissipation of specific particle constellations in these trajectories, and shows that the persistence diagrams are sensitive to nucleation and convergence to a final structure. Identification of local motifs allows quantification of the similarities between the final structures in a topological sense. This analysis reveals a continuous variation with density between crystalline clathrate, quasi-crystalline, and disordered phases quantified by ‘topological proximity’, a visualization of the Wasserstein distances between persistence diagrams. From a topological perspective, there is a subtle, but direct connection between quasi-crystalline, crystalline and disordered states. Our results demonstrate that topological data analysis provides detailed insights into molecular self-assembly.
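For readers who want a concrete picture of the Wasserstein distance between persistence diagrams mentioned above, here is a brute-force sketch for very small diagrams. It assumes the L-infinity ground metric and lets unmatched points be sent to the diagonal. The function name `wasserstein_distance` is hypothetical, and the exhaustive search over permutations is only practical for a handful of points; real analyses would use an optimal-transport solver from a TDA library.

```python
from itertools import permutations


def wasserstein_distance(diag_a, diag_b, p=1):
    """p-Wasserstein distance between two small persistence diagrams.

    Each diagram is a list of finite (birth, death) pairs. Diagram A is padded
    with len(B) diagonal slots and B with len(A) diagonal slots, so every
    point can be matched either to a point of the other diagram or to the
    diagonal. Brute force over all matchings: tiny inputs only.
    """
    n, m = len(diag_a), len(diag_b)
    size = n + m

    def cost(i, j):
        a_is_point, b_is_point = i < n, j < m
        if a_is_point and b_is_point:
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))   # L-infinity distance
        if a_is_point:
            b1, d1 = diag_a[i]
            return (d1 - b1) / 2                     # distance to the diagonal
        if b_is_point:
            b2, d2 = diag_b[j]
            return (d2 - b2) / 2
        return 0.0                                   # diagonal to diagonal

    best = min(
        sum(cost(i, j) ** p for i, j in enumerate(perm))
        for perm in permutations(range(size))
    )
    return best ** (1 / p)


# Hypothetical diagrams with one feature each.
print(wasserstein_distance([(0.0, 4.0)], [(0.0, 3.0)]))
```

Computing this distance for every pair of diagrams in a collection gives a distance matrix, which can then be embedded or clustered to compare structures.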


2018 ◽  
Author(s):  
Cameron T. Ellis ◽  
Michael Lesnick ◽  
Gregory Henselman-Petrusek ◽  
Bryn Keller ◽  
Jonathan D. Cohen

Abstract Recent fMRI research shows that perceptual and cognitive representations are instantiated in high-dimensional multi-voxel patterns in the brain. However, the methods for detecting these representations are limited. Topological Data Analysis (TDA) is a new approach, based on the mathematical field of topology, that can detect unique types of geometric features in patterns of data. Several recent studies have successfully applied TDA to study various forms of neural data; however, to our knowledge, TDA has not been successfully applied to data from event-related fMRI designs. Event-related fMRI is very common but limited in terms of the number of events that can be run within a practical time frame and the effect size that can be expected. Here, we investigate whether persistent homology, a popular TDA tool that identifies topological features in data and quantifies their robustness, can identify known signals given these constraints. We use fmrisim, a Python-based simulator of realistic fMRI data, to assess the plausibility of recovering a simple topological representation under a variety of conditions. Our results suggest that persistent homology can be used under certain circumstances to recover topological structure embedded in realistic fMRI data simulations.


2019 ◽  
Vol 3 (3) ◽  
pp. 656-673 ◽  
Author(s):  
Ann E. Sizemore ◽  
Jennifer E. Phillips-Cremins ◽  
Robert Ghrist ◽  
Danielle S. Bassett

Data analysis techniques from network science have fundamentally improved our understanding of neural systems and the complex behaviors that they support. Yet the restriction of network techniques to the study of pairwise interactions prevents us from taking into account intrinsic topological features such as cavities that may be crucial for system function. To detect and quantify these topological features, we must turn to algebro-topological methods that encode data as a simplicial complex built from sets of interacting nodes called simplices. We then use the relations between simplices to expose cavities within the complex, thereby summarizing its topological features. Here we provide an introduction to persistent homology, a fundamental method from applied topology that builds a global descriptor of system structure by chronicling the evolution of cavities as we move through a combinatorial object such as a weighted network. We detail the mathematics and perform demonstrative calculations on the mouse structural connectome, synapses in C. elegans, and genomic interaction data. Finally, we suggest avenues for future work and highlight new advances in mathematics ready for use in neural systems.
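To make the idea of using relations between simplices to expose cavities more concrete, the following sketch computes Betti numbers of a small simplicial complex from the ranks of its boundary matrices. Coefficients are taken over GF(2), which is an assumption of this sketch, and the names `rank_gf2` and `betti_numbers` are hypothetical. The input must list every simplex together with all of its faces, each exactly once.

```python
from itertools import combinations


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot                      # lowest set bit of the pivot
        rows = [r ^ pivot if r & low else r for r in rows]
    return rank


def betti_numbers(simplices):
    """Betti numbers [b0, b1, ...] of a face-closed, non-empty simplicial complex.

    Uses b_k = (number of k-simplices) - rank(d_k) - rank(d_{k+1}),
    where d_k is the boundary map from k-simplices to (k-1)-simplices.
    """
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(tuple(sorted(s)))
    max_dim = max(by_dim)

    ranks = {0: 0, max_dim + 1: 0}                # d_0 and d_{max+1} are zero
    for k in range(1, max_dim + 1):
        index = {s: i for i, s in enumerate(by_dim[k - 1])}
        rows = []
        for s in by_dim[k]:
            mask = 0
            for face in combinations(s, k):       # the (k-1)-dimensional faces
                mask |= 1 << index[face]
            rows.append(mask)
        ranks[k] = rank_gf2(rows)

    return [len(by_dim[k]) - ranks[k] - ranks[k + 1]
            for k in range(max_dim + 1)]


# A hollow triangle encloses one cavity; filling it in removes the cavity.
hollow = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
filled = hollow + [(0, 1, 2)]
print(betti_numbers(hollow), betti_numbers(filled))
```

Persistent homology repeats this kind of computation across a nested sequence of complexes, for example as edges of a weighted network are added in order of weight, and records when each cavity appears and disappears.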

