Lumáwig: An Efficient Algorithm for Dimension Zero Bottleneck Distance Computation in Topological Data Analysis

Algorithms ◽  
2020 ◽  
Vol 13 (11) ◽  
pp. 291
Author(s):  
Paul Samuel Ignacio ◽  
Jay-Anne Bulauan ◽  
David Uminsky

Stability of persistence diagrams under slight perturbations is a key characteristic behind the validity and growing popularity of topological data analysis in exploring real-world data. Central to this stability is the use of the bottleneck distance, which entails matching points between diagrams. Uses of this metric in practical studies have, however, been few and far between because of the computational obstruction, especially in dimension zero, where the computational cost explodes with the growth of data size. We present a novel, efficient algorithm to compute the dimension zero bottleneck distance between two persistence diagrams of a specific kind, which runs significantly faster and provides significantly sharper approximations with respect to the output of the original algorithm than any other available algorithm. We bypass the overwhelming matching problem in previous implementations of the bottleneck distance, and prove that the zero-dimensional bottleneck distance can be recovered from a very small number of matching cases. Partly in keeping with nomenclature traditions in this area of TDA, we name this algorithm Lumáwig as a nod to a deity in the northern Philippines, where the algorithm was developed. Empirical tests show that Lumáwig generally enjoys linear complexity. We also present an application that leverages dimension zero persistence diagrams and the bottleneck distance to produce features for classification tasks.
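To make the matching problem described above concrete, the following is a minimal reference sketch of the general bottleneck distance between two small diagrams. It is not the Lumáwig algorithm; it is the brute-force formulation (threshold search plus a bipartite matching test) that such an algorithm aims to avoid. The function name `bottleneck_distance` and the restriction to finite (birth, death) pairs are assumptions made for illustration.

```python
def bottleneck_distance(diag_a, diag_b):
    """Bottleneck distance between two finite persistence diagrams.

    Each diagram is a list of (birth, death) pairs with finite death
    (an assumption of this sketch). Every point may be matched either
    to a point of the other diagram or to the diagonal.
    """
    n, m = len(diag_a), len(diag_b)
    size = n + m
    if size == 0:
        return 0.0

    def cost(i, j):
        # Rows 0..n-1 are points of A; rows n.. are diagonal slots.
        # Columns 0..m-1 are points of B; columns m.. are diagonal slots.
        if i < n and j < m:
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2))  # L-infinity distance
        if i < n:  # point of A sent to the diagonal
            b, d = diag_a[i]
            return (d - b) / 2
        if j < m:  # point of B sent to the diagonal
            b, d = diag_b[j]
            return (d - b) / 2
        return 0.0  # diagonal matched to diagonal

    costs = [[cost(i, j) for j in range(size)] for i in range(size)]

    def has_perfect_matching(threshold):
        # Augmenting-path bipartite matching using only edges <= threshold.
        match_of_col = [-1] * size

        def try_assign(i, seen):
            for j in range(size):
                if costs[i][j] <= threshold and not seen[j]:
                    seen[j] = True
                    if match_of_col[j] == -1 or try_assign(match_of_col[j], seen):
                        match_of_col[j] = i
                        return True
            return False

        return all(try_assign(i, [False] * size) for i in range(size))

    # The answer is one of the edge costs; binary search for the smallest
    # threshold that still admits a perfect matching.
    candidates = sorted({c for row in costs for c in row})
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if has_perfect_matching(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]
```

The cost of this approach grows quickly with the number of points, since every threshold tested requires a full matching computation. That growth is the obstruction the abstract refers to.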

Diagnostics ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 1322
Author(s):  
Gener José Avilés-Rodríguez ◽  
Juan Iván Nieto-Hipólito ◽  
María de los Ángeles Cosío-León ◽  
Gerardo Salvador Romo-Cárdenas ◽  
Juan de Dios Sánchez-López ◽  
...  

The objective of this work is to perform image quality assessment (IQA) of eye fundus images in the context of digital fundoscopy with topological data analysis (TDA) and machine learning methods. Eye health remains inaccessible for a large share of the global population. Digital tools that automate the eye exam could be used to address this issue. IQA is a fundamental step in digital fundoscopy for clinical applications; it is one of the first steps in the preprocessing stages of computer-aided diagnosis (CAD) systems using eye fundus images. Images from the EyePACS dataset were used, and quality labels from previous works in the literature were selected. Cubical complexes were used to represent the images; the grayscale version of each image was then used to compute persistent homology on the complex, which was represented with persistence diagrams. Then, 30 vectorized topological descriptors were calculated from each image and used as input to a classification algorithm. Six different algorithms were tested for this study (SVM, decision tree, k-NN, random forest, logistic regression (LoGit), MLP). LoGit was selected and used for the classification of all images, given its low computational cost. Performance results on the validation subset showed a global accuracy of 0.932, precision of 0.912 for label “quality” and 0.952 for label “no quality”, recall of 0.932 for label “quality” and 0.912 for label “no quality”, AUC of 0.980, F1 score of 0.932, and a Matthews correlation coefficient of 0.864. This work offers evidence for the use of topological methods for quality assessment of eye fundus images, where a relatively small vector of characteristics (30 in this case) can encode enough information for an algorithm to yield classification results useful in the clinical settings of a digital fundoscopy pipeline for CAD.
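For readers less familiar with the reported metrics, the sketch below shows how accuracy, precision, recall, F1 score, and the Matthews correlation coefficient follow from the four counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical; they are not the study's data, and the sketch assumes all denominators are nonzero.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Common binary classification metrics from confusion-matrix counts.

    tp, fp, fn, tn: true positives, false positives, false negatives,
    true negatives. Assumes every denominator below is nonzero.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = (tp * tn - fp * fn) / mcc_denominator
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = binary_metrics(tp=50, fp=10, fn=5, tn=35)
```

Unlike accuracy, the Matthews correlation coefficient uses all four counts symmetrically, which is why it is often reported alongside accuracy when classes are imbalanced.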


Author(s):  
Martin Cramer Pedersen ◽  
Vanessa Robins ◽  
Kell Mortensen ◽  
Jacob J. K. Kirkensgaard

Using methods from the field of topological data analysis, we investigate the self-assembly and emergence of three-dimensional quasi-crystalline structures in a single-component colloidal system. Combining molecular dynamics and persistent homology, we analyse the time evolution of persistence diagrams and particular local structural motifs. Our analysis reveals the formation and dissipation of specific particle constellations in these trajectories, and shows that the persistence diagrams are sensitive to nucleation and convergence to a final structure. Identification of local motifs allows quantification of the similarities between the final structures in a topological sense. This analysis reveals a continuous variation with density between crystalline clathrate, quasi-crystalline, and disordered phases quantified by ‘topological proximity’, a visualization of the Wasserstein distances between persistence diagrams. From a topological perspective, there is a subtle, but direct connection between quasi-crystalline, crystalline and disordered states. Our results demonstrate that topological data analysis provides detailed insights into molecular self-assembly.


2021 ◽  
Vol 0 (0) ◽  
pp. 0
Author(s):  
Christopher Oballe ◽  
Alan Cherne ◽  
Dave Boothe ◽  
Scott Kerick ◽  
Piotr J. Franaszczuk ◽  
...  

Topological data analysis encompasses a broad set of techniques that investigate the shape of data. One of the predominant tools in topological data analysis is persistent homology, which is used to create topological summaries of data called persistence diagrams. Persistent homology offers a novel method for signal analysis. Herein, we aid interpretation of the sublevel set persistence diagrams of signals by 1) showing the effect of frequency and instantaneous amplitude on the persistence diagrams for a family of deterministic signals, and 2) providing a general equation for the probability density of persistence diagrams of random signals via a pushforward measure. We also provide a topologically motivated, efficiently computable statistical descriptor analogous to the power spectral density for signals based on a generalized Bayesian framework for persistence diagrams. This Bayesian descriptor is shown to be competitive with power spectral densities and continuous wavelet transforms at distinguishing signals with different dynamics in a classification problem with autoregressive signals.
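As a rough illustration of what a sublevel set persistence diagram of a signal contains, the sketch below computes dimension zero persistence pairs for a sampled one-dimensional signal using a union-find structure and the elder rule. It is a generic textbook construction, not the authors' code; the function name `sublevel_persistence` and the treatment of the signal as a path graph on its samples are assumptions.

```python
import math


def sublevel_persistence(signal):
    """Dimension zero sublevel set persistence of a sampled 1D signal.

    Samples are added in order of increasing value. Each local minimum
    starts a component; when two components meet, the one with the
    higher birth value dies (elder rule). Returns sorted (birth, death)
    pairs; the global minimum never dies and gets death = infinity.
    """
    n = len(signal)
    if n == 0:
        return []

    parent = [-1] * n  # -1 marks samples not yet added
    birth = {}         # birth value recorded for each sample/root

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    pairs = []
    for i in sorted(range(n), key=lambda k: signal[k]):
        parent[i] = i
        birth[i] = signal[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # The younger component (larger birth value) dies here.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < signal[i]:  # skip zero-persistence pairs
                    pairs.append((birth[young], signal[i]))
                parent[young] = old

    # The surviving component is born at the global minimum.
    pairs.append((birth[find(0)], math.inf))
    return sorted(pairs)


# Local minima at 0, 1, and 0.5 merge at the local maxima 2 and 3.
diagram = sublevel_persistence([0, 2, 1, 3, 0.5, 4])
```

In this toy signal each finite pair records a local minimum and the local maximum at which its basin merges into a deeper one, so larger oscillations produce points farther from the diagonal.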


2021 ◽  
Vol 83 (3) ◽  
Author(s):  
Maria-Veronica Ciocanel ◽  
Riley Juenemann ◽  
Adriana T. Dawes ◽  
Scott A. McKinley

In developmental biology as well as in other biological systems, emerging structure and organization can be captured using time-series data of protein locations. In analyzing this time-dependent data, it is a common challenge not only to determine whether topological features emerge, but also to identify the timing of their formation. For instance, in most cells, actin filaments interact with myosin motor proteins and organize into polymer networks and higher-order structures. Ring channels are examples of such structures that maintain constant diameters over time and play key roles in processes such as cell division, development, and wound healing. Given the limitations in studying interactions of actin with myosin in vivo, we generate time-series data of protein polymer interactions in cells using complex agent-based models. Since the data has a filamentous structure, we propose sampling along the actin filaments and analyzing the topological structure of the resulting point cloud at each time. Building on existing tools from persistent homology, we develop a topological data analysis (TDA) method that assesses effective ring generation in this dynamic data. This method connects topological features through time in a path that corresponds to emergence of organization in the data. In this work, we also propose methods for assessing whether the topological features of interest are significant and thus whether they contribute to the formation of an emerging hole (ring channel) in the simulated protein interactions. In particular, we use the MEDYAN simulation platform to show that this technique can distinguish between the actin cytoskeleton organization resulting from distinct motor protein binding parameters.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Scott Broderick ◽  
Ruhil Dongol ◽  
Tianmu Zhang ◽  
Krishna Rajan

This paper introduces the use of topological data analysis (TDA) as an unsupervised machine learning tool to uncover classification criteria in complex inorganic crystal chemistries. Using apatite chemistry as a template, we use persistent homology to track the topological connectivity of input crystal chemistry descriptors in defining similarity between different stoichiometries of apatites. It is shown that TDA automatically identifies a hierarchical classification scheme within apatites based on the commonality of the number of discrete coordination polyhedra that constitute the structural building units common among the compounds. This information is presented in the form of a visualization scheme of a barcode of homology classifications, where the persistence of similarity between compounds is tracked. Unlike traditional perspectives of structure maps, this new “Materials Barcode” schema serves as an automated exploratory machine learning tool that can uncover structural associations from crystal chemistry databases and provide a more nuanced insight into what defines similarity among homologous compounds.

