A topological characterization of DNA sequences based on chaos geometry and persistent homology

2021 ◽  
Author(s):  
Dong Quan Ngoc Nguyen ◽  
Phuong Dong Tan Le ◽  
Lin Xing ◽  
Lizhen Lin

Abstract: Methods for analyzing similarities among DNA sequences play a fundamental role in computational biology and have a variety of applications in public health and genetics. In this paper, a novel geometric and topological method for analyzing similarities among DNA sequences is developed, based on persistent homology from algebraic topology, in combination with chaos geometry in 4-dimensional space as a graphical representation of DNA sequences. Our topological framework for DNA similarity analysis is general and alignment-free, and it can handle DNA sequences of various lengths, while providing first-of-its-kind visualization features for direct visual inspection of DNA sequences, based on topological features of the point clouds that represent them. As an application, we test our methods on three datasets comprising genome sequences of different types of Hantavirus, Influenza A virus, and Human Papillomavirus.

2021 ◽  
Author(s):  
Dong Quan Ngoc Nguyen ◽  
Phuong Dong Tan Le ◽  
Ziqing Hu ◽  
Lizhen Lin

Abstract: In this paper, we propose another topological approach for DNA similarity analysis. We transform each DNA sequence into a collection of vectors in 5-dimensional space in which all nucleotides of the same type (A, C, G, or T) lie on the same line. Based on this special geometric property, we combine this representation with tools from persistent homology to obtain only zeroth persistence diagrams as a topological representation of DNA sequences. The similarity between two DNA sequences is measured by how close their zeroth persistence diagrams are under the Wasserstein distance of order zero, which provides a new method for analyzing similarities between DNA sequences. We test our methods on datasets of Human rhinovirus (HRV) and Influenza A virus.
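To make "zeroth persistence diagram" concrete, here is a minimal sketch of how the finite death times of connected components could be computed for a small point cloud. It is not the authors' implementation: the function name `zeroth_persistence` and the sample points are hypothetical, the 5-dimensional embedding from the abstract is not reproduced, and the sketch assumes a Vietoris-Rips filtration in which an edge appears at a scale equal to its Euclidean length. Under that assumption every component is born at 0 and dies when a minimum-spanning-tree edge merges it into another component, so Kruskal's algorithm with a union-find structure is enough.

```python
import itertools
import math


def zeroth_persistence(points):
    """Return the finite H0 death times of a point cloud, in increasing order.

    Assumption: Vietoris-Rips filtration where an edge enters at a scale
    equal to its Euclidean length. All components are born at scale 0.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (Kruskal's algorithm).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Two components merge here: one H0 class dies at this scale.
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Hypothetical 2D point cloud, used only for illustration.
cloud = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
print(zeroth_persistence(cloud))
```

A cloud of n points yields n - 1 finite death times; the one remaining component never dies and is usually recorded as an infinite bar. Two sequences could then be compared through a distance between their lists of death times, which is the role the abstract assigns to the Wasserstein distance.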


2018 ◽  
Author(s):  
Jean-Baptiste Bardin ◽  
Gard Spreemann ◽  
Kathryn Hess

Abstract: One of the paramount challenges in neuroscience is to understand the dynamics of individual neurons and how they give rise to network dynamics when interconnected. Historically, researchers have resorted to graph theory, statistics, and statistical mechanics to describe the spatiotemporal structure of such network dynamics. Our novel approach employs tools from algebraic topology to characterize the global properties of network structure and dynamics. We propose a method based on persistent homology to automatically classify network dynamics using topological features of spaces built from various spike-train distances. We investigate the efficacy of our method by simulating activity in three small artificial neural networks with different sets of parameters, giving rise to dynamics that can be classified into four regimes. We then compute three measures of spike train similarity and use persistent homology to extract topological features that are fundamentally different from those used in traditional methods. Our results show that a machine learning classifier trained on these features can accurately predict the regime of the network it was trained on and also generalize to other networks that were not presented during training. Moreover, we demonstrate that using features extracted from multiple spike-train distances systematically improves the performance of our method.


2019 ◽  
Vol 3 (3) ◽  
pp. 725-743 ◽  
Author(s):  
Jean-Baptiste Bardin ◽  
Gard Spreemann ◽  
Kathryn Hess

One of the paramount challenges in neuroscience is to understand the dynamics of individual neurons and how they give rise to network dynamics when interconnected. Historically, researchers have resorted to graph theory, statistics, and statistical mechanics to describe the spatiotemporal structure of such network dynamics. Our novel approach employs tools from algebraic topology to characterize the global properties of network structure and dynamics. We propose a method based on persistent homology to automatically classify network dynamics using topological features of spaces built from various spike train distances. We investigate the efficacy of our method by simulating activity in three small artificial neural networks with different sets of parameters, giving rise to dynamics that can be classified into four regimes. We then compute three measures of spike train similarity and use persistent homology to extract topological features that are fundamentally different from those used in traditional methods. Our results show that a machine learning classifier trained on these features can accurately predict the regime of the network it was trained on and also generalize to other networks that were not presented during training. Moreover, we demonstrate that using features extracted from multiple spike train distances systematically improves the performance of our method.


Author(s):  
Natarajan Ramanathan ◽  
Jayalakshmi Ramamurthy ◽  
Ganapathy Natarajan

Background: The biological macromolecules DNA, RNA, and protein have their building blocks organized in a particular sequence, and this sequential arrangement encodes the evolutionary history of the organism (species). Hence, biological sequences have been used to study evolutionary relationships among species. This is usually carried out by multiple sequence alignment (MSA). Due to certain limitations of MSA, alignment-free sequence comparison methods were developed. The present review covers alignment-free sequence comparison methods based on numerical characterization of DNA sequences. Discussion: The graphical representation of DNA sequences by chaos game representation and other 2-dimensional and 3-dimensional methods is discussed. The evolution of numerical characterization from the various graphical representations, and the application of the DNA invariants thus computed in phylogenetic analysis, are presented. The extension of molecular descriptor computation in chemometrics to the calculation of a new set of DNA invariants, and their use in alignment-free sequence comparison in an N-dimensional space and in the construction of phylogenetic trees, is also reviewed. Conclusion: The phylogenetic trees constructed by the alignment-free sequence comparison methods using DNA invariants were found to be better than those constructed using alignment-based tools such as PHYLIP and ClustalW. One of the graphical representation methods is now extended to study viral sequences of infectious diseases, combining numerical characterization and graphical representation to identify conserved regions for the design of peptide-based vaccines.
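As an illustration of the chaos game representation mentioned in this review, the following is a minimal sketch of the classical 2-dimensional construction, not code from the review itself. The function name `chaos_game_points` is hypothetical, and the assignment of nucleotides to the corners of the unit square is one common convention that is assumed here. Starting from the center of the square, each nucleotide moves the current point halfway toward its corner, so a sequence of length n becomes a cloud of n points.

```python
# Assumed corner convention for the unit square; other assignments are in use.
CORNERS = {
    "A": (0.0, 0.0),
    "C": (0.0, 1.0),
    "G": (1.0, 1.0),
    "T": (1.0, 0.0),
}


def chaos_game_points(sequence):
    """Map a DNA string to 2D chaos game representation points."""
    x, y = 0.5, 0.5  # start at the center of the square
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        corner_x, corner_y = CORNERS[base]
        # Move halfway from the current point toward the nucleotide's corner.
        x, y = (x + corner_x) / 2, (y + corner_y) / 2
        points.append((x, y))
    return points


print(chaos_game_points("AG"))  # → [(0.25, 0.25), (0.625, 0.625)]
```

Numerical descriptors of the kind the review discusses can then be computed from the resulting points, for example by counting how many fall in each cell of a grid laid over the square.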


Pneumologie ◽  
2014 ◽  
Vol 68 (02) ◽  
Author(s):  
C Tarnow ◽  
G Engels ◽  
A Arendt ◽  
F Schwalm ◽  
H Sediri ◽  
...  

Planta Medica ◽  
2016 ◽  
Vol 81 (S 01) ◽  
pp. S1-S381
Author(s):  
U Grienke ◽  
M Richter ◽  
E Walther ◽  
A Hoffmann ◽  
J Kirchmair ◽  
...  
