Measuring hidden phenotype: Quantifying the shape of barley seeds using the Euler Characteristic Transform

Author(s):  
Erik J Amézquita ◽  
Michelle Y Quigley ◽  
Tim Ophelders ◽  
Jacob B Landis ◽  
Daniel Koenig ◽  
...  

Abstract Shape plays a fundamental role in biology. Traditional phenotypic analysis methods measure some features but fail to measure the information embedded in shape comprehensively. To extract, compare, and analyze this information in a robust and concise way, we turn to Topological Data Analysis (TDA), specifically the Euler Characteristic Transform. TDA measures shape comprehensively using mathematical representations based on algebraic topology features. To study its use, we compute both traditional and topological shape descriptors to quantify the morphology of 3121 barley seeds scanned with X-ray Computed Tomography (CT) technology at 127 micron resolution. The Euler Characteristic Transform measures shape by analyzing topological features of an object at thresholds across a number of directional axes. A Kruskal-Wallis analysis of the information encoded by the topological signature reveals that the Euler Characteristic Transform successfully picks up the shape of the crease and bottom of the seeds. Moreover, while traditional shape descriptors can cluster the seeds based on their accession, topological shape descriptors can cluster them further based on their panicle. We then successfully train a support vector machine (SVM) to classify 28 different accessions of barley based exclusively on the shape of their grains. We observe that combining both traditional and topological descriptors classifies barley seeds better than using traditional descriptors alone. This improvement suggests that TDA is a powerful complement to traditional morphometrics for comprehensively describing a multitude of “hidden” shape nuances that are otherwise not detected.

2021 ◽  
Author(s):  
Erik J. Amézquita ◽  
Michelle Y. Quigley ◽  
Tim Ophelders ◽  
Jacob B. Landis ◽  
Daniel Koenig ◽  
...  

Abstract Shape plays a fundamental role in biology. Traditional phenotypic analysis methods measure some features but fail to measure the information embedded in shape comprehensively. To extract, compare, and analyze this information in a robust and concise way, we turn to Topological Data Analysis (TDA), specifically the Euler Characteristic Transform (ECT). TDA measures shape comprehensively using mathematical terms based on algebraic topology features. To study its use, we compute both traditional and topological shape descriptors to quantify the morphology of 3121 barley seeds scanned with X-ray Computed Tomography (CT) technology at 127 micron resolution. The ECT measures shape by analyzing topological features of an object at thresholds across a number of directional axes. We optimize the number of directions and thresholds for classification to 158 and 8 respectively, creating vectors of length 1264 that are topological signatures for each barley seed. Using these vectors, we successfully train a support vector machine to classify 28 different accessions of barley based on the 3D shape of their grains. We observe that combining both traditional and topological descriptors classifies barley seeds to their correct accession better than using traditional descriptors alone. This improvement suggests that TDA is a powerful complement to traditional morphometrics for comprehensively describing a multitude of shape nuances that are otherwise not picked up. Using TDA we can quantify aspects of phenotype that have remained “hidden” without its use, and the ECT opens the possibility of accurately reconstructing objects from their topological signatures.
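The directions-and-thresholds construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a small 2D simplicial complex (a square split into two triangles) rather than 3D voxel scans, evenly spaced directions on a circle rather than on a sphere, and hypothetical function names `euler_curve` and `ect`. For each direction, a simplex enters the sublevel set once its highest vertex is at or below the threshold, and the Euler characteristic is vertices minus edges plus triangles.

```python
import numpy as np

def euler_curve(vertices, edges, triangles, direction, thresholds):
    """Euler characteristic of the sublevel sets along one direction."""
    heights = vertices @ direction               # height of every vertex
    edge_heights = heights[edges].max(axis=1)     # an edge appears with its highest vertex
    tri_heights = heights[triangles].max(axis=1)  # same rule for triangles
    return np.array([
        int((heights <= t).sum())
        - int((edge_heights <= t).sum())
        + int((tri_heights <= t).sum())
        for t in thresholds
    ])

def ect(vertices, edges, triangles, directions, thresholds):
    """Concatenate the Euler curves over all directions into one vector."""
    return np.concatenate([
        euler_curve(vertices, edges, triangles, d, thresholds)
        for d in directions
    ])

# Toy shape (an assumption for illustration): unit square made of two triangles.
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
edges = np.array([[0, 1], [1, 2], [2, 3], [3, 0], [0, 2]])
triangles = np.array([[0, 1, 2], [0, 2, 3]])

# One direction (the x axis) and four thresholds: the curve is [0, 1, 1, 1].
curve = euler_curve(vertices, edges, triangles,
                    np.array([1.0, 0.0]), [-0.5, 0.0, 0.5, 1.0])

# 158 directions and 8 thresholds, the counts quoted in the abstract.
angles = np.linspace(0.0, 2 * np.pi, 158, endpoint=False)
directions = np.column_stack([np.cos(angles), np.sin(angles)])
thresholds = np.linspace(-1.5, 1.5, 8)
signature = ect(vertices, edges, triangles, directions, thresholds)
```

With 158 directions and 8 thresholds the concatenated vector has 158 × 8 = 1264 entries, which matches the signature length quoted in the abstract.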


2019 ◽  
Author(s):  
Edgar Amorim ◽  
Rodrigo A. Moreira ◽  
Fernando A N Santos

In this work, we use methods and concepts of applied algebraic topology to comprehensively explore topological phase transitions in complex systems. Topological phase transitions are characterized by the zeros of the Euler characteristic (EC) or by singularities of the Euler entropy and also indicate sign changes in the mean node curvature of networks. Here, we provide strong evidence that the zeros of the Euler characteristic can be interpreted as a complex network’s intrinsic fingerprint. We theoretically and empirically illustrate this across different biological networks: We first target our investigation to protein-protein interaction networks (PPIN). To do so, we used methods of topological data analysis to compute the Euler characteristic analytically, and the Betti numbers numerically as a function of the attachment probability for two variants of the Duplication Divergence model, namely the totally asymmetric model and the heterodimerization model. We contrast our theoretical results with experimental data freely available for gene co-expression networks (GCN) of Saccharomyces cerevisiae, also known as baker’s yeast, as well as of the nematode Caenorhabditis elegans. Supporting our theoretical expectations, we are able to detect topological phase transitions in both networks obtained according to different similarity measures. Later, we theoretically illustrate the emergence of topological phase transitions in three classical network models, namely the Watts-Strogatz model, the Random Geometric Graph, and the Barabasi-Albert model. Given the universality and wide use of those models across disciplines, our results indicate that topological phase transitions may permeate across a wide range of theoretical and empirical networks. Hereby, our paper reinforces the idea of using topological phase transitions to advance the understanding of complex systems more generally.
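As a rough illustration of what the Euler characteristic of a network can mean, the sketch below assumes the network is turned into its clique complex, so that every set of k mutually connected nodes counts as one (k−1)-dimensional simplex; the abstract does not state this construction, so treat it as an assumption. The Euler characteristic is then the alternating sum of clique counts. The helper names `adjacency`, `count_cliques`, and `euler_characteristic` are hypothetical, and the brute-force enumeration is only suitable for small graphs.

```python
def adjacency(nodes, edges):
    """Build {node: set(neighbours)} for an undirected graph."""
    adj = {v: set() for v in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def count_cliques(adj):
    """Return {k: number of cliques on k nodes}, each clique counted once."""
    counts = {}

    def extend(size, candidates):
        # The clique built so far has `size` nodes; count it, then try to grow it.
        counts[size] = counts.get(size, 0) + 1
        for v in sorted(candidates):
            # Only keep larger-labelled common neighbours so that each
            # clique is generated in increasing node order exactly once.
            extend(size + 1, {u for u in candidates & adj[v] if u > v})

    for v in sorted(adj):
        extend(1, {u for u in adj[v] if u > v})
    return counts

def euler_characteristic(adj):
    """Alternating sum: nodes - edges + triangles - 4-cliques + ..."""
    return sum((-1) ** (k - 1) * n for k, n in count_cliques(adj).items())

# A 4-cycle is a loop: 4 nodes - 4 edges = 0.
cycle = adjacency(range(4), [(0, 1), (1, 2), (2, 3), (3, 0)])
chi_cycle = euler_characteristic(cycle)

# The complete graph K4 is a filled tetrahedron: 4 - 6 + 4 - 1 = 1.
k4 = adjacency(range(4), [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)])
chi_k4 = euler_characteristic(k4)
```

Tracking this number while a model parameter or an edge-weight threshold is varied gives a curve whose zero crossings can be read off directly.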


2021 ◽  
Vol 83 (3) ◽  
Author(s):  
Maria-Veronica Ciocanel ◽  
Riley Juenemann ◽  
Adriana T. Dawes ◽  
Scott A. McKinley

Abstract In developmental biology as well as in other biological systems, emerging structure and organization can be captured using time-series data of protein locations. In analyzing this time-dependent data, it is a common challenge not only to determine whether topological features emerge, but also to identify the timing of their formation. For instance, in most cells, actin filaments interact with myosin motor proteins and organize into polymer networks and higher-order structures. Ring channels are examples of such structures that maintain constant diameters over time and play key roles in processes such as cell division, development, and wound healing. Given the limitations in studying interactions of actin with myosin in vivo, we generate time-series data of protein polymer interactions in cells using complex agent-based models. Since the data has a filamentous structure, we propose sampling along the actin filaments and analyzing the topological structure of the resulting point cloud at each time. Building on existing tools from persistent homology, we develop a topological data analysis (TDA) method that assesses effective ring generation in this dynamic data. This method connects topological features through time in a path that corresponds to emergence of organization in the data. In this work, we also propose methods for assessing whether the topological features of interest are significant and thus whether they contribute to the formation of an emerging hole (ring channel) in the simulated protein interactions. In particular, we use the MEDYAN simulation platform to show that this technique can distinguish between the actin cytoskeleton organization resulting from distinct motor protein binding parameters.


Mathematics ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 634
Author(s):  
Tarek Frahi ◽  
Francisco Chinesta ◽  
Antonio Falcó ◽  
Alberto Badias ◽  
Elias Cueto ◽  
...  

We are interested in evaluating the state of drivers to determine whether they are attentive to the road or not by using motion sensor data collected from car driving experiments. That is, our goal is to design a predictive model that can estimate the state of drivers given the data collected from motion sensors. For that purpose, we leverage recent developments in topological data analysis (TDA) to analyze and transform the data coming from sensor time series and build a machine learning model based on the topological features extracted with the TDA. We provide some experiments showing that our model proves to be accurate in the identification of the state of the user, predicting whether they are relaxed or tense.


2021 ◽  
Author(s):  
Dong Quan Ngoc Nguyen ◽  
Phuong Dong Tan Le ◽  
Lin Xing ◽  
Lizhen Lin

Abstract Methods for analyzing similarities among DNA sequences play a fundamental role in computational biology, and have a variety of applications in public health and in the field of genetics. In this paper, a novel geometric and topological method for analyzing similarities among DNA sequences is developed, based on persistent homology from algebraic topology, in combination with chaos geometry in 4-dimensional space as a graphical representation of DNA sequences. Our topological framework for DNA similarity analysis is general, alignment-free, and can deal with DNA sequences of various lengths, while providing first-of-its-kind visualization features for visual inspection of DNA sequences directly, based on topological features of point clouds that represent DNA sequences. As an application, we test our methods on three datasets including genome sequences of different types of Hantavirus, Influenza A viruses, and Human Papillomavirus.


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0253851
Author(s):  
Grzegorz Graff ◽  
Beata Graff ◽  
Paweł Pilarczyk ◽  
Grzegorz Jabłoński ◽  
Dariusz Gąsecki ◽  
...  

Heart rate variability (HRV) is the physiological phenomenon of variation in the length of the time interval between consecutive heartbeats. In many cases it could be an indicator of the development of pathological states. The classical approach to the analysis of HRV includes time domain methods and frequency domain methods. However, attempts are still being made to define new and more effective HRV assessment tools. Persistent homology is a novel data analysis tool developed in recent decades that is rooted in algebraic topology. The Topological Data Analysis (TDA) approach focuses on examining the shape of the data in terms of connectedness and holes, and has recently proved to be very effective in various fields of research. In this paper we propose the use of persistent homology for HRV analysis. We recall selected topological descriptors used in the literature, introduce some new topological descriptors that reflect the specificity of HRV, and discuss their relation to the standard HRV measures. In particular, we show that this novel approach provides a collection of indices that might be at least as useful as the classical parameters in differentiating between series of beat-to-beat intervals (RR-intervals) in healthy subjects and patients suffering from a stroke episode.
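To make "connectedness" concrete for a series of RR-intervals, here is a minimal sketch of 0-dimensional sublevel-set persistence for a 1D sequence. It is an assumption that this is close to the filtration used in the paper; the abstract does not specify it. Each local minimum gives birth to a connected component as the threshold rises, and when two components meet at a local maximum the younger one dies (the usual elder rule). The function name and the sample values in milliseconds are hypothetical.

```python
def sublevel_persistence(values):
    """Return sorted (birth, death) pairs of connected components of the
    sublevel sets of a 1D sequence; the oldest component never dies."""
    n = len(values)
    parent = [-1] * n   # -1 marks samples not yet added to the filtration
    birth = {}          # birth value stored at each component root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    # Add samples from the lowest value to the highest.
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                # Elder rule: the component born later is the one that dies.
                young, old = (ri, rj) if birth[ri] > birth[rj] else (rj, ri)
                if birth[young] < values[i]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    pairs.append((min(values), float("inf")))
    return sorted(pairs)

# Hypothetical RR-intervals in milliseconds, made up for illustration.
rr = [800, 950, 700, 900, 850, 1000]
diagram = sublevel_persistence(rr)
# diagram == [(700, inf), (800, 950), (850, 900)]
```

Summary statistics of the finite lifetimes (death minus birth), such as their count, sum, or maximum, are one simple way to turn such a diagram into numerical indices.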


2021 ◽  
Vol 22 ◽  
Author(s):  
Jeaneth Machicao ◽  
Francesco Craighero ◽  
Davide Maspero ◽  
Fabrizio Angaroni ◽  
Chiara Damiani ◽  
...  

Background: The increasing availability of omics data collected from patients affected by severe pathologies, such as cancer, is fostering the development of data science methods for their analysis. Introduction: The combination of data integration and machine learning approaches can provide new powerful instruments to tackle the complexity of cancer development and deliver effective diagnostic and prognostic strategies. Methods: We explore the possibility of exploiting the topological properties of sample-specific metabolic networks as features in a supervised classification task. Such networks are obtained by projecting transcriptomic data from RNA-seq experiments on genome-wide metabolic models to define weighted networks modeling the overall metabolic activity of a given sample. Results: We show the classification results on a labeled breast cancer dataset from the TCGA database, including 210 samples (cancer vs. normal). In particular, we investigate how the performance is affected by a threshold-based pruning of the networks by comparing Artificial Neural Networks, Support Vector Machines and Random Forests. Interestingly, the best classification performance is achieved within a small threshold range for all methods, suggesting that it might represent an effective choice to recover useful information while filtering out noise from data. Overall, the best accuracy is achieved with SVMs, which exhibit performances similar to those obtained when gene expression profiles are used as features. Conclusion: These findings demonstrate that the topological properties of sample-specific metabolic networks are effective in classifying cancer and normal samples, suggesting that useful information can be extracted from a relatively limited number of features.


2021 ◽  
Vol 7 (2) ◽  
pp. 867-870
Author(s):  
Vinothini Selvaraju ◽  
Karthick Pa ◽  
Ramakrishnan Swaminathan

Abstract Detection of preterm birth (gestational week < 37) is a global priority as it causes major health problems to neonates. Assessment of uterine contractions (bursts) is required to detect and prevent the threat of preterm birth. Uterine electromyography (uEMG) is widely preferred to measure the uterine contractions noninvasively. These signals are nonstationary in nature, and such signals can be handled effectively by topological data analysis (TDA). Therefore, TDA can be used to explore the characteristics of uEMG burst signals. In this study, an attempt has been made to distinguish term (gestational week ≥ 37) and preterm conditions using time-frequency based topological features in uEMG burst signals. These signals are obtained from a publicly available online dataset. The annotated burst signals are segmented and subjected to a short-time Fourier transform. The real and imaginary parts of the transformed Fourier coefficients are plotted in the complex plane, and the envelope of the data points is computed using the alpha-shape technique. Four topological features, namely area, perimeter, circularity and ellipse variance, are extracted. These features are statistically analyzed. The coefficient of variation (CoV) is calculated to measure the inter-subject variations. The results show that the proposed method is able to discriminate between term and preterm conditions. Two of the extracted features, area and perimeter, exhibit a significant difference (p < 0.05) between these two conditions. The CoV of the perimeter is observed to be low, implying that this feature can handle inter-subject variations in burst signals. The extracted topological features are useful to analyze the characteristics of term and preterm pregnancies.
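Three of the four envelope features named in this abstract can be sketched for a closed polygon. The sketch assumes the envelope is already available as an ordered list of boundary points (real part, imaginary part); it does not compute an alpha shape and it omits ellipse variance. Circularity is taken here as 4π·area / perimeter², which is one common convention and may differ from the definition used in the paper. The function name `envelope_features` is hypothetical.

```python
import math

def envelope_features(boundary):
    """Area, perimeter and circularity of a closed polygon given as an
    ordered list of (x, y) boundary points."""
    n = len(boundary)
    twice_area = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = boundary[i]
        x1, y1 = boundary[(i + 1) % n]        # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0       # shoelace formula term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    area = abs(twice_area) / 2.0
    circularity = 4.0 * math.pi * area / perimeter ** 2  # 1.0 for a circle
    return area, perimeter, circularity

# Toy envelope: a unit square in the complex plane.
area, perimeter, circularity = envelope_features(
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
)
```

For the unit square this gives area 1, perimeter 4, and circularity π/4; elongated or irregular envelopes score lower on circularity under this convention.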

