Classification of apatite structures via topological data analysis: a framework for a ‘Materials Barcode’ representation of structure maps

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Scott Broderick ◽  
Ruhil Dongol ◽  
Tianmu Zhang ◽  
Krishna Rajan

This paper introduces the use of topological data analysis (TDA) as an unsupervised machine learning tool to uncover classification criteria in complex inorganic crystal chemistries. Using the apatite chemistry as a template, we use persistent homology to track how the topological connectivity of the input crystal chemistry descriptors defines similarity between different stoichiometries of apatites. It is shown that TDA automatically identifies a hierarchical classification scheme within apatites based on the commonality in the number of discrete coordination polyhedra that constitute the structural building units shared among the compounds. This information is presented as a barcode of homology classifications, a visualization scheme in which the persistence of similarity between compounds is tracked. Unlike traditional perspectives of structure maps, this new “Materials Barcode” schema serves as an automated exploratory machine learning tool that can uncover structural associations from crystal chemistry databases and provide a more nuanced insight into what defines similarity among homologous compounds.
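As a rough illustration of what a barcode records, the sketch below computes the 0-dimensional persistence bars of a small point cloud: every point starts as its own component at scale 0, and a bar ends at the distance where its component merges into another. This is a generic single-linkage construction, not the authors' pipeline; the `descriptors` coordinates are made-up values standing in for crystal chemistry descriptors.

```python
import math
from itertools import combinations


def h0_barcode(points):
    """Return 0-dimensional persistence bars (birth, death) of a point cloud.

    Components are merged in order of increasing pairwise distance,
    which is the H0 part of a Vietoris-Rips filtration.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    bars = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            bars.append((0.0, dist))  # one component dies at this scale
    bars.append((0.0, math.inf))  # the last component never dies
    return bars


# Hypothetical 2D descriptors: two well-separated groups of compounds.
descriptors = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
bars = h0_barcode(descriptors)
for birth, death in bars:
    print(f"[{birth:.2f}, {death:.2f})")
```

Short bars correspond to compounds that merge at a small scale; the one long finite bar reflects the gap between the two groups.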

2021 ◽  
Vol 30 (05) ◽  
pp. 2150025
Author(s):  
Chengyuan Wu ◽  
Carol Anne Hargreaves

Topological data analysis is a relatively new branch of machine learning that excels in studying high-dimensional data, and is theoretically known to be robust against noise. Meanwhile, data objects with mixed numeric and categorical attributes are ubiquitous in real-world applications. However, topological methods are usually applied to point cloud data, and to the best of our knowledge there is no available framework for the classification of mixed data using topological methods. In this paper, we propose a novel topological machine learning method for mixed data classification. In the proposed method, we use theory from topological data analysis such as persistent homology, persistence diagrams and Wasserstein distance to study mixed data. The performance of the proposed method is demonstrated by experiments on a real-world heart disease dataset. Experimental results show that our topological method outperforms several state-of-the-art algorithms in the prediction of heart disease.
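To make the Wasserstein distance between persistence diagrams concrete, here is a minimal brute-force sketch. It assumes the common convention of an L-infinity ground metric with unmatched points sent to the diagonal, and it only scales to very small diagrams; it does not reproduce the paper's handling of mixed data, and the function name `wasserstein_1` is illustrative.

```python
from itertools import permutations


def wasserstein_1(diag_a, diag_b):
    """1-Wasserstein distance between two small persistence diagrams.

    Each diagram is a list of (birth, death) pairs. Any point may be
    matched to the diagonal, represented here by None.
    """

    def to_diagonal(p):
        # L-infinity distance from (birth, death) to the diagonal.
        return (p[1] - p[0]) / 2.0

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if p is None:
            return to_diagonal(q)
        if q is None:
            return to_diagonal(p)
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    # Pad each side with enough diagonal slots to absorb the other side.
    left = list(diag_a) + [None] * len(diag_b)
    right = list(diag_b) + [None] * len(diag_a)

    # Exhaustive search over matchings: factorial cost, toy sizes only.
    return min(
        sum(cost(p, q) for p, q in zip(left, perm))
        for perm in permutations(right)
    )


diagram_a = [(0.0, 1.0), (0.0, 3.0)]
diagram_b = [(0.0, 3.5)]
print(wasserstein_1(diagram_a, diagram_b))
```

For diagrams of realistic size, an assignment solver or a dedicated TDA library would replace the exhaustive search.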


Author(s):  
Firas A. Khasawneh ◽  
Elizabeth Munch

This paper introduces a simple yet powerful approach based on topological data analysis for detecting true steps in a periodic, piecewise constant (PWC) signal. The signal is a two-state square wave with randomly varying in-between-pulse spacing, subject to spurious steps at the rising or falling edges which we call digital ringing. We use persistent homology to derive mathematical guarantees for the resulting change detection which enables accurate identification and counting of the true pulses. The approach is tested using both synthetic and experimental data obtained using an engine lathe instrumented with a laser tachometer. The described algorithm enables accurate and automatic calculations of the spindle speed without any choice of parameters. The results are compared with the frequency and sequency methods of the Fourier and Walsh–Hadamard transforms, respectively. Both our approach and the Fourier analysis yield comparable results for pulses with regular spacing and digital ringing while the latter causes large errors using the Walsh–Hadamard method. Further, the described approach significantly outperforms the frequency/sequency analyses when the spacing between the peaks is varied. We discuss generalizing the approach to higher dimensional PWC signals, although using this extension remains an interesting question for future research.


2018 ◽  
Vol 9 ◽  
Author(s):  
Mao Li ◽  
Hong An ◽  
Ruthie Angelovici ◽  
Clement Bagaza ◽  
Albert Batushansky ◽  
...  

2021 ◽  
Author(s):  
Anna Suzuki ◽  
Miyuki Miyazawa ◽  
James Minto ◽  
Takeshi Tsuji ◽  
Ippei Obayashi ◽  
...  

Topological data analysis is an emerging approach to data analysis for characterizing shapes. A state-of-the-art tool in topological data analysis is persistent homology, which is expected to summarize quantified topological and geometric features. Although persistent homology is useful for revealing topological and geometric information, its parameters are difficult to interpret on their own and difficult to relate directly to physical properties. In this study, we focus on the connectivity and apertures of flow channels detected by persistent homology analysis. We propose a method to estimate permeability in fracture networks from the parameters of persistent homology. Synthetic 3D fracture network patterns and their direct flow simulations are used for validation. The results suggest that persistent homology can estimate fluid flow in a fracture network from image data. This method can readily derive flow phenomena from structural information.


2021 ◽  
Vol 9 ◽  
Author(s):  
Peter Tsung-Wen Yen ◽  
Siew Ann Cheong

In recent years, persistent homology (PH) and topological data analysis (TDA) have gained increasing attention in the fields of shape recognition, image analysis, data analysis, machine learning, computer vision, computational biology, brain functional networks, financial networks, haze detection, etc. In this article, we focus on stock markets and demonstrate how TDA can be useful in this regard. We first explain signatures that can be detected using TDA, for three toy models of topological changes. We then show how to go beyond network concepts like nodes (0-simplex) and links (1-simplex), and the standard minimal spanning tree or planar maximally filtered graph picture of the cross correlations in stock markets, to work with faces (2-simplex) or any k-dim simplex in TDA. By scanning through a full range of correlation thresholds in a procedure called filtration, we were able to examine robust topological features (i.e. less susceptible to random noise) in higher dimensions. To demonstrate the advantages of TDA, we collected time-series data from the Straits Times Index and Taiwan Capitalization Weighted Stock Index (TAIEX), and then computed barcodes, persistence diagrams, persistent entropy, the bottleneck distance, Betti numbers, and Euler characteristic. We found that during the periods of market crashes, the homology groups become less persistent as we vary the characteristic correlation. For both markets, we found consistent signatures associated with market crashes in the Betti numbers, Euler characteristics, and persistent entropy, in agreement with our theoretical expectations.
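The filtration over correlation thresholds can be sketched on a toy example. The code below assumes that two stocks are linked when their correlation is at or above the threshold, builds the clique complex up to triangles (2-simplices), and reports the Betti-0 number and an Euler characteristic truncated at dimension 2. The matrix `corr` is invented for illustration and is not market data from the article.

```python
from itertools import combinations


def complex_summary(corr, threshold):
    """Summarize the clique complex (up to triangles) at one threshold."""
    n = len(corr)

    def linked(i, j):
        # Assumption: a link exists when correlation >= threshold.
        return corr[i][j] >= threshold

    edges = [(i, j) for i, j in combinations(range(n), 2) if linked(i, j)]
    triangles = [
        tri for tri in combinations(range(n), 3)
        if all(linked(a, b) for a, b in combinations(tri, 2))
    ]

    # Betti-0 = number of connected components, via union-find.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    betti0 = len({find(i) for i in range(n)})

    # Euler characteristic truncated at 2-simplices: V - E + T.
    euler = n - len(edges) + len(triangles)
    return {"betti0": betti0, "euler": euler}


# Hypothetical correlation matrix for four stocks.
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.3],
    [0.1, 0.2, 0.3, 1.0],
]
for threshold in (0.95, 0.75, 0.5, 0.0):
    print(threshold, complex_summary(corr, threshold))
```

Because simplices above dimension 2 are ignored, the Euler characteristic here differs from that of the full complex once four or more stocks are mutually linked.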


Author(s):  
Bartosz Zieliński ◽  
Michał Lipiński ◽  
Mateusz Juda ◽  
Matthias Zeppelzauer ◽  
Paweł Dłotko

Persistent homology (PH) is a rigorous mathematical theory that provides a robust descriptor of data in the form of persistence diagrams (PDs). PDs exhibit, however, complex structure and are difficult to integrate in today's machine learning workflows. This paper introduces persistence bag-of-words: a novel and stable vectorized representation of PDs that enables the seamless integration with machine learning. Comprehensive experiments show that the new representation achieves state-of-the-art performance and beyond in much less time than alternative approaches.


2021 ◽  
Vol 7 (2) ◽  
pp. 488-491
Author(s):  
Yashbir Singh ◽  
William Jons ◽  
Gian Marco Conte ◽  
Jaidip Jagtap ◽  
Kuan Zhang ◽  
...  

Primary sclerosing cholangitis (PSC) predisposes individuals to liver failure, but it is challenging for radiologists examining radiologic images to predict which patients with PSC will ultimately develop liver failure. Motivated by algebraic topology, a topological data analysis-inspired framework was adopted to study the imaging patterns of the “Early Decompensation” and “Not Early” groups. The results demonstrate that the proposed methodology discriminates between the “Early Decompensation” and “Not Early” groups. Our study is the first attempt to apply a topological representation-based method to distinguishing early hepatic decompensation from not-early groups.


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 770
Author(s):  
Matteo Rucco ◽  
Giovanna Viticchi ◽  
Lorenzo Falsetti

Glioblastoma multiforme (GBM) is a fast-growing and highly invasive brain tumor that tends to occur in adults between the ages of 45 and 70 and accounts for 52 percent of all primary brain tumors. GBMs are usually detected by magnetic resonance imaging (MRI). Among MRI sequences, the fluid-attenuated inversion recovery (FLAIR) sequence produces a high-quality digital representation of the tumor. Fast computer-aided detection and segmentation techniques are needed to overcome the subjectivity of medical doctors' (MDs') judgment. This study has three main novelties that demonstrate the role of topological features as a new set of radiomics features, which can be used as pillars of a personalized diagnostic system for GBM analysis from FLAIR. First, topological data analysis is used for the first time to analyze GBM from three complementary perspectives: tumor growth at the cell level, temporal evolution of GBM in the follow-up period, and GBM detection. The second novelty is the definition of a new Shannon-like topological entropy, the so-called Generator Entropy. The third novelty is the combination of topological and textural features for training automatic interpretable machine learning. These novelties are demonstrated by three numerical experiments. In the first, topological data analysis of a simplified 2D mathematical model of tumor growth made it possible to understand the biochemical conditions that facilitate tumor growth: the higher the concentration of chemical nutrients, the more virulent the process. In the second, topological data analysis was used to evaluate GBM temporal progression on FLAIR recorded within 90 days following treatment completion and at progression; this experiment confirmed that persistent entropy is a viable statistic for monitoring GBM evolution during the follow-up period. In the third experiment, we developed a novel methodology based on topological and textural features and automatic interpretable machine learning for automatic GBM classification on FLAIR. The algorithm reached a classification accuracy of up to 97%.
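Persistent entropy, mentioned above as a monitoring statistic, is usually defined as the Shannon entropy of the normalized bar lengths of a barcode. The sketch below follows that standard definition; dropping infinite bars is an assumption made here (some conventions cap them instead), and it does not implement the Generator Entropy introduced in the paper.

```python
import math


def persistent_entropy(bars):
    """Shannon entropy of normalized bar lengths in a persistence barcode.

    `bars` is a list of (birth, death) pairs. Infinite and zero-length
    bars are dropped, which is an assumption of this sketch.
    """
    lengths = [
        death - birth
        for birth, death in bars
        if math.isfinite(death) and death > birth
    ]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum(
        (length / total) * math.log(length / total) for length in lengths
    )


# Two equally long bars give the maximum entropy for two bars, log(2).
print(persistent_entropy([(0.0, 1.0), (0.0, 1.0)]))
# One dominant bar plus a short one gives a lower value.
print(persistent_entropy([(0.0, 10.0), (0.0, 0.1)]))
```

Entropy is high when all bars have similar lengths and low when a few long bars dominate.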

