Cluster Analysis of Haze Episodes Based on Topological Features

2020 ◽  
Vol 12 (10) ◽  
pp. 3985
Author(s):  
Nur Fariha Syaqina Zulkepli ◽  
Mohd Salmi Md Noorani ◽  
Fatimah Abdul Razak ◽  
Munira Ismail ◽  
Mohd Almie Alias

Severe haze episodes have periodically occurred in Southeast Asia, with adverse effects on Malaysia in particular. A technique called cluster analysis was used to analyze these occurrences. Traditional cluster analysis, in particular hierarchical agglomerative cluster analysis (HACA), was applied directly to the data sets. However, the data sets may contain hidden patterns that can be explored. In this paper, this underlying information was captured via persistent homology, a topological data analysis (TDA) tool that extracts topological features, including components, holes, and cavities, from the data sets. In particular, an improved version of HACA was proposed by combining HACA with persistent homology. Additionally, a comparative study between traditional HACA and improved HACA was carried out using data on particulate matter, the major pollutant recorded during haze episodes by the Klang, Petaling Jaya, and Shah Alam air quality monitoring stations. The effectiveness of the two clustering approaches was evaluated based on their ability to cluster the months according to the haze condition. The results showed that clustering based on topological features via the improved HACA approach grouped the months with severe haze more accurately than clustering without such features, and these results were consistent for all three locations.
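To make the idea of "components" as a topological feature concrete, the following is a minimal sketch of zero-dimensional persistent homology for a small point cloud. It is not the authors' pipeline: the function name `h0_death_times`, the toy point cloud, and the convention that a component dies at the length of the edge that merges it are all assumptions made for illustration. How the paper turns particulate matter readings into a point cloud is not stated in the abstract.

```python
import math
from itertools import combinations


def h0_death_times(points):
    """Finite death times of connected components (H0 bars).

    Every point is born at scale 0. Edges are added in order of
    increasing length; each edge that joins two separate components
    kills one of them at that length (illustrative convention).
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Hypothetical toy data: two nearby points and one distant point.
toy_cloud = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
deaths = h0_death_times(toy_cloud)
print(deaths)
```

A vector of such death times (possibly padded or summarized) is one example of a topological feature that could be passed to a clustering step in place of the raw measurements.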

2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Ali Nabi Duman ◽  
Harun Pirim

Persistent homology, a topological data analysis (TDA) method, is applied to microarray data sets. Although a few papers refer to TDA methods in microarray analysis, to the best of our knowledge persistent homology has not previously been used to compare several weighted gene coexpression networks (WGCNs). We calculate the persistent homology of weighted networks constructed from 38 Arabidopsis microarray data sets to test the relevance and the success of this approach in distinguishing the stress factors. We quantify multiscale topological features of each network using persistent homology and apply a hierarchical clustering algorithm to the distance matrix whose entries are the pairwise bottleneck distances between the networks. The immunoresponses to different stress factors are distinguishable by our method. The networks of similar immunoresponses are found to be close with respect to the bottleneck distance, indicating that the WGCNs have similar topological features. This computationally efficient technique for analyzing networks provides a quick test for advanced studies.
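The clustering step described above operates on a precomputed distance matrix rather than on raw data. A minimal sketch of that step follows. The matrix values are invented, the bottleneck distances themselves are assumed to have been computed elsewhere, and average linkage is an assumption, since the abstract does not say which linkage was used.

```python
def average_linkage_clusters(dist, k):
    """Agglomerative clustering on a symmetric distance matrix.

    Starts with one cluster per item and repeatedly merges the two
    clusters with the smallest average pairwise distance until only
    k clusters remain.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                average = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or average < best[0]:
                    best = (average, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical pairwise distances between four networks.
toy_distances = [
    [0.00, 0.10, 0.90, 0.80],
    [0.10, 0.00, 0.85, 0.90],
    [0.90, 0.85, 0.00, 0.20],
    [0.80, 0.90, 0.20, 0.00],
]
groups = average_linkage_clusters(toy_distances, k=2)
print(groups)
```

Because only the distance matrix is needed, the same routine applies to any dissimilarity between persistence diagrams, not just the bottleneck distance.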


2019 ◽  
Vol 55 (4) ◽  
pp. 631-670
Author(s):  
Daria Bębeniec ◽  
Małgorzata Cudna

In this article, we present a corpus-based analysis of two major types of the Polish Complete Path (CP) construction in which a source-PP, headed by od+GEN, is immediately followed by a goal-PP, headed by do+GEN or po+ACC, as in od jesieni 1920 do jesieni 1921 ‘from autumn 1920 to autumn 1921’ and od kreskówek po rysunki techniczne ‘from cartoons to technical drawings’. The aim of the study is to shed some light on the polysemous structure of the CP construction on the basis of its usage patterns. To this end, we used a random sample of over 500 instances of both construction types retrieved from the National Corpus of Polish. The data were annotated for a number of formal and semantic features and subsequently explored using hierarchical agglomerative cluster analysis. When interpreting the results of several analyses performed on different sets of variables, we gave special attention to three levels of semantic granularity encoded in the data, concluding that, on the whole, all analyses point towards a distinction between the spatial, temporal and abstract meanings of the construction under investigation.


2021 ◽  
Vol 9 ◽  
Author(s):  
Peter Tsung-Wen Yen ◽  
Siew Ann Cheong

In recent years, persistent homology (PH) and topological data analysis (TDA) have gained increasing attention in the fields of shape recognition, image analysis, data analysis, machine learning, computer vision, computational biology, brain functional networks, financial networks, haze detection, etc. In this article, we focus on stock markets and demonstrate how TDA can be useful in this regard. We first explain signatures that can be detected using TDA for three toy models of topological changes. We then show how to go beyond network concepts like nodes (0-simplices) and links (1-simplices), and the standard minimal spanning tree or planar maximally filtered graph picture of the cross correlations in stock markets, to work with faces (2-simplices) or any k-dimensional simplex in TDA. By scanning through a full range of correlation thresholds in a procedure called filtration, we were able to examine robust topological features (i.e., those less susceptible to random noise) in higher dimensions. To demonstrate the advantages of TDA, we collected time-series data from the Straits Times Index and the Taiwan Capitalization Weighted Stock Index (TAIEX), and then computed barcodes, persistence diagrams, persistent entropy, the bottleneck distance, Betti numbers, and the Euler characteristic. We found that during periods of market crashes, the homology groups become less persistent as we vary the characteristic correlation. For both markets, we found consistent signatures associated with market crashes in the Betti numbers, Euler characteristics, and persistent entropy, in agreement with our theoretical expectations.
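Two of the summaries named above are simple to state in code. The sketch below assumes a barcode is given as a list of finite (birth, death) pairs and that the cross correlations have already been converted to a distance matrix. That conversion, the function names, and the toy values are assumptions, not details taken from the article. Persistent entropy is the Shannon entropy of the normalized bar lengths, and the Euler characteristic here is truncated at 2-simplices (vertices minus edges plus triangles).

```python
import math
from itertools import combinations


def persistent_entropy(bars):
    """Shannon entropy of normalized bar lengths (finite bars only)."""
    lengths = [
        death - birth
        for birth, death in bars
        if math.isfinite(death) and death > birth
    ]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return -sum((l / total) * math.log(l / total) for l in lengths)


def euler_characteristic(dist, threshold):
    """Vertices - edges + triangles of the clique complex at one threshold.

    An edge is present when the pairwise distance is at most the
    threshold; a triangle is present when all three of its edges are.
    Simplices of dimension 3 and higher are ignored in this sketch.
    """
    n = len(dist)
    edges = {
        (i, j) for i, j in combinations(range(n), 2) if dist[i][j] <= threshold
    }
    triangles = sum(
        1
        for i, j, k in combinations(range(n), 3)
        if (i, j) in edges and (i, k) in edges and (j, k) in edges
    )
    return n - len(edges) + triangles


# Hypothetical distances between three stocks.
toy_dist = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
# Scanning thresholds mimics a filtration: the complex grows with the threshold.
curve = [euler_characteristic(toy_dist, t) for t in (0.5, 1.0)]
entropy = persistent_entropy([(0.0, 1.0), (0.0, 1.0)])
print(curve, entropy)
```

Evaluating `euler_characteristic` over a grid of thresholds gives a curve that can be compared across time windows, which is the kind of signature the abstract refers to.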


2021 ◽  
Author(s):  
Seungho Choe

Persistent homology is a powerful tool in topological data analysis (TDA) for efficiently computing, studying, and encoding multi-scale topological features, and it is increasingly used in digital image classification. The topological features represent the number of connected components, cycles, and voids that describe the shape of data. Persistent homology extracts the birth and death of these topological features through a filtration process. The lifespan of these features can be represented using persistence diagrams (topological signatures). Cubical homology is a more efficient method for extracting topological features from a 2D image and uses a collection of cubes to compute the homology, which fits the grid structure of digital images. In this research, we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures. Additionally, we propose a score that measures the significance of each of the sub-simplices in terms of persistence. Also, the gray level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) are used as supplementary methods for extracting features. Machine learning techniques are then employed to classify images using the topological signatures. Among the eight tested algorithms with six published image datasets with varying pixel sizes, classes, and distributions, our experiments demonstrate that cubical homology-based machine learning with a deep residual network (ResNet 1D) and Light Gradient Boosting Machine (lightGBM) shows promise with the extracted topological features.
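The birth and death of connected components in a grayscale image can be illustrated with a short sketch. It is not the algorithm proposed in this work. It assumes a sublevel-set filtration in which pixels enter in order of increasing intensity, treats pixels as vertices joined to their four axis-aligned neighbours (one common cubical convention), and tracks only components (dimension 0). Cycles would need a full cubical homology library. The function name and the toy image are hypothetical.

```python
def sublevel_h0_bars(image):
    """(birth, death) bars of connected components for a non-empty 2D image.

    Pixels are added from darkest to brightest. A new pixel starts a
    component; when two components meet, the younger one (larger birth
    value) dies at the current intensity. Zero-length bars are dropped.
    """
    rows, cols = len(image), len(image[0])
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent = {}  # union-find forest over pixels already added
    birth = {}   # birth value, valid at each component root

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    bars = []
    for value, r, c in order:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for q in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if q not in parent:
                continue
            older, younger = find(p), find(q)
            if older == younger:
                continue
            if birth[older] > birth[younger]:
                older, younger = younger, older
            if value > birth[younger]:
                bars.append((birth[younger], value))
            parent[younger] = older
    # The component born at the global minimum never dies.
    bars.append((order[0][0], float("inf")))
    return sorted(bars)


# Hypothetical one-row image with three local minima (0, 1 and 2).
toy_image = [[0, 3, 1, 4, 2]]
bars = sublevel_h0_bars(toy_image)
print(bars)
```

Each finite bar records how long a dark region stays separate before merging into an older one. A list of such bars is one possible input to the persistence diagrams mentioned above.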


2021 ◽  
Author(s):  
Gunnar Carlsson ◽  
Mikael Vejdemo-Johansson

The continued and dramatic rise in the size of data sets has meant that new methods are required to model and analyze them. This timely account introduces topological data analysis (TDA), a method for modeling data by geometric objects, namely graphs and their higher-dimensional versions: simplicial complexes. The authors outline the necessary background material on topology and data philosophy for newcomers, while more complex concepts are highlighted for advanced learners. The book covers all the main TDA techniques, including persistent homology, cohomology, and Mapper. The final section focuses on the diverse applications of TDA, examining a number of case studies drawn from monitoring the progression of infectious diseases to the study of motion capture data. Mathematicians moving into data science, as well as data scientists or computer scientists seeking to understand this new area, will appreciate this self-contained resource which explains the underlying technology and how it can be used.


2019 ◽  
Vol 10 (4) ◽  
pp. 861-889 ◽  
Author(s):  
Mariluz Fernandez-Alles ◽  
Juan Pablo Diánez-González ◽  
Tamara Rodríguez-González ◽  
Mercedes Villanueva-Flores

Purpose: The purpose of this paper is to analyze potentially significant differences in a series of relevant characteristics of universities’ technology transfer offices (TTOs). To this end, TTOs have been classified according to the resources they assign to the enhancement of university entrepreneurship. The factors analyzed are the number of academic spin-offs created with the support of TTOs as well as the TTOs’ age, experience, professionalization and relational capital. Design/methodology/approach: The authors have performed a hierarchical agglomerative cluster analysis to identify the groups of TTOs with homogeneous behavior and features. This multivariate technique makes it possible to determine whether differentiated conglomerates of TTOs can be identified. Findings: The results of the cluster analysis indicate that the number of academic spin-offs created with the support of TTOs, the age and degree of professionalization of these TTOs, the experience of their employees in matters related to entrepreneurship and their relationships with market actors explain the different levels of commitment of TTOs toward the enhancement of university entrepreneurship. Contrary to expectations, the relationship between TTOs and academic actors does not seem to explain such differences. Originality/value: This research contributes to the identification of the particular design characteristics that TTOs should exhibit to promote the entrepreneurial performance of universities, offering important recommendations to academic institutions regarding the efficient design of TTOs to manage university ambidexterity and to build TTOs’ entrepreneurial identity.

