Novel metric for hyperbolic phylogenetic tree embeddings

2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Hirotaka Matsumoto ◽  
Takahiro Mimori ◽  
Tsukasa Fukunaga

Abstract Advances in experimental technologies, such as DNA sequencing, have opened up new avenues for the applications of phylogenetic methods to various fields beyond their traditional application in evolutionary investigations, extending to the fields of development, differentiation, cancer genomics, and immunogenomics. Thus, the importance of phylogenetic methods is increasingly being recognized, and the development of a novel phylogenetic approach can contribute to several areas of research. Recently, the use of hyperbolic geometry has attracted attention in artificial intelligence research. Hyperbolic space can better represent a hierarchical structure compared to Euclidean space, and can therefore be useful for describing and analyzing a phylogenetic tree. In this study, we developed a novel metric that considers the characteristics of a phylogenetic tree for representation in hyperbolic space. We compared the performance of the proposed hyperbolic embeddings, general hyperbolic embeddings, and Euclidean embeddings, and confirmed that our method could be used to more precisely reconstruct evolutionary distance. We also demonstrate that our approach is useful for predicting the nearest-neighbor node in a partial phylogenetic tree with missing nodes. Furthermore, we proposed a novel approach based on our metric to integrate multiple trees for analyzing tree nodes or imputing missing distances. This study highlights the utility of adopting a geometric approach for further advancing the applications of phylogenetic methods.
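
For orientation, the "general hyperbolic embeddings" used as a baseline in the abstract above are commonly built on the Poincaré ball model. The sketch below shows only that standard Poincaré distance, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). It is not the novel metric proposed by the authors, which the abstract does not specify, and the function name `poincare_distance` is a hypothetical choice for illustration.

```python
import math


def poincare_distance(u, v):
    """Standard distance between two points inside the unit Poincare ball.

    Illustrative only: this is the generic hyperbolic distance, not the
    phylogeny-specific metric proposed in the paper.
    """
    def sq_norm(x):
        return sum(c * c for c in x)

    diff = sq_norm([a - b for a, b in zip(u, v)])
    denom = (1.0 - sq_norm(u)) * (1.0 - sq_norm(v))
    return math.acosh(1.0 + 2.0 * diff / denom)


# Distance from the origin to a point at Euclidean radius 0.5.
d = poincare_distance((0.0, 0.0), (0.5, 0.0))
```

Points near the boundary of the ball are far apart in this distance even when they look close in Euclidean terms, which is the property that makes hyperbolic space a natural fit for tree-like data.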

2020 ◽  
Author(s):  
Hirotaka Matsumoto ◽  
Takahiro Mimori ◽  
Tsukasa Fukunaga

Advances in experimental technologies such as DNA sequencing have opened up new avenues for the applications of phylogenetic methods to various fields beyond their traditional application in evolutionary investigations, extending to the fields of development, differentiation, cancer genomics, and immunogenomics. Thus, the importance of phylogenetic methods is increasingly being recognized, and the development of a novel phylogenetic approach can contribute to several areas of research. Recently, the use of hyperbolic geometry has attracted attention in artificial intelligence research. Hyperbolic space can better represent a hierarchical structure compared to Euclidean space, and can therefore be useful for describing and analyzing a phylogenetic tree. In this study, we developed a novel metric that considers the characteristics of a phylogenetic tree for representation in hyperbolic space. We compared the performance of the proposed hyperbolic embeddings, general hyperbolic embeddings, and Euclidean embeddings, and confirmed that our method could be used to more precisely reconstruct evolutionary distance. We also demonstrate that our approach is useful for predicting the nearest-neighbor node in a partial phylogenetic tree with missing nodes. This study highlights the utility of adopting a geometric approach for further advancing the applications of phylogenetic methods. The demo code is attached as a supplementary file in a compiled Jupyter notebook. The code used for analyses is available on GitHub at https://github.com/hmatsu1226/HyPhyTree.


2020 ◽  
Author(s):  
Aras Masood Ismael ◽  
Ömer F Alçin ◽  
Karmand H Abdalla ◽  
Abdulkadir k sengur

Abstract In this paper, a novel approach based on two-step majority voting is proposed for efficient EEG-based emotion classification. Emotion recognition is important for human-machine interaction. Approaches based on facial features and body gestures have generally been proposed for emotion recognition. Recently, EEG-based approaches have become more popular. In the proposed approach, the raw EEG signals are first low-pass filtered for noise removal, and band-pass filters are then used to extract the rhythms. For each rhythm, the best-performing EEG channels are determined based on wavelet-based entropy features and fractal-dimension-based features. The k-nearest neighbor (KNN) classifier is used for classification. In the first majority voting step, the best five EEG channels are used to obtain the prediction for each EEG rhythm. In the second majority voting step, the predictions from all rhythms are combined into a final prediction. The DEAP dataset is used in the experiments, and classification accuracy, sensitivity, and specificity are used as performance evaluation metrics. The experiments classify emotions in two binary tasks: high valence (HV) vs low valence (LV) and high arousal (HA) vs low arousal (LA). The experiments show that 86.3% HV vs LV discrimination accuracy and 85.0% HA vs LA discrimination accuracy are obtained. The obtained results are also compared with some existing methods. The comparisons show that the proposed method has potential for EEG-based emotion classification.
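
The two voting steps described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the rhythm names (`theta`, `alpha`, `beta`), the per-channel labels, and the function names are all hypothetical, and the per-channel KNN predictions are assumed to be already available.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def two_step_vote(channel_predictions):
    """Step 1: vote across the selected channels within each rhythm.
    Step 2: vote across the per-rhythm results.

    channel_predictions maps a rhythm name to the list of labels predicted
    from each of its selected channels (hypothetical input format).
    """
    per_rhythm = {
        rhythm: majority_vote(labels)
        for rhythm, labels in channel_predictions.items()
    }
    final = majority_vote(list(per_rhythm.values()))
    return per_rhythm, final


# Made-up predictions from five channels per rhythm (HV = high valence).
example = {
    "theta": ["HV", "HV", "LV", "HV", "LV"],
    "alpha": ["LV", "LV", "LV", "HV", "HV"],
    "beta": ["HV", "HV", "HV", "LV", "HV"],
}
per_rhythm, final = two_step_vote(example)
```

Using an odd number of channels per rhythm, as the abstract does with five, avoids ties in the first step for a binary task.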


2013 ◽  
pp. 363-380
Author(s):  
Horst Bunke ◽  
Kaspar Riesen

The domain of graphs contains only little mathematical structure. That is, most of the basic mathematical operations, actually required by many standard computer vision and pattern recognition algorithms, are not available for graphs. One of the few mathematical concepts that has been successfully transferred from the vector space to the graph domain is distance computation between graphs, commonly referred to as graph matching. Yet, distance-based pattern recognition is basically limited to nearest-neighbor classification. The present chapter reviews a novel approach for graph embedding in vector spaces built upon the concept of graph matching. The key-idea of the proposed embedding method is to use the distances of an input graph to a number of training graphs, termed prototypes, as vectorial description of the graph. That is, all graph matching procedures proposed in the literature during the last decades can be employed in this embedding framework. The rationale for such a graph embedding is to bridge the gap between the high representational power and flexibility of graphs and the large amount of algorithms available for object representations in terms of feature vectors. Hence, the proposed framework can be considered a contribution towards unifying the domains of structural and statistical pattern recognition.
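
The key idea of this embedding, describing a graph by its distances to a set of prototype graphs, can be sketched as below. The distance used here (size of the symmetric difference of edge sets) is a deliberately simple stand-in chosen for illustration; the chapter refers to graph matching procedures in general, and any such distance could be plugged in. The names `edge_set_distance` and `embed` are hypothetical.

```python
def edge_set_distance(g1, g2):
    """Toy graph distance: number of edges present in exactly one graph.

    A stand-in for a real graph matching distance such as graph edit
    distance; graphs are represented as frozensets of (u, v) edges.
    """
    return len(g1 ^ g2)


def embed(graph, prototypes, distance=edge_set_distance):
    """Map a graph to the vector of its distances to each prototype."""
    return [distance(graph, p) for p in prototypes]


# Two prototype graphs on the nodes 0, 1, 2.
prototypes = [
    frozenset({(0, 1), (1, 2)}),          # path
    frozenset({(0, 1), (1, 2), (0, 2)}),  # triangle
]
g = frozenset({(0, 1), (0, 2)})
vector = embed(g, prototypes)
```

The resulting vector has one component per prototype, so any classifier that operates on feature vectors can then be applied to graphs.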


Author(s):  
Bisma Shah ◽  
Farheen Siddiqui

Others' opinions can be decisive when choosing among various options, especially when those choices involve valuable resources, such as spending time and money on products or services. Customers' reliance on their peers' past reviews on e-commerce websites or social media has drawn considerable interest to sentiment analysis, owing to the realization of its commercial and business benefits. Sentiment analysis can be applied to movie reviews, blogs, customer feedback, etc. This chapter presents a novel approach to sentiment analysis of movie reviews given by users on different websites. Challenges such as the presence of thwarted words, world knowledge, and subjectivity detection in sentiments are also addressed in this chapter. The results are validated using two supervised machine learning approaches, k-nearest neighbor and naive Bayes, both on a method of sentiment analysis that does not address the aforementioned challenges and on the proposed method with all challenges addressed. Empirical results show that the proposed method outperforms the one that leaves the challenges unaddressed.


2015 ◽  
Vol 11 (A29A) ◽  
pp. 209-209
Author(s):  
Bo Han ◽  
Hongpeng Ding ◽  
Yanxia Zhang ◽  
Yongheng Zhao

Abstract Catastrophic failure is a long-standing unsolved problem in most photometric redshift estimation approaches. In this study, we propose a novel approach that integrates the k-nearest-neighbor (KNN) and support vector machine (SVM) methods. Experiments based on the quasar sample from SDSS show that the fusion approach can significantly mitigate catastrophic failure and improve the accuracy of photometric redshift estimation.


Author(s):  
CHRISTOPHER LUTSKO

Abstract We prove a theorem describing the limiting fine-scale statistics of orbits of a point in hyperbolic space under the action of a discrete subgroup. Similar results have been proved only in the lattice case, with two recent infinite-volume exceptions by Zhang for Apollonian circle packings and certain Schottky groups. Our results hold for general Zariski dense, non-elementary, geometrically finite subgroups in any dimension. Unlike in the lattice case, orbits of geometrically finite subgroups do not necessarily equidistribute on the whole boundary of hyperbolic space; rather, they may equidistribute on a fractal subset. Understanding the behavior of these orbits near the boundary is central to Patterson–Sullivan theory and much further work. Our theorem characterises the higher order spatial statistics and thus addresses a very natural question. As a motivating example, our work applies to sphere packings (in any dimension) which are invariant under the action of such discrete subgroups. At the end of the paper, we show how this statistical characterization can be used to prove convergence of moments and to write down the limiting formula for the two-point correlation function and nearest neighbor distribution. Moreover, we establish a formula for the two-dimensional limiting gap distribution (and cumulative gap distribution), which also applies in the lattice case.


2020 ◽  
Vol 38 (6_suppl) ◽  
pp. 570-570
Author(s):  
Sergiusz Wesolowski ◽  
Roberto Nussenzveig ◽  
Victor Sacristan Santos ◽  
John Esther ◽  
Divyam Goel ◽  
...  

570 Background: AI is increasingly being used in clinical cancer genomics research. Probabilistic Graphical Models (PGMs) are AI algorithms that capture multivariate, multi-level dependencies in complex patterns in large datasets while retaining human interpretability. We hypothesize that PGMs can identify clinical and genomic features that correlate with IO response in patients (pts) with mUC. Methods: In this retrospective study, eligibility criteria were: diagnosis of mUC, receipt of IO for mUC, and comprehensive genomic profiling data available from CLIA certified labs. The Bayesian Network (BN, PGM based AI) was used to discover clinical characteristics and selected genomic alterations relevant to IO response by RECIST 1.1 (investigator assessed). Results: Overall, 95 pts (73 men) with mUC were evaluated. 45 (47%) were ever smokers. The presented BN correctly captured the clinical landscape of mUC, explaining significant relationships between included variables (p<0.0001). Ever smokers and pts with de novo metastasis had higher TMB and better response to IO. Inactivating MLL2 alterations were more prevalent in non-smokers, and negatively correlated with response to IO. FGFR3 alterations did not predict response to IO. Significant associations are presented in Table. Conclusions: These hypothesis-generating data (by a novel approach, i.e. PGM based AI) showed that smoking and high TMB were associated with improved response to IO; in contrast, inactivating MLL2 alterations and visceral metastasis predicted inferior response. FGFR3 alterations did not correlate with response. This model validated previous findings and found new hypothesis-generating relationships, such as that with the altered MLL2 gene; external validation is needed. [Table: see text]


Author(s):  
SHITALA PRASAD ◽  
GYANENDRA K. VERMA ◽  
BHUPESH KUMAR SINGH ◽  
PIYUSH KUMAR

This paper proposes a novel approach for feature extraction based on the segmentation and morphological alteration of handwritten multi-lingual characters. We explored multi-resolution and multi-directional transforms, such as the wavelet, curvelet, and ridgelet transforms, to extract classifying features from handwritten multi-lingual images. We discuss the pros and cons of each multi-resolution algorithm and conclude that curvelet-based feature extraction is the most promising for multi-lingual character recognition. We also applied morphological operations such as thinning and thickening, after which feature-level fusion is performed to create a robust feature vector for classification. Classification is performed with K-nearest neighbor (K-NN) and support vector machine (SVM) classifiers, and their relative performance is reported. We experiment with our in-house dataset, compiled in our lab by more than 50 personnel.

