Isomorphic Unordered Labeled Trees up to Substitution Ciphering

Author(s):  
Florian Ingels ◽  
Romain Azaïs
2009 ◽  
Vol 57 (1) ◽  
pp. 1-33 ◽  
Author(s):  
Paolo Ferragina ◽  
Fabrizio Luccio ◽  
Giovanni Manzini ◽  
S. Muthukrishnan

Author(s):  
Saverio Caminiti ◽  
Irene Finocchi ◽  
Rossella Petreschi

2008 ◽  
pp. 3235-3251
Author(s):  
Yongqiao Xiao ◽  
Jenq-Foung Yao ◽  
Guizhen Yang

Recent years have witnessed a surge of research interest in knowledge discovery from data domains with complex structures, such as trees and graphs. In this paper, we address the problem of mining maximal frequent embedded subtrees, which is motivated by important applications such as mining “hot” spots of Web sites from Web usage logs and discovering significant “deep” structures in tree-like bioinformatic data. One major challenge is that embedded subtrees are no longer ordinary subtrees: they preserve only part of the ancestor-descendant relationships in the original trees. To solve the embedded subtree mining problem, we propose a novel algorithm, called TreeGrow, which is optimized in two important respects. First, it obtains frequency counts of root-to-leaf paths through efficient compression of trees, so that it can quickly grow an embedded subtree pattern path by path instead of node by node. Second, candidate subtree generation is highly localized so as to avoid unnecessary computational overhead. Experimental results on benchmark synthetic data sets show that our algorithm can outperform unoptimized methods by a factor of up to 20.
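To make the idea of frequency counts over root-to-leaf paths concrete, here is a minimal Python sketch. It is not TreeGrow and does not reproduce its tree compression; it only counts, for each root-to-leaf label path, how many trees in a collection contain it. The tree encoding `(label, [children])` and the names `root_to_leaf_paths` and `path_support` are assumptions made for this illustration.

```python
from collections import Counter

def root_to_leaf_paths(tree):
    """Return the set of root-to-leaf label paths of one tree.

    A tree is assumed to be a pair (label, [children]); this encoding is
    hypothetical and chosen only for the sketch.
    """
    label, children = tree
    if not children:                      # a leaf ends the path
        return {(label,)}
    paths = set()
    for child in children:
        for path in root_to_leaf_paths(child):
            paths.add((label,) + path)
    return paths

def path_support(forest):
    """Count in how many trees each root-to-leaf path occurs."""
    support = Counter()
    for tree in forest:
        # A set per tree means each path is counted at most once per tree.
        support.update(root_to_leaf_paths(tree))
    return support

# Two toy "site visit" trees, invented for the example.
forest = [
    ("home", [("news", [("sports", [])]), ("about", [])]),
    ("home", [("news", [("sports", [])]), ("shop", [])]),
]
support = path_support(forest)
```

In this toy input, the path `("home", "news", "sports")` occurs in both trees, while `("home", "about")` and `("home", "shop")` each occur in one. A path-by-path miner would start from the paths whose support meets a chosen threshold.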


BMC Genomics ◽  
2020 ◽  
Vol 21 (S10) ◽  
Author(s):  
Samuel Briand ◽  
Christophe Dessimoz ◽  
Nadia El-Mabrouk ◽  
Manuel Lafond ◽  
Gabriela Lobinska

Abstract
Background: The Robinson-Foulds (RF) distance is a well-established measure between phylogenetic trees. Despite a lack of biological justification, it has the advantages of being a proper metric and being computable in linear time. For phylogenetic applications involving genes, however, a crucial aspect of the trees ignored by the RF metric is the type of the branching event (e.g. speciation, duplication, transfer, etc.).
Results: We extend RF to trees with labeled internal nodes by including a node flip operation, alongside edge contractions and extensions. We explore properties of this extended RF distance in the case of a binary labeling. In particular, we show that contrary to the unlabeled case, an optimal edit path may require contracting “good” edges, i.e. edges shared between the two trees.
Conclusions: We provide a 2-approximation algorithm which is shown to perform well empirically. Looking ahead, computing distances between labeled trees opens up a variety of new algorithmic directions. Implementation and simulations available at https://github.com/DessimozLab/pylabeledrf.
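As a point of reference for the unlabeled case, the sketch below computes a clade-based RF distance between two rooted trees on the same leaf set: the number of clades found in exactly one of the two trees. It is a simplified illustration, not the labeled extension from the abstract and not the pylabeledrf API. The nested-tuple tree encoding and the names `clades` and `rf_distance` are assumptions, and this naive version does not achieve the linear running time mentioned above.

```python
def clades(tree):
    """Return (leaf_set, clades) for a rooted tree given as nested tuples.

    Leaves are plain strings and internal nodes are tuples of children;
    this encoding is hypothetical and chosen only for the sketch.
    """
    if isinstance(tree, str):
        return frozenset([tree]), set()   # a single leaf is a trivial clade
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)                     # the clade rooted at this node
    return leaves, found

def rf_distance(t1, t2):
    """Number of clades present in exactly one of the two trees."""
    leaves1, clades1 = clades(t1)
    leaves2, clades2 = clades(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must share the same leaf set")
    return len(clades1 ^ clades2)         # symmetric difference

t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
distance = rf_distance(t1, t2)
```

Here the two trees differ only in the clades `{a, b}` and `{a, c}`, so the distance is 2. The extended distance described in the abstract would additionally account for differing labels on internal nodes.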


2007 ◽  
Vol 114 (7) ◽  
pp. 1357-1361 ◽  
Author(s):  
Seunghyun Seo ◽  
Heesung Shin
