Counting, Generating, Analyzing and Sampling Tree Alignments

2018, Vol 29 (05), pp. 741-767
Author(s): Cedric Chauve, Julien Courtiel, Yann Ponty

Pairwise ordered tree alignments are combinatorial objects that appear in important applications, such as RNA secondary structure comparison. However, the usual representation of tree alignments as supertrees is ambiguous, i.e. two distinct supertrees may induce identical sets of matches between identical pairs of trees. This ambiguity is uninformative, and detrimental to any probabilistic analysis. In this work, we consider tree alignments up to equivalence. Our first result is a precise asymptotic enumeration of tree alignments, obtained from a context-free grammar by means of basic analytic combinatorics. Our second result focuses on alignments between two given ordered trees [Formula: see text] and [Formula: see text]. By refining our grammar to align specific trees, we obtain a decomposition scheme for the space of alignments, and use it to design an efficient dynamic programming algorithm for sampling alignments under the Gibbs-Boltzmann probability distribution. This generalizes existing tree alignment algorithms, and opens the door to a probabilistic analysis of the space of suboptimal alignments.
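
For reference, the Gibbs-Boltzmann distribution mentioned in this abstract can be written as follows. The notation is an assumption made here for illustration and does not come from the abstract: $\mathcal{A}$ denotes the set of alignments (up to equivalence) between the two given trees, $E(A)$ the cost of an alignment $A$, $kT$ a temperature-like scaling constant, and $Z$ the partition function.

```latex
\mathbb{P}(A) \;=\; \frac{e^{-E(A)/kT}}{Z},
\qquad
Z \;=\; \sum_{A' \in \mathcal{A}} e^{-E(A')/kT}
```

Under this definition each alignment must contribute exactly one term to $Z$. If the same set of matches were represented by several supertrees, it would be counted several times and its probability inflated, which is one way to see why the ambiguity is described as detrimental to probabilistic analysis.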

Author(s): Longjian Gao, Chengzhen Xu, Wangan Song, Feng Xiao, Xiaomin Wu, ...

Background: With the increasing application and development of high-throughput sequencing, knowledge of the primary structure of RNA has expanded exponentially. Moreover, the function of RNA is determined by its secondary or higher-order structure, and similar structures are related to similar functions, as with the cloverleaf secondary structure of tRNA. Therefore, RNA structure alignment is an important subject in computational biology and bioinformatics for accurately predicting function. However, traditional RNA structure alignment algorithms have drawbacks such as high complexity and easy loss of secondary structure information. Objective: To study RNA secondary structure alignment in light of the shortcomings of existing secondary structure alignment algorithms and the characteristics of RNA secondary structure. Method: We propose a new digital sequence RNA structure representation algorithm named “DSARna”. Then, based on a dynamic programming algorithm, the scoring matrix and binary path matrix are constructed simultaneously. The backtracking path is identified in the path matrix, and the optimal result is predicted according to the path length. Conclusions: In an experimental comparison with the existing SimTree algorithm, the proposed method showed higher accuracy, with improved specificity, sensitivity, and Matthews correlation coefficient, and could ensure that structural information is not easily lost.
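
The general pattern described in the Method section (fill a scoring matrix and a path matrix together, then backtrack through the path matrix) can be sketched as below. This is a generic global alignment over integer sequences, not the DSARna algorithm itself: the function name `align_with_traceback`, the scoring values, the three-way move labels, and the use of plain integer lists as a stand-in for the "digital sequence" representation are all assumptions made for illustration.

```python
def align_with_traceback(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences; return (score, list of aligned pairs).

    Hypothetical illustration: `a` and `b` are any sequences of comparable
    items (here, integers). A pair (x, None) means x is aligned to a gap.
    """
    n, m = len(a), len(b)
    # score[i][j]: best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # path[i][j]: the move that produced score[i][j]
    path = [[None] * (m + 1) for _ in range(n + 1)]

    # Borders: the only way to align a prefix with nothing is all gaps.
    for i in range(1, n + 1):
        score[i][0], path[i][0] = i * gap, "up"
    for j in range(1, m + 1):
        score[0][j], path[0][j] = j * gap, "left"

    # Fill both matrices in the same pass.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + s, "diag"),
                (score[i - 1][j] + gap, "up"),
                (score[i][j - 1] + gap, "left"),
            ]
            score[i][j], path[i][j] = max(candidates, key=lambda c: c[0])

    # Backtrack from the bottom-right corner to the origin.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        move = path[i][j]
        if move == "diag":
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif move == "up":
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    print(align_with_traceback([1, 2, 3], [1, 3]))
```

Storing the chosen move alongside each score keeps the backtracking step a simple walk through `path`, at the cost of a second matrix of the same size.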


2017
Author(s): Michela Quadrini, Luca Tesei, Emanuela Merelli

The methods proposed in the literature for RNA comparison focus mainly on pseudoknot-free structures. The comparison of pseudoknotted structures is still a challenge. In this work, we propose a new algebraic representation of RNA secondary structures based on relations among hairpins in terms of nesting, crossing, and concatenation. This algebraic representation is obtained from a multiple context-free grammar, which maps any kind of RNA secondary structure into an extended tree, i.e., an ordered tree whose internal nodes are labeled with algebraic operators and whose leaves are labeled with loops. These extended trees permit the definition of RNA secondary structure comparison as a tree alignment problem.


2015
Author(s): Tomáš Flouri, Kassian Kobert, Torbjørn Rognes, Alexandros Stamatakis

While implementing the algorithm, we discovered two mathematical mistakes in Gotoh's paper that induce sub-optimal sequence alignments. First, there are minor indexing mistakes in the dynamic programming algorithm, which become apparent immediately when implementing the procedure; we report on these for the sake of completeness. Second, there is a more profound problem with the initialization of the dynamic programming matrix. This initialization issue can easily be missed and find its way into actual implementations. The error is also present in standard textbooks, namely the widely used books by Gusfield and Waterman. To obtain an initial estimate of the extent to which this error has propagated, we scrutinized freely available undergraduate lecture slides. We found that 8 out of 31 lecture slides contained the mistake, while 16 out of 31 simply omitted parts of the initialization, thus giving an incomplete description of the algorithm. Finally, by inspecting ten source codes and running respective tests, we found that five implementations were incorrect. Note that not all bugs we identified are due to the mistake in Gotoh's paper. Three implementations rely on additional constraints that limit generality. Thus, only two out of ten yield correct results. We show that the error introduced by Gotoh is straightforward to resolve and provide a correct open-source reference implementation. We believe, though, that raising awareness of these errors is critical: incorrect pairwise sequence alignments, which typically represent one of the very first stages in any bioinformatics data analysis pipeline, can have a detrimental impact on downstream analyses such as multiple sequence alignment, orthology assignment, phylogenetic analyses, divergence time estimates, etc.
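
To make the initialization concern concrete, here is a minimal sketch of global alignment scoring with affine gap penalties. It uses a common three-matrix formulation and is neither the authors' reference implementation nor a reproduction of Gotoh's original recurrences. The function name `affine_global_score`, the default scoring values, and the convention that a gap of length k costs `gap_open + (k - 1) * gap_extend` are assumptions made for illustration. What it shows is that border cells standing for impossible states are set explicitly to negative infinity instead of being left at zero or omitted.

```python
NEG_INF = float("-inf")


def affine_global_score(a, b, match=1, mismatch=-1, gap_open=3, gap_extend=1):
    """Best global alignment score of strings a and b with affine gaps.

    Assumed convention (not from the source): a gap of length k costs
    gap_open + (k - 1) * gap_extend.
    """
    n, m = len(a), len(b)
    # M: alignment ends with a[i-1] paired with b[j-1]
    # X: alignment ends with a[i-1] paired with a gap
    # Y: alignment ends with b[j-1] paired with a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]

    # Initialization: every cell starts at -infinity, and only the states
    # that are actually reachable on the borders receive a finite value.
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            # Opening a gap pays gap_open; continuing one pays gap_extend.
            X[i][j] = max(M[i - 1][j] - gap_open,
                          X[i - 1][j] - gap_extend,
                          Y[i - 1][j] - gap_open)
            Y[i][j] = max(M[i][j - 1] - gap_open,
                          Y[i][j - 1] - gap_extend,
                          X[i][j - 1] - gap_open)

    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    print(affine_global_score("ACGT", "AT"))
```

Starting all three matrices at negative infinity means a forgotten border case fails loudly (the score stays infinite) rather than silently producing a plausible but sub-optimal alignment score.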


Author(s): Jin Yu, Pengfei Shen, Zhao Wang, Yurun Song, Xiaohan Dong

Heavy-duty vehicles, especially special vehicles such as wheel loaders and sprinklers, generally work under drastic changes in load. With a conventional hydraulic mechanical transmission, they face problems such as low efficiency and high fuel consumption. Some scholars have focused on solving these issues. However, few optimal strategies take into account the fluctuation caused by speed ratio changes, which can also cause many problems. In this study, a novel speed regulation strategy is proposed which can not only solve the problems above but also overcome the impact caused by speed ratio change. First, building on earlier research on the Compound Coupled Hydro-mechanical Transmission (CCHMT), the basic characteristics of the CCHMT are analyzed. Then, a dynamic programming (DP) algorithm is used to formulate a basic speed regulation strategy under a specific operating condition. To reduce the problems caused by speed ratio change, a new optimization is applied. The results indicate that the proposed DP optimal speed regulation strategy reduces fuel consumption by up to 1.16% and 6.66% in the JN1015 driving cycle and the ECE R15 working condition, respectively, and smooths the fluctuation of the speed ratio by up to 12.65% and 19.01% in those two driving cycles, respectively. The process of determining the speed regulation strategy can provide a new method for formulating CCHMT control strategies under different operating conditions, particularly under real-world conditions.

