Calculating PSSM probabilities with lazy dynamic programming

2005 ◽  
Vol 16 (1) ◽  
pp. 75-81 ◽  
Author(s):  
KETIL MALDE ◽  
ROBERT GIEGERICH

Position-specific scoring matrices are one way to represent approximate string patterns, which are commonly encountered in the field of bioinformatics. An important problem that arises with their application is calculating the statistical significance of matches. We review the currently most efficient algorithm for this task, and show how it can be implemented in Haskell, taking advantage of the built-in non-strictness of the language. The resulting program turns out to be an instance of dynamic programming, using lists rather than the typical dynamic programming matrix.
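To make the significance calculation concrete, here is a minimal sketch of the underlying idea: build the distribution of total scores column by column, then sum the tail at or above a threshold. It is an eager Python illustration, not the lazy Haskell program the abstract describes. The function names `score_distribution` and `p_value`, the integer scores, and the independent background letter frequencies are all assumptions made for the example.

```python
from collections import defaultdict


def score_distribution(pssm, background):
    """Return {score: probability} for a random sequence of len(pssm) letters.

    pssm       -- list of columns, each a dict mapping letter -> integer score
    background -- dict mapping letter -> probability (assumed to sum to 1)
    """
    dist = {0: 1.0}  # before any column is read, the score is 0 with certainty
    for column in pssm:
        nxt = defaultdict(float)
        # Extend every partial score by every possible letter at this column.
        for score, p in dist.items():
            for letter, q in background.items():
                nxt[score + column[letter]] += p * q
        dist = dict(nxt)
    return dist


def p_value(pssm, background, threshold):
    """Probability that a random sequence scores at least `threshold`."""
    dist = score_distribution(pssm, background)
    return sum(p for score, p in dist.items() if score >= threshold)


# Toy two-column matrix over a two-letter alphabet (hypothetical values).
toy_pssm = [{"A": 1, "C": 0}, {"A": 2, "C": -1}]
toy_background = {"A": 0.5, "C": 0.5}
print(p_value(toy_pssm, toy_background, 2))
```

Each column only needs the distribution from the previous column, which is why a list of per-column results can stand in for a full dynamic programming matrix.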

Author(s):  
Anggar Titis Prayitno

ABSTRACT  The Traveling Salesman Problem (TSP) is a combinatorial optimization problem: find the shortest possible route for a salesman who visits each city exactly once and returns to the starting city. The shortest route can be sought with the Cheapest Insertion Heuristics algorithm or with Dynamic Programming. The two algorithms differ in how efficiently they find the shortest route. Efficiency is judged by time complexity: the algorithm with the smallest time complexity is the most efficient. Based on the calculation result, the time complexity of the Cheapest Insertion Heuristics algorithm is and Dynamic Programming is .  Therefore, for  Cheapest Insertion Heuristics algorithm is a more efficient algorithm than Dynamic Programming in solving the TSP. Keywords: Traveling Salesman Problem, Cheapest Insertion Heuristics Algorithm, Dynamic Programming, and Algorithm time complexity.
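The two approaches compared above can be sketched side by side. This is a minimal illustration under stated assumptions, not the paper's implementation: the dynamic programming routine is the standard subset-based (Held-Karp) formulation, the heuristic is a naive cheapest insertion that starts from a single city, and the names `held_karp`, `cheapest_insertion`, and `tour_length` as well as the four-city distance matrix are hypothetical.

```python
from itertools import combinations


def held_karp(dist):
    """Exact shortest tour length by dynamic programming over city subsets."""
    n = len(dist)
    if n == 1:
        return 0
    # best[(mask, j)]: shortest path that starts at city 0, visits exactly
    # the cities in the bitmask `mask`, and ends at city j.
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)  # same subset without the end city j
                best[(mask, j)] = min(
                    best[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except city 0
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))


def cheapest_insertion(dist):
    """Heuristic tour: repeatedly insert the city that adds the least length."""
    tour = [0]
    remaining = set(range(1, len(dist)))
    while remaining:
        best = None
        for city in remaining:
            for i in range(len(tour)):
                a, b = tour[i], tour[(i + 1) % len(tour)]
                added = dist[a][city] + dist[city][b] - dist[a][b]
                if best is None or added < best[0]:
                    best = (added, city, i + 1)
        _, city, pos = best
        tour.insert(pos, city)
        remaining.remove(city)
    return tour


def tour_length(dist, tour):
    """Length of the closed tour, including the return to the first city."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


# Four cities on the corners of a unit square, Manhattan distances.
square = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
print(held_karp(square), tour_length(square, cheapest_insertion(square)))
```

As written, the dynamic programming routine does work proportional to n^2 * 2^n, while this naive insertion loop does work proportional to n^3. The heuristic is faster for large n but is not guaranteed to return an optimal tour.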


Author(s):  
Louis J. Cochrane ◽  
Derek Gatherer

The Needleman-Wunsch process is a classic tool in bioinformatics, being a dynamic programming algorithm that performs a pairwise alignment of two input biological sequences, either protein or nucleic acid. A distance matrix between the tokens used in the sequences is also required as input. The distance matrix is used to generate a positional pairwise similarity matrix between the input sequences, which is in turn used to generate a dynamic programming matrix. The best path through the dynamic programming matrix is navigated using a traceback procedure that maximises similarity, inserting gaps as necessary. Needleman-Wunsch can align either nucleic acids or proteins, which use alphabets of 4 and 20 tokens respectively. It can also be applied to any other kind of sequence for which distance matrices can be specified. Here, we apply it to chains of Pousseur’s Scambi electronic music fragments, of which there are 32, and which Pousseur categorised by their sonic properties, thus permitting the consecutive construction of distance, similarity and dynamic programming matrices. Traceback through the dynamic programming matrix thus produces contrapuntal duet compositions in which two Scambi chains are played in the maximally euphonious manner, providing also an illustration of the principles of biological sequence alignment in sound.
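The fill-then-traceback procedure described above can be sketched as follows. This is a generic textbook version under stated assumptions, not the authors' code: the `similarity` argument is a caller-supplied function (in the paper's setting it would be derived from the distance matrix), the linear `gap` penalty and the function name `needleman_wunsch` are hypothetical, and gaps are represented by `None` so that tokens can be any objects, not just letters.

```python
def needleman_wunsch(a, b, similarity, gap=-1):
    """Globally align sequences a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(a[i - 1], b[j - 1]),  # pair tokens
                score[i - 1][j] + gap,  # gap in b
                score[i][j - 1] + gap,  # gap in a
            )
    # Traceback from the bottom-right corner, following one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and score[i][j] == score[i - 1][j - 1] + similarity(a[i - 1], b[j - 1])):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], out_a[::-1], out_b[::-1]


def match_score(x, y):
    """Toy similarity: +1 for identical tokens, -1 otherwise (an assumption)."""
    return 1 if x == y else -1


print(needleman_wunsch("ACGT", "AGT", match_score))
```

Because the routine only compares tokens through `similarity`, the same code would apply to nucleotides, amino acids, or labelled music fragments, provided a similarity value is defined for every pair of tokens.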


1999 ◽  
Vol 9 (4) ◽  
pp. 373-382
Author(s):  
Jacques D. Retief ◽  
Kevin R. Lynch ◽  
William R. Pearson

We have developed a rapid visual method for identifying novel members of gene families. Starting with an evolutionary tree, 20–50 protein query sequences for a gene family are selected from different branches of the tree. These query sequences are used to search the GenBank and expressed sequence tag (EST) DNA databases and their nightly updates using the tfastx3 or tfasty3 programs. The results of all 20–50 searches are collated and resorted to highlight EST or genomic sequences that share significant similarity with the query sequences. The statistical significance of each DNA/protein alignment is plotted, highlighting the portion of the query sequence that is present in the database sequence and the percent identity in the aligned region. The collated results for database sequences are linked using the WWW to the underlying scores and alignments; these links can also be used to perform additional searches to characterize the novel sequence further. With traditional “deep” scoring matrices (BLOSUM50) one can search for previously unrecognized families of large protein superfamilies. Alternatively, by using query sequences and EST libraries from the same species (e.g., human or mouse) together with “shallow” scoring matrices and filters that remove high-identity sequences, one can highlight new paralogs of previously described subfamilies. Using query sequences from the glutathione transferase superfamily, we identified two novel mammalian glutathione transferase families that were recognized previously only in plants. Using query sequences from known mammalian glutathione transferase subfamilies, we identified new candidate paralogs from the mouse class-mu, class-pi, and class-theta families.


BIOPHYSICS ◽  
2018 ◽  
Vol 63 (3) ◽  
pp. 311-317 ◽  
Author(s):  
S. N. Petrov ◽  
L. A. Uroshlev ◽  
A. S. Kasyanov ◽  
V. Yu. Makeev

2021 ◽  
Vol 9 (09) ◽  
pp. 86-90
Author(s):  
Sakirudeen A. Abdulsalaam ◽  

We developed an efficient algorithm that generates an optimal stope layout for an underground mine. After a mining site has been identified and explored, the data gathered are analysed and a modelling technique is applied to produce an ore body model. The ore body is divided into thousands of mining blocks in three dimensions, and each block is assigned a value per tonne. The miners want a stope layout that maximizes the value of the mine. In this paper, we present a fast algorithm that generates such a stope layout without violating the physical constraints.

