PathRacer: racing profile HMM paths on assembly graph

2019 ◽  
Author(s):  
Alexander Shlemov ◽  
Anton Korobeynikov

Abstract: Recently, large databases containing profile hidden Markov models (pHMMs) have emerged. These pHMMs may represent the sequences of antibiotic resistance genes, allelic variations among highly conserved housekeeping genes used for strain typing, and so on. The typical application of such a database involves aligning contigs to a pHMM in the hope that the sequence of the gene of interest lies within a single contig. This condition is often violated for metagenomes, which prevents the effective use of such databases. We present PathRacer, a novel standalone tool that aligns a profile HMM directly to the assembly graph (performing codon translation on the fly for amino acid pHMMs). The tool provides the set of most probable paths traversed by an HMM through the whole assembly graph, regardless of whether the sequence of interest is encoded on a single contig or scattered across a set of edges, thereby significantly improving the recovery of sequences of interest even from fragmented metagenome assemblies. Availability: http://cab.spbu.ru/software/pathracer/

2006 ◽  
Vol 04 (05) ◽  
pp. 959-980 ◽  
Author(s):  
CHENHONG ZHANG ◽  
MIKELIS G. BICKIS ◽  
FANG-XIANG WU ◽  
ANTHONY J. KUSALIK

Hidden Markov models (HMMs) are one of various methods that have been applied to the prediction of major histocompatibility complex (MHC) binding peptides. In terms of model topology, a fully-connected HMM (fcHMM) has the greatest potential to predict binders, at the cost of intensive computation. While a profile HMM (pHMM) performs dramatically fewer computations, it may merge overlapping patterns into one, which results in some patterns being missed. In a profile HMM a state corresponds to a position on a peptide, while in an fcHMM a state has no specific biological meaning. This work proposes optimally-connected HMMs (ocHMMs), which do not merge overlapping patterns and yet, through topological reductions, have far lower connectivity than an fcHMM. The parameters of ocHMMs are initialized using a novel amino acid grouping approach called "multiple property grouping." Each group represents a state in an ocHMM. The proposed ocHMMs are compared to a pHMM implementation using HMMER, based on performance tests on two MHC alleles, HLA (Human Leukocyte Antigen)-A*0201 and HLA-B*3501. The results show that the heuristic approaches can be adjusted to make an ocHMM achieve higher predictive accuracy than HMMER. Hence, ocHMMs obtained in this way are worth trying for predicting MHC-binding peptides.


Author(s):  
STEFAN MÜLLER ◽  
STEFAN EICKELER ◽  
GERHARD RIGOLL

This paper presents an integrated approach to shape- and color-based image retrieval in which the color and shape cues are both used locally rather than globally. An experimental retrieval system has been developed that enables the user to search a color image database intuitively by presenting simple sketches. To support the elastic matching that sketch-based image retrieval especially requires, objects in the images are represented by Hidden Markov Models. The use of streams (sets of features that are assumed to be statistically independent) within the HMM framework allows shape- and color-derived features to be integrated into a single model, and the influence of the different streams can be controlled via stream weights. The approach has been evaluated on a color image database containing 120 different isolated objects with arbitrary orientation and showed good retrieval results with several users. Furthermore, the use of HMMs allows efficient pruning and thus fast retrieval even with large databases.
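The role of stream weights can be illustrated with a small sketch. It assumes the common multi-stream formulation in which a state's emission score is the weighted sum of per-stream log-likelihoods; the abstract does not spell out the exact formula, and the function name, the likelihood values, and the weights below are all hypothetical.

```python
import math

def combined_log_emission(stream_log_probs, stream_weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state."""
    if len(stream_log_probs) != len(stream_weights):
        raise ValueError("one weight per stream is required")
    return sum(w * lp for w, lp in zip(stream_weights, stream_log_probs))

# Hypothetical likelihoods of one observation under one state, per stream.
shape_lp = math.log(0.20)
color_lp = math.log(0.05)

# Equal weights reduce to the plain product of independent streams.
balanced = combined_log_emission([shape_lp, color_lp], [1.0, 1.0])

# A larger shape weight makes the shape stream dominate the score.
shape_heavy = combined_log_emission([shape_lp, color_lp], [1.5, 0.5])
```

With equal weights the score is the ordinary independent-stream likelihood; shifting weight toward one stream changes how strongly that cue influences the match.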


2019 ◽  
Author(s):  
Takuya Aramaki ◽  
Romain Blanc-Mathieu ◽  
Hisashi Endo ◽  
Koichi Ohkubo ◽  
Minoru Kanehisa ◽  
...  

Summary: KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds. KofamKOALA is faster than existing KO assignment tools with its accuracy being comparable to the best performing tools. Function annotation by KofamKOALA helps linking genes to KEGG resources such as the KEGG pathway maps and facilitates molecular network reconstruction. Availability: KofamKOALA, KofamScan, and KOfam are freely available from https://www.genome.jp/tools/kofamkoala/.


2019 ◽  
Vol 36 (7) ◽  
pp. 2251-2252 ◽  
Author(s):  
Takuya Aramaki ◽  
Romain Blanc-Mathieu ◽  
Hisashi Endo ◽  
Koichi Ohkubo ◽  
Minoru Kanehisa ◽  
...  

Summary: KofamKOALA is a web server to assign KEGG Orthologs (KOs) to protein sequences by homology search against a database of profile hidden Markov models (KOfam) with pre-computed adaptive score thresholds. KofamKOALA is faster than existing KO assignment tools with its accuracy being comparable to the best performing tools. Function annotation by KofamKOALA helps linking genes to KEGG resources such as the KEGG pathway maps and facilitates molecular network reconstruction. Availability and implementation: KofamKOALA, KofamScan and KOfam are freely available from GenomeNet (https://www.genome.jp/tools/kofamkoala/). Supplementary information: Supplementary data are available at Bioinformatics online.
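The idea of per-family score thresholds can be sketched as follows. This is a minimal illustration, not KofamKOALA's or KofamScan's actual code: it assumes a hit is accepted when its score reaches the threshold stored for that KO, and the function name, identifiers, scores, and thresholds are invented for the example.

```python
def assign_kos(hits, thresholds):
    """Keep, per protein, the KOs whose score reaches that KO's own threshold."""
    assigned = {}
    for protein, ko, score in hits:
        cutoff = thresholds.get(ko)
        # Hits to KOs without a stored threshold are skipped in this sketch.
        if cutoff is not None and score >= cutoff:
            assigned.setdefault(protein, []).append((ko, score))
    # Order each protein's accepted KOs from best to worst score.
    for kos in assigned.values():
        kos.sort(key=lambda pair: pair[1], reverse=True)
    return assigned

# Made-up identifiers, scores, and thresholds for illustration only.
thresholds = {"KO_A": 150.0, "KO_B": 80.0}
hits = [
    ("prot1", "KO_A", 200.0),
    ("prot1", "KO_B", 95.0),
    ("prot2", "KO_A", 120.0),  # below KO_A's threshold, so not assigned
]
result = assign_kos(hits, thresholds)
```

Because each KO carries its own cutoff, a score that is sufficient for one family may be rejected for another, which is the point of using thresholds tuned per model rather than one global cutoff.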


2015 ◽  
Vol 135 (12) ◽  
pp. 1517-1523 ◽  
Author(s):  
Yicheng Jin ◽  
Takuto Sakuma ◽  
Shohei Kato ◽  
Tsutomu Kunitachi

Author(s):  
M. Vidyasagar

This book explores important aspects of Markov and hidden Markov processes and the applications of these ideas to various problems in computational biology. It starts from first principles, so that no previous knowledge of probability is necessary. However, the work is rigorous and mathematical, making it useful to engineers and mathematicians, even those not interested in biological applications. A range of exercises is provided, including drills to familiarize the reader with concepts and more advanced problems that require deep thinking about the theory. Biological applications are taken from post-genomic biology, especially genomics and proteomics. The topics examined include standard material such as the Perron–Frobenius theorem, transient and recurrent states, hitting probabilities and hitting times, maximum likelihood estimation, the Viterbi algorithm, and the Baum–Welch algorithm. The book contains discussions of extremely useful topics not usually seen at the basic level, such as ergodicity of Markov processes, Markov Chain Monte Carlo (MCMC), information theory, and large deviation theory for both i.i.d. and Markov processes. It also presents state-of-the-art realization theory for hidden Markov models. Among biological applications, it offers an in-depth look at the BLAST (Basic Local Alignment Search Tool) algorithm, including a comprehensive explanation of the underlying theory. Other applications such as profile hidden Markov models are also explored.
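As a concrete instance of one of the standard topics listed above, here is a minimal log-space sketch of the Viterbi algorithm. It is not taken from the book; the two-state model ("AT" and "GC" states with made-up probabilities) and the helper name `logs` are assumptions chosen only to keep the example small.

```python
import math

def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best log-probability, most likely state path)."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor state for reaching s at this step.
            r_best = max(states, key=lambda r: prev[r] + log_trans[r][s])
            scores[s] = prev[r_best] + log_trans[r_best][s] + log_emit[s][obs]
            pointers[s] = r_best
        best.append(scores)
        back.append(pointers)
    # Trace the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[-1][last], path

def logs(table):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in table.items()}

# Toy model with invented parameters: AT-rich versus GC-rich regions.
states = ["AT", "GC"]
log_start = logs({"AT": 0.5, "GC": 0.5})
log_trans = {"AT": logs({"AT": 0.9, "GC": 0.1}),
             "GC": logs({"AT": 0.1, "GC": 0.9})}
log_emit = {"AT": logs({"A": 0.4, "T": 0.4, "G": 0.1, "C": 0.1}),
            "GC": logs({"A": 0.1, "T": 0.1, "G": 0.4, "C": 0.4})}

score, path = viterbi("AAGG", states, log_start, log_trans, log_emit)
# path is ['AT', 'AT', 'GC', 'GC']
```

Working in log space turns products of probabilities into sums, which avoids numerical underflow on long sequences.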

