Hidden Markov models in biological sequence analysis

2001 ◽  
Vol 45 (3.4) ◽  
pp. 449-454 ◽  
Author(s):  
E. Birney


2019 ◽  
Vol 35 (19) ◽  
pp. 3829-3830 ◽  
Author(s):  
Shaun P Wilkinson

Abstract Summary Hidden Markov models (HMMs) and profile HMMs form an integral part of biological sequence analysis, supporting an ever-growing list of applications. The aphid R package can be used to derive, train, plot, import and export HMMs and profile HMMs in the R environment. Computationally intensive dynamic programming recursions, such as the Viterbi, forward and backward algorithms, are implemented in C++ and parallelized for increased speed and efficiency. Availability and implementation The aphid package is released under the GPL-3 license and is freely available for download from CRAN and GitHub (https://github.com/shaunpwilkinson/aphid). Supplementary information Supplementary data are available at Bioinformatics online.
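
For readers unfamiliar with these recursions, the sketch below shows Viterbi decoding for a discrete HMM in log space. It is a minimal illustration in Python/NumPy, not the aphid implementation (aphid is an R package with C++ internals); the parameter names init, trans and emis are hypothetical.

```python
# Minimal Viterbi decoding for a discrete HMM (log space).
# Hypothetical parameter names; illustrative only, not aphid's code.
import numpy as np

def viterbi(obs, init, trans, emis):
    """Most probable state path for an observation sequence.

    obs   : observation indices, length T
    init  : initial state log-probabilities, shape (S,)
    trans : transition log-probabilities, shape (S, S)
    emis  : emission log-probabilities, shape (S, M)
    """
    T, S = len(obs), len(init)
    v = np.full((T, S), -np.inf)       # v[t, s]: best log-prob ending in s at t
    ptr = np.zeros((T, S), dtype=int)  # back-pointers for traceback
    v[0] = init + emis[:, obs[0]]
    for t in range(1, T):
        scores = v[t - 1][:, None] + trans          # (previous, current)
        ptr[t] = np.argmax(scores, axis=0)
        v[t] = scores[ptr[t], np.arange(S)] + emis[:, obs[t]]
    path = [int(np.argmax(v[-1]))]                  # best final state
    for t in range(T - 1, 0, -1):
        path.append(int(ptr[t, path[-1]]))
    return path[::-1]
```

Working in log space sidesteps the numerical underflow that plain probability products suffer on long sequences, a standard precaution in HMM implementations.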


2019 ◽  
Vol 35 (24) ◽  
pp. 5309-5312
Author(s):  
Ioannis A Tamposis ◽  
Konstantinos D Tsirigos ◽  
Margarita C Theodoropoulou ◽  
Panagiota I Kontou ◽  
Georgios N Tsaousis ◽  
...  

Abstract Summary JUCHMME is an open-source software package designed to fit arbitrary custom Hidden Markov Models (HMMs) with a discrete alphabet of symbols. We incorporate a large collection of standard algorithms for HMMs as well as a number of extensions and evaluate the software on various biological problems. Importantly, the JUCHMME toolkit includes several additional features that allow for easy building and evaluation of custom HMMs, which could be a useful resource for the research community. Availability and implementation http://www.compgen.org/tools/juchmme, https://github.com/pbagos/juchmme. Supplementary information Supplementary data are available at Bioinformatics online.
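
As a companion to the Viterbi sketch above, the snippet below illustrates the other standard recursion such a toolkit provides: the forward algorithm, which evaluates the likelihood of a sequence over a discrete alphabet. It is a minimal Python sketch using the same hypothetical parameter conventions, not JUCHMME code.

```python
# Log-space forward algorithm for a discrete-alphabet HMM.
# Hypothetical parameter names; illustrative only, not JUCHMME code.
import numpy as np
from scipy.special import logsumexp

def forward_loglik(obs, init, trans, emis):
    """log P(obs) under the model; all parameters are in log space."""
    f = init + emis[:, obs[0]]        # f[s] = log P(x_1, state_1 = s)
    for x in obs[1:]:
        f = logsumexp(f[:, None] + trans, axis=0) + emis[:, x]
    return logsumexp(f)               # marginalize over the final state
```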


2013 ◽  
Vol 411-414 ◽  
pp. 2106-2110
Author(s):  
Shi Ping Du ◽  
Jian Wang ◽  
Yu Ming Wei

A hidden Markov model (HMM) encompasses a large class of stochastic process models and has been successfully applied to a number of scientific and engineering problems, including speech and other pattern-recognition problems, and biological sequence analysis. A major restriction of the conventional HMM, however, is that it is ill-suited to capturing interactions among different models. A variety of coupled hidden Markov models (CHMMs) have recently been proposed as extensions of the HMM to better characterize multiple interdependent sequences. The resulting models have multiple state variables that are temporally coupled via matrices of conditional probabilities. This paper focuses on the coupled discrete HMM with two state variables in the network. By generalizing the forward-backward, Viterbi and Baum-Welch algorithms commonly used in conventional HMMs to accommodate two state variables, several new formulae are theoretically derived that solve the probability-evaluation, decoding and training problems for the 2-chain coupled discrete HMM.
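
To make the generalization concrete, the sketch below shows a forward recursion over the joint state of two coupled chains, under the common CHMM factorization in which each chain's next state is conditioned on both chains' current states. This is an illustrative reading of the model class, not the authors' derivation; the array names are hypothetical, and the recursion is kept in linear space for clarity (a practical implementation would rescale alpha at each step to avoid underflow).

```python
# Forward recursion for a 2-chain coupled discrete HMM (illustrative sketch).
# trans1[i, j, k] = P(chain1 -> k | chain1 = i, chain2 = j)
# trans2[i, j, l] = P(chain2 -> l | chain1 = i, chain2 = j)
import numpy as np

def coupled_forward(obs1, obs2, init1, init2, trans1, trans2, emis1, emis2):
    """Joint likelihood P(obs1, obs2) for two temporally coupled chains."""
    # alpha[i, j] = P(observations up to t, chain1 = i, chain2 = j)
    alpha = np.outer(init1 * emis1[:, obs1[0]], init2 * emis2[:, obs2[0]])
    for t in range(1, len(obs1)):
        # Sum over every previous joint state (i, j) for each new pair (k, l)
        alpha = np.einsum('ij,ijk,ijl->kl', alpha, trans1, trans2)
        alpha *= np.outer(emis1[:, obs1[t]], emis2[:, obs2[t]])
    return alpha.sum()
```

The joint state space is the Cartesian product of the two chains' state sets, so the evaluation, decoding and training recursions all run over pairs of states, which is where the generalized 2-chain formulae depart from the conventional single-chain algorithms.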


2018 ◽  
Vol 35 (13) ◽  
pp. 2208-2215 ◽  
Author(s):  
Ioannis A Tamposis ◽  
Konstantinos D Tsirigos ◽  
Margarita C Theodoropoulou ◽  
Panagiota I Kontou ◽  
Pantelis G Bagos

Abstract Motivation Hidden Markov Models (HMMs) are probabilistic models widely used in computational sequence analysis. HMMs are fundamentally unsupervised models; in the most important applications, however, they are trained in a supervised manner: training examples accompanied by labels corresponding to different classes are given as input, and the set of parameters that maximizes the joint probability of sequences and labels is estimated. A main problem with this approach is that, in the majority of cases, labels are hard to find and thus the amount of training data is limited. On the other hand, there are plenty of unclassified (unlabeled) sequences deposited in the public databases that could potentially contribute to the training procedure. This approach is called semi-supervised learning and could be very helpful in many applications. Results We propose here a method for semi-supervised learning of HMMs that can incorporate labeled, unlabeled and partially labeled data in a straightforward manner. The algorithm is based on a variant of the Expectation-Maximization (EM) algorithm, in which the missing labels of the unlabeled or partially labeled data are treated as the missing data. We apply the algorithm to several biological problems, namely the prediction of transmembrane protein topology for alpha-helical and beta-barrel membrane proteins and the prediction of archaeal signal peptides. The results are very promising, since the algorithms presented here can significantly improve the prediction performance of even the top-scoring classifiers. Supplementary information Supplementary data are available at Bioinformatics online.
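
The key mechanism, treating missing labels as the missing data in EM, can be pictured as a label-constrained forward pass: states that contradict a known label are masked out, while unlabeled positions remain unconstrained, so fully labeled, partially labeled and unlabeled sequences all flow through the same recursion. The Python sketch below is a hypothetical illustration of this idea, not the authors' code; state_labels maps each HMM state to a class label, and labels holds the per-position label, or None where it is unknown.

```python
# Label-constrained log-space forward pass (semi-supervised E-step sketch).
# Hypothetical names; illustrative only, not the authors' implementation.
import numpy as np
from scipy.special import logsumexp

def constrained_forward(obs, labels, state_labels, init, trans, emis):
    """Forward pass restricted to state paths consistent with known labels."""
    def mask(lab):
        # 0 for admissible states, -inf for states whose class contradicts lab
        ok = np.array([lab is None or sl == lab for sl in state_labels])
        return np.where(ok, 0.0, -np.inf)

    f = init + emis[:, obs[0]] + mask(labels[0])
    for x, lab in zip(obs[1:], labels[1:]):
        f = logsumexp(f[:, None] + trans, axis=0) + emis[:, x] + mask(lab)
    return f  # logsumexp(f) = log P(obs, labels); posteriors feed the M-step
```

With every label set to None this reduces to the ordinary unsupervised forward pass, and with every label observed it reduces to fully supervised training, which is why the three data types can be mixed within one EM procedure.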

