Extending hidden Markov models to allow conditioning on previous observations

2018 ◽  
Vol 16 (05) ◽  
pp. 1850019 ◽  
Author(s):  
Ioannis A. Tamposis ◽  
Margarita C. Theodoropoulou ◽  
Konstantinos D. Tsirigos ◽  
Pantelis G. Bagos

Hidden Markov Models (HMMs) are probabilistic models widely used in computational molecular biology. However, the Markovian assumption regarding transition probabilities which dictates that the observed symbol depends only on the current state may not be sufficient for some biological problems. In order to overcome the limitations of the first order HMM, a number of extensions have been proposed in the literature to incorporate past information in HMMs conditioning either on the hidden states, or on the observations, or both. Here, we implement a simple extension of the standard HMM in which the current observed symbol (amino acid residue) depends both on the current state and on a series of observed previous symbols. The major advantage of the method is the simplicity in the implementation, which is achieved by properly transforming the observation sequence, using an extended alphabet. Thus, it can utilize all the available algorithms for the training and decoding of HMMs. We investigated the use of several encoding schemes and performed tests in a number of important biological problems previously studied by our team (prediction of transmembrane proteins and prediction of signal peptides). The evaluation shows that, when enough data are available, the performance increased by 1.8%–8.2% and the existing prediction methods may improve using this approach. The methods, for which the improvement was significant (PRED-TMBB2, PRED-TAT and HMM-TM), are available as web-servers freely accessible to academic users at www.compgen.org/tools/ .
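
The alphabet transformation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract mentions several encoding schemes without detailing them, so the function names, the padding symbol `-` and the choice of context length `k` are all assumptions made here.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode_with_context(sequence, k=1, pad="-"):
    """Replace each symbol by a compound symbol: its k predecessors plus itself.

    The first k positions have no full history, so they are padded with
    `pad` (a hypothetical choice; other schemes are possible).
    """
    padded = pad * k + sequence
    return [padded[i:i + k + 1] for i in range(len(sequence))]


def build_index(alphabet, k=1, pad="-"):
    """Enumerate the extended alphabet and map each compound symbol to an int.

    Contexts are drawn from alphabet + pad, the final symbol from alphabet
    only. For k > 1 this also lists a few contexts that never occur (a pad
    after a real symbol), which is harmless for a sketch.
    """
    symbols = (
        "".join(ctx) + s
        for ctx in product(alphabet + pad, repeat=k)
        for s in alphabet
    )
    return {sym: i for i, sym in enumerate(symbols)}


# A standard HMM trained on the encoded sequence has emission probabilities
# over compound symbols, i.e. conditioned on the state and on the k
# previous observations.
encoded = encode_with_context("MKV", k=1)  # ['-M', 'MK', 'KV']
index = build_index(AMINO_ACIDS, k=1)
observations = [index[sym] for sym in encoded]
```

Because the output is an ordinary sequence of integer symbols, it can be passed to any existing training or decoding routine for discrete HMMs; the cost is an emission table that grows exponentially with `k`.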

2017 ◽  
Vol 9 (1) ◽  
pp. 24
Author(s):  
Hiroshi Morimoto

Cold exposure is often said to trigger the incidence of cerebral infarctions and ischemic heart disease. This association between weather and human health has attracted considerable interest, and has been explored using standard statistical techniques such as regression models. Meteorological factors, such as temperature, are controlled by background systems, notably weather patterns. Therefore, it is reasonable to posit that the incidence of diseases is similarly influenced by a background system. The aim of this paper was to identify and construct these respective background systems. Possible background states, or "hidden states", behind the incidence of diseases were derived using the EM and Viterbi algorithms within the framework of hidden Markov models (HMM). A self-organizing map (SOM) enabled identification of weather patterns, considered as background states behind meteorological factors. These background states were then compared, and the hidden states behind the incidence of diseases were identified by six weather patterns. This finding provides new evidence of the links between weather and human health, shedding light on the association between changes in the weather and the onset of disease.


2018 ◽  
Author(s):  
Regev Schweiger ◽  
Yaniv Erlich ◽  
Shai Carmi

Motivation: Hidden Markov models (HMMs) are powerful tools for modeling processes along the genome. In a standard genomic HMM, observations are drawn, at each genomic position, from a distribution whose parameters depend on a hidden state; the hidden states evolve along the genome as a Markov chain. Often, the hidden state is the Cartesian product of multiple processes, each evolving independently along the genome. Inference in these so-called Factorial HMMs has a naïve running time that scales as the square of the number of possible states, which by itself increases exponentially with the number of subchains; such a running time scaling is impractical for many applications. While faster algorithms exist, there is no available implementation suitable for developing bioinformatics applications. Results: We developed FactorialHMM, a Python package for fast exact inference in Factorial HMMs. Our package allows simulating either directly from the model or from the posterior distribution of states given the observations. Additionally, we allow the inference of all key quantities related to HMMs: (1) the (Viterbi) sequence of states with the highest posterior probability; (2) the likelihood of the data; and (3) the posterior probability (given all observations) of the marginal and pairwise state probabilities. The running time and space requirement of all procedures is linearithmic in the number of possible states. Our package is highly modular, providing the user with maximal flexibility for developing downstream applications. Availability: https://github.com/regevs/factorialhmm
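
The scaling argument above can be made concrete with a small sketch. It does not use the FactorialHMM package or its API; the function names and the toy transition matrices are hypothetical, and the cost figures are operation counts per genomic position, not measured running times.

```python
import math


def joint_state_count(states_per_chain):
    """Number of joint hidden states: the product over all subchains."""
    return math.prod(states_per_chain)


def naive_step_cost(states_per_chain):
    """A dense forward step visits every (from, to) pair of joint states."""
    n = joint_state_count(states_per_chain)
    return n * n


def linearithmic_step_cost(states_per_chain):
    """Order of magnitude n * log2(n) in the number of joint states n."""
    n = joint_state_count(states_per_chain)
    return n * math.log2(n)


def kron(a, b):
    """Kronecker product of two row-stochastic matrices (lists of lists).

    For independently evolving subchains, the joint transition matrix is
    the Kronecker product of the per-chain transition matrices.
    """
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


# Ten binary subchains: 2**10 joint states.
chains = [2] * 10
n_states = joint_state_count(chains)         # 1024
dense_cost = naive_step_cost(chains)         # 1048576
fast_cost = linearithmic_step_cost(chains)   # 10240.0

# Toy two-chain example (hypothetical numbers).
A1 = [[0.9, 0.1], [0.2, 0.8]]
A2 = [[0.5, 0.5], [0.3, 0.7]]
joint = kron(A1, A2)  # 4 x 4 joint transition matrix
```

The Kronecker structure is what faster algorithms can exploit: the joint matrix never needs to be materialised, since each subchain's small matrix can be applied in turn.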


2019 ◽  
Vol 19 (1) ◽  
pp. 93-99 ◽  
Author(s):  
Maxim A. Kuznetsov ◽  
Ivan V. Oseledets

Abstract. We propose a new algorithm for spectral learning of Hidden Markov Models (HMM). In contrast to the standard approach, we do not estimate the parameters of the HMM directly, but construct an estimate for the joint probability distribution. The idea is based on the representation of a joint probability distribution as an N-th-order tensor with low ranks represented in the tensor train (TT) format. Using TT-format, we get an approximation by minimizing the Frobenius distance between the empirical joint probability distribution and tensors with low TT-ranks with core tensors normalization constraints. We propose an algorithm for the solution of the optimization problem that is based on the alternating least squares (ALS) approach and develop its fast version for sparse tensors. The order of the tensor d is a parameter of our algorithm. We have compared the performance of our algorithm with the existing algorithm by Hsu, Kakade and Zhang proposed in 2009 and found that it is much more robust if the number of hidden states is overestimated.


2003 ◽  
Vol 7 (5) ◽  
pp. 652-667 ◽  
Author(s):  
M. F. Lambert ◽  
J. P. Whiting ◽  
A. V. Metcalfe

Abstract. Hidden Markov models (HMMs) can allow for the varying wet and dry cycles in the climate without the need to simulate supplementary climate variables. The fitting of a parametric HMM relies upon assumptions for the state conditional distributions. It is shown that inappropriate assumptions about state conditional distributions can lead to biased estimates of state transition probabilities. An alternative non-parametric model with a hidden state structure that overcomes this problem is described. It is shown that a two-state non-parametric model produces accurate estimates of both transition probabilities and the state conditional distributions. The non-parametric model can be used directly or as a technique for identifying appropriate state conditional distributions to apply when fitting a parametric HMM. The non-parametric model is fitted to data from ten rainfall stations and four streamflow gauging stations at varying distances inland from the Pacific coast of Australia. Evidence for hydrological persistence, though not mathematical persistence, was identified in both rainfall and streamflow records, with the latter showing hidden states with longer sojourn times. Persistence appears to increase with distance from the coast. Keywords: Hidden Markov models, non-parametric, two-state model, climate states, persistence, probability distributions


2010 ◽  
Vol 2010 ◽  
pp. 1-11 ◽  
Author(s):  
Sergio Benini ◽  
Pierangelo Migliorati ◽  
Riccardo Leonardi

We present a statistical framework based on Hidden Markov Models (HMMs) for skimming feature films. A chain of HMMs is used to model subsequent story units: HMM states represent different visual-concepts, transitions model the temporal dependencies in each story unit, and stochastic observations are given by single shots. The skim is generated as an observation sequence, where, in order to privilege more informative segments for entering the skim, shots are assigned higher probability of observation if endowed with salient features related to specific film genres. The effectiveness of the method is demonstrated by skimming the first thirty minutes of a wide set of action and dramatic movies, in order to create previews for users useful for assessing whether they would like to see that movie or not, but without revealing the movie central part and plot details. Results are evaluated and compared through extensive user tests in terms of metrics that estimate the content representational value of the obtained video skims and their utility for assessing the user's interest in the observed movie.


2020 ◽  
Author(s):  
Brett T. McClintock

Abstract. Hidden Markov models (HMMs) that include individual-level random effects have recently been promoted for inferring animal movement behaviour from biotelemetry data. These “mixed HMMs” come at significant cost in terms of implementation and computation, and discrete random effects have been advocated as a practical alternative to more computationally-intensive continuous random effects. However, the performance of mixed HMMs has not yet been sufficiently explored to justify their widespread adoption, and there is currently little guidance for practitioners weighing the costs and benefits of mixed HMMs for a particular research objective. I performed an extensive simulation study comparing the performance of a suite of fixed and random effect models for individual heterogeneity in the hidden state process of a 2-state HMM. I focused on sampling scenarios more typical of telemetry studies, which often consist of relatively long time series (30 – 250 observations per animal) for relatively few individuals (5 – 100 animals). I generally found mixed HMMs did not improve state assignment relative to standard HMMs. Reliable estimation of random effects required larger sample sizes than are often feasible in telemetry studies. Continuous random effect models performed reasonably well with data generated under discrete random effects, but not vice versa. Random effects accounting for unexplained individual variation can improve estimation of state transition probabilities and measurable covariate effects, but discrete random effects can be a relatively poor (and potentially misleading) approximation for continuous variation. When weighing the costs and benefits of mixed HMMs, three important considerations are study objectives, sample size, and model complexity. HMM applications often focus on state assignment with little emphasis on heterogeneity in state transition probabilities, in which case random effects in the hidden state process simply may not be worth the additional effort. However, if explaining variation in state transition probabilities is a primary objective and sufficient explanatory covariates are not available, then random effects are worth pursuing as a more parsimonious alternative to individual fixed effects. To help put my findings in context and illustrate some potential challenges that practitioners may encounter when applying mixed HMMs, I revisit a previous analysis of long-finned pilot whale biotelemetry data.


d'CARTESIAN ◽  
2015 ◽  
Vol 4 (1) ◽  
pp. 86 ◽  
Author(s):  
Kezia Tumilaar ◽  
Yohanes Langi ◽  
Altien Rindengan

The Hidden Markov Model (HMM) is a stochastic model and is essentially an extension of the Markov chain. In an HMM there are two types of states: the observable states and the hidden states. The purpose of this research is to understand how the HMM works and how to solve its three basic problems: the evaluation problem, the decoding problem and the learning problem. The result of the research is that the hidden Markov model can be defined as λ = (A, B, π). The evaluation problem, computing the probability of the observation sequence given the model, P(O|λ), can be solved by the Forward-Backward algorithm; the decoding problem, choosing the optimal hidden state sequence, can be solved by the Viterbi algorithm; and the learning problem, estimating the model parameters λ to maximize P(O|λ), can be solved by the Baum-Welch algorithm. From the description above, a Hidden Markov Model with 3 states can describe the behavior in the case studies. Key words: Decoding Problem, Evaluation Problem, Hidden Markov Model, Learning Problem
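
Two of the three basic problems named above can be sketched in a few lines for a discrete HMM. This is a minimal illustration, not the paper's code: the model is written as a transition matrix `A`, an emission matrix `B` and an initial distribution `pi` (conventional names, assumed here), the toy numbers are hypothetical, and only the forward pass of the Forward-Backward algorithm is shown because it already yields the likelihood. Baum-Welch is omitted.

```python
def forward_likelihood(A, B, pi, obs):
    """Evaluation problem: probability of the observation sequence.

    alpha[j] holds the joint probability of the observations so far and
    of being in state j at the current position.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def viterbi(A, B, pi, obs):
    """Decoding problem: most probable hidden state sequence."""
    n = len(pi)
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        pointers, new_delta = [], []
        for j in range(n):
            # Best predecessor of state j at this position.
            best = max(range(n), key=lambda i: delta[i] * A[i][j])
            pointers.append(best)
            new_delta.append(delta[best] * A[best][j] * B[j][o])
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n), key=lambda i: delta[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical two-state, two-symbol model.
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
pi = [0.5, 0.5]
obs = [0, 1]

likelihood = forward_likelihood(A, B, pi, obs)  # 0.1915 up to rounding
best_path = viterbi(A, B, pi, obs)              # [0, 1]
```

For long sequences the products underflow, so practical implementations work with log probabilities or rescale `alpha` at every step.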


2019 ◽  
Vol 46 (4) ◽  
pp. 296
Author(s):  
Victoria L. Goodall ◽  
Sam M. Ferreira ◽  
Paul J. Funston ◽  
Nkabeng Maruping-Mzileni

Context: Direct observations of animals are the most reliable way to define their behavioural characteristics; however, to obtain these observations is costly and often logistically challenging. GPS tracking allows finer-scale interpretation of animal responses by measuring movement patterns; however, the true behaviour of the animal during the period of observation is seldom known. Aims: The aim of our research was to draw behavioural inferences for a lioness with a hidden Markov model and to validate the predicted latent-state sequence with field observations of the lion pride. Methods: We used hidden Markov models to model the movement of a lioness in the Kruger National Park, South Africa. A three-state log-normal model was selected as the most suitable model. The model outputs are related to collected data by using an observational model, such as, for example, a distribution for the average movement rate and/or direction of movement that depends on the underlying model states that are taken to represent behavioural states of the animal. These inferred behavioural states are validated against direct observation of the pride’s behaviour. Key results: Average movement rate provided a useful alternative for the application of hidden Markov models to irregularly spaced GPS locations. The movement model predicted resting as the dominant activity throughout the day, with a peak in the afternoon. The local-movement state occurred consistently throughout the day, with a decreased proportion during the afternoon, when more resting takes place, and an increase towards the early evening. The relocating state had three peaks, namely, during mid-morning, early evening and about midnight. Because of the differences in timing of the direct observations and the GPS locations, we had to compare point observations of the true behaviour with an interval prediction of the modelled behavioural state. In 75% of the cases, the model-predicted behaviour and the field-observed behaviour overlapped. Conclusions: Our data suggest that the hidden Markov modelling approach is successful at predicting a realistic behaviour of lions on the basis of the GPS location coordinates and the average movement rate between locations. The present study provided a unique opportunity to uncover the hidden states and compare the true behaviour with the inferred behaviour from the predicted state sequence. Implications: Our results illustrated the potential of using hidden Markov models with movement rate as an input to understand carnivore behavioural patterns that could inform conservation management practices.
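
One way to obtain an average movement rate from irregularly spaced GPS fixes is sketched below. The abstract does not say how distances were computed, so the great-circle (haversine) distance, the spherical Earth radius, the `(timestamp_seconds, latitude, longitude)` tuple layout and the function names are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def movement_rates(fixes):
    """Average speed (m/s) between consecutive fixes.

    `fixes` is a time-ordered list of (timestamp_seconds, lat, lon).
    Dividing by the elapsed time puts short and long gaps on the same
    scale, so irregular sampling yields a comparable observation series.
    """
    rates = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(fixes, fixes[1:]):
        elapsed = t1 - t0
        if elapsed <= 0:
            raise ValueError("fixes must be strictly increasing in time")
        rates.append(haversine_m(lat0, lon0, lat1, lon1) / elapsed)
    return rates


# Hypothetical fixes: stationary for 30 minutes, then one degree of
# latitude covered in an hour (unrealistically fast, for arithmetic only).
fixes = [
    (0, -24.0, 31.5),
    (1800, -24.0, 31.5),
    (5400, -23.0, 31.5),
]
rates = movement_rates(fixes)
```

The resulting positive-valued series is the kind of input for which a state-dependent log-normal observation model, as selected in the study, is a natural fit, although zero rates would need separate handling.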


Author(s):  
Hai Qiu ◽  
Haitao Liao ◽  
Jay Lee

Degradation detection and recognition of degradation patterns are crucial to the successful deployment of prognostics. A machine degradation process is known to be stochastic rather than deterministic, so recognizing the degradation pattern requires help from stochastic and probabilistic models. Among various stochastic approaches, Hidden Markov Models (HMMs) have been proven to be very effective in modeling both dynamic and static signals [1]. In this paper, aiming to provide a guideline on how to effectively and efficiently use HMMs to assess degradation for various machinery prognostic applications, three different approaches to applying HMMs are reviewed and compared. The comparison demonstrates that, depending on the application, the available prior knowledge, and the characteristics of the degradation process, the three implementation approaches perform differently. A full understanding of the strengths and weaknesses of each deployment approach is extremely important in order to effectively utilize this powerful tool for system degradation assessment.

