Modern Computational Techniques for the HMMER Sequence Analysis

ISRN Bioinformatics ◽

10.1155/2013/252183 ◽

2013 ◽

Vol 2013 ◽

pp. 1-13 ◽

Cited By ~ 11

Author(s):

Xiandong Meng ◽

Yanqing Ji

Keyword(s):

Data Analysis ◽

Sequence Analysis ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Hardware Acceleration ◽

Performance Comparison ◽

Computational Techniques ◽

Software And Hardware ◽

Computing Platforms

This paper reviews recent research on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications: hidden Markov models (HMMs). We present a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. Because sequence analysis is both data-intensive and compute-intensive, it is an attractive target for optimization and parallelization using both traditional software approaches and innovative hardware acceleration technologies.

FPGA implementation of logarithmic versions of Baum-Welch and Viterbi algorithms for reduced precision hidden Markov models

Bulletin of the Polish Academy of Sciences Technical Sciences ◽

10.1515/bpasts-2017-0101 ◽

2017 ◽

Vol 65 (6) ◽

pp. 935-947

Author(s):

M. Pietras ◽

P. Klęsk

Keyword(s):

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Hardware Acceleration ◽

Performance Comparison ◽

Divide And Conquer ◽

Floating Point ◽

Logarithmic Space ◽

On Chip ◽

High Level

Abstract: This paper presents a programmable system-on-chip implementation for accelerating computations within hidden Markov models. High-level synthesis (HLS) and “divide-and-conquer” approaches are presented for parallelizing the Baum-Welch and Viterbi algorithms. To avoid arithmetic underflow, all computations are performed in logarithmic space. Additionally, to carry out computations efficiently (i.e., directly in an FPGA system or a processor cache), we propose reducing the floating-point representations of HMMs. We state and prove a lemma about the length of numerically unsafe sequences for such reduced-precision models. Finally, special attention is devoted to the design of a multiple logarithm and exponent approximation unit (MLEAU). Using associative mapping, this unit allows simultaneous conversion of multiple values and thereby compensates for the computational cost of logarithmic-space operations. The design evaluation reveals the absolute stall delay caused by multiple hardware conversions to logarithms and exponents, and the experimental evaluation reveals the computational limits of HMMs related to their probabilities and floating-point representation. The performance differences at each stage of computation are summarized in a comparison between hardware acceleration using the MLEAU and a typical software implementation on an ARM or Intel processor.
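The log-space idea mentioned in the abstract above can be illustrated with a minimal software sketch. It is not the authors' FPGA design or their MLEAU: the function names, the plain-list model layout, and the toy two-state model are all assumptions made for illustration. Products of probabilities become sums of logarithms, and the additions needed by the forward pass (the building block of Baum-Welch) become a log-sum-exp.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps a zero probability to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def log_sum_exp(values):
    """Stable log(sum(exp(v))): the log-space replacement for addition."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def viterbi_log(obs, start_p, trans_p, emit_p):
    """Return (most likely state path, its log-probability)."""
    n = len(start_p)
    # delta[s]: best log-probability of any path that ends in state s
    delta = [safe_log(start_p[s]) + safe_log(emit_p[s][obs[0]]) for s in range(n)]
    backptr = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for s in range(n):
            scores = [delta[r] + safe_log(trans_p[r][s]) for r in range(n)]
            best = max(range(n), key=lambda r: scores[r])
            ptr.append(best)
            new_delta.append(scores[best] + safe_log(emit_p[s][o]))
        delta = new_delta
        backptr.append(ptr)
    last = max(range(n), key=lambda s: delta[s])
    path = [last]
    for ptr in reversed(backptr):  # follow back-pointers to the start
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]

def forward_log(obs, start_p, trans_p, emit_p):
    """Return log P(obs) using the forward recursion in log space."""
    n = len(start_p)
    alpha = [safe_log(start_p[s]) + safe_log(emit_p[s][obs[0]]) for s in range(n)]
    for o in obs[1:]:
        alpha = [
            log_sum_exp([alpha[r] + safe_log(trans_p[r][s]) for r in range(n)])
            + safe_log(emit_p[s][o])
            for s in range(n)
        ]
    return log_sum_exp(alpha)

# Hypothetical toy model: 2 hidden states, 3 observation symbols.
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

path, logp = viterbi_log([0, 1, 2], START, TRANS, EMIT)
```

For the toy model, the decoded path is [0, 0, 1] with probability exp(logp) = 0.01512. A sequence of several thousand observations would underflow to zero if the probabilities were multiplied directly in double precision, whereas forward_log still returns a finite log-probability.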

Hidden Markov models in biological sequence analysis

IBM Journal of Research and Development ◽

10.1147/rd.453.0449 ◽

2001 ◽

Vol 45 (3.4) ◽

pp. 449-454 ◽

Cited By ~ 43

Author(s):

E. Birney

Keyword(s):

Sequence Analysis ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Biological Sequence ◽

Biological Sequence Analysis

Intelligent Data Analysis to Model and Understand Live Cell Time-lapse Sequences

Methods of Information in Medicine ◽

10.3414/me11-02-0041 ◽

2012 ◽

Vol 51 (04) ◽

pp. 332-340 ◽

Cited By ~ 1

Author(s):

A. Paterson ◽

M. Ashtari ◽

D. Ribé ◽

G. Stenbeck ◽

A. Tucker

Keyword(s):

Data Analysis ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Time Lapse ◽

Living Cells ◽

Live Cell ◽

Intelligent Data Analysis ◽

Stable States ◽

Experimental Conditions

Summary. Background: One important aspect of cellular function, which is at the basis of tissue homeostasis, is the delivery of proteins to their correct destinations. Significant advances in live cell microscopy have allowed tracking of these pathways by following the dynamics of fluorescently labelled proteins in living cells. Objectives: This paper explores intelligent data analysis techniques to model the dynamic behavior of proteins in living cells as well as to classify different experimental conditions. Methods: We use a combination of decision tree classification and hidden Markov models. In particular, we introduce a novel approach to “align” hidden Markov models so that hidden states from different models can be cross-compared. Results: Our models capture the dynamics of two experimental conditions accurately with a stable hidden state for control data and multiple (less stable) states for the experimental data recapitulating the behaviour of particle trajectories within live cell time-lapse data. Conclusions: In addition to having successfully developed an automated framework for the classification of protein transport dynamics from live cell time-lapse data our model allows us to understand the dynamics of a complex trafficking pathway in living cells in culture.

High Speed Biological Sequence Analysis With Hidden Markov Models on Reconfigurable Platforms

IEEE Transactions on Information Technology in Biomedicine ◽

10.1109/titb.2007.904632 ◽

2009 ◽

Vol 13 (5) ◽

pp. 740-746 ◽

Cited By ~ 13

Author(s):

T.F. Oliver ◽

B. Schmidt ◽

Y. Jakop ◽

D.L. Maskell

Keyword(s):

Sequence Analysis ◽

Hidden Markov Models ◽

High Speed ◽

Markov Models ◽

Hidden Markov ◽

Biological Sequence ◽

Biological Sequence Analysis ◽

Reconfigurable Platforms

A Structural Parametrization of the Brain Using Hidden Markov Models-Based Paths in Alzheimer’s Disease

International Journal of Neural Systems ◽

10.1142/s0129065716500246 ◽

2016 ◽

Vol 26 (07) ◽

pp. 1650024 ◽

Cited By ~ 16

Author(s):

Francisco J. Martinez-Murcia ◽

Juan M. Górriz ◽

Javier Ramírez ◽

Andres Ortiz

Keyword(s):

Alzheimer’s Disease ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Feature Reduction ◽

Significant Feature ◽

Computational Techniques ◽

Occurrence Matrix ◽

The Brain

The use of biomedical imaging in the diagnosis of dementia is increasingly widespread. A number of works explore the possibilities of computational techniques and algorithms in what is called computer-aided diagnosis. Our work presents an automatic parametrization of the brain structure by means of a path generation algorithm based on hidden Markov models (HMMs). The path is traced using intensity and spatial orientation information at each node, adapting to the structure of the brain. Each path is itself a useful way to characterize the distribution of the tissue inside the magnetic resonance imaging (MRI) image by, for example, extracting the intensity levels at each node or generating statistical information about the tissue distribution. Additionally, further processing, consisting of a modification of the grey level co-occurrence matrix (GLCM), can be used to characterize the textural changes that occur along the path, yielding more meaningful values that could be associated with Alzheimer’s disease (AD), as well as providing a significant feature reduction. This methodology achieves moderate performance, up to 80.3% accuracy using a single path, in differential diagnosis of Alzheimer-affected subjects versus controls from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).

Multivariate longitudinal data analysis with mixed effects hidden Markov models

Biometrics ◽

10.1111/biom.12296 ◽

2015 ◽

Vol 71 (3) ◽

pp. 821-831 ◽

Cited By ~ 5

Author(s):

Jesse D. Raffa ◽

Joel A. Dubin

Keyword(s):

Data Analysis ◽

Longitudinal Data ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Longitudinal Data Analysis ◽

Mixed Effects ◽

Multivariate Longitudinal Data

Propositionalisation of Profile Hidden Markov Models for Biological Sequence Analysis

AI 2008: Advances in Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-540-89378-3_27 ◽

2008 ◽

pp. 278-288 ◽

Cited By ~ 1

Author(s):

Stefan Mutter ◽

Bernhard Pfahringer ◽

Geoffrey Holmes

Keyword(s):

Sequence Analysis ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Biological Sequence ◽

Biological Sequence Analysis ◽

Profile Hidden Markov Models

Semi-supervised learning of Hidden Markov Models for biological sequence analysis

Bioinformatics ◽

10.1093/bioinformatics/bty910 ◽

2018 ◽

Vol 35 (13) ◽

pp. 2208-2215 ◽

Cited By ~ 5

Author(s):

Ioannis A Tamposis ◽

Konstantinos D Tsirigos ◽

Margarita C Theodoropoulou ◽

Panagiota I Kontou ◽

Pantelis G Bagos

Keyword(s):

Sequence Analysis ◽

Supervised Learning ◽

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Transmembrane Protein ◽

Training Data ◽

Supplementary Information ◽

Training Procedure ◽

Partially Labeled Data

Abstract. Motivation: Hidden Markov Models (HMMs) are probabilistic models widely used in computational sequence analysis. HMMs are basically unsupervised models. However, in the most important applications, they are trained in a supervised manner. Training examples accompanied by labels corresponding to different classes are given as input, and the set of parameters that maximizes the joint probability of sequences and labels is estimated. A major problem with this approach is that, in the majority of cases, labels are hard to find and thus the amount of training data is limited. On the other hand, there are plenty of unclassified (unlabeled) sequences deposited in the public databases that could potentially contribute to the training procedure. Exploiting them together with labeled data is called semi-supervised learning and could be very helpful in many applications. Results: We propose here a method for semi-supervised learning of HMMs that can incorporate labeled, unlabeled and partially labeled data in a straightforward manner. The algorithm is based on a variant of the Expectation-Maximization (EM) algorithm, where the missing labels of the unlabeled or partially labeled data are considered as the missing data. We apply the algorithm to several biological problems, namely the prediction of transmembrane protein topology for alpha-helical and beta-barrel membrane proteins and the prediction of archaeal signal peptides. The results are very promising, since the algorithms presented here can significantly improve the prediction performance of even the top-scoring classifiers. Supplementary information: Supplementary data are available at Bioinformatics online.
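One common way to treat missing labels as missing data is to run forward-backward while allowing, at each labeled position, only the states whose label matches. The sketch below shows such a constrained E-step under stated assumptions; it is not claimed to be the algorithm of the paper above. The function name, the one-label-per-state mapping, and the toy model are hypothetical, and for brevity it works in plain probability space, which is only safe for short sequences.

```python
def constrained_posteriors(obs, labels, state_label, start_p, trans_p, emit_p):
    """Posterior state probabilities given observations and partial labels.

    labels[t] is a label string, or None where the position is unlabeled.
    state_label[s] is the label carried by hidden state s (an assumption).
    """
    n, T = len(start_p), len(obs)

    def allowed(t, s):
        # A state may be used at t if the position is unlabeled or labels match.
        return labels[t] is None or labels[t] == state_label[s]

    # Forward pass: disallowed states keep probability zero.
    fwd = [[0.0] * n for _ in range(T)]
    for s in range(n):
        if allowed(0, s):
            fwd[0][s] = start_p[s] * emit_p[s][obs[0]]
    for t in range(1, T):
        for s in range(n):
            if allowed(t, s):
                inflow = sum(fwd[t - 1][r] * trans_p[r][s] for r in range(n))
                fwd[t][s] = inflow * emit_p[s][obs[t]]

    # Backward pass with the same masking.
    bwd = [[0.0] * n for _ in range(T)]
    for s in range(n):
        if allowed(T - 1, s):
            bwd[T - 1][s] = 1.0
    for t in range(T - 2, -1, -1):
        for s in range(n):
            if allowed(t, s):
                bwd[t][s] = sum(
                    trans_p[s][r] * emit_p[r][obs[t + 1]] * bwd[t + 1][r]
                    for r in range(n)
                )

    total = sum(fwd[T - 1])  # joint probability of observations and given labels
    if total == 0.0:
        raise ValueError("labels are impossible under the model")
    return [[fwd[t][s] * bwd[t][s] / total for s in range(n)] for t in range(T)]

# Hypothetical toy model: state 0 carries label "H", state 1 carries label "F".
STATE_LABEL = ["H", "F"]
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]

# Only the middle position is labeled; the others are treated as missing data.
post = constrained_posteriors([0, 1, 2], [None, "F", None],
                              STATE_LABEL, START, TRANS, EMIT)
```

With labels set to None everywhere this reduces to ordinary forward-backward, and with every position labeled it reduces to the supervised case. An M-step would then re-estimate the parameters from these posteriors, pooling labeled, unlabeled and partially labeled sequences.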

Performance comparison between semicontinuous and discrete hidden Markov models of speech

Electronics Letters ◽

10.1049/el:19880099 ◽

1988 ◽

Vol 24 (3) ◽

pp. 149 ◽

Cited By ~ 9

Author(s):

X.D. Huang ◽

M.A. Jack

Keyword(s):

Hidden Markov Models ◽

Markov Models ◽

Hidden Markov ◽

Performance Comparison

Fuzzy Profile Hidden Markov Models for Protein Sequence Analysis

2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology ◽

10.1109/cibcb.2005.1594950 ◽

2005 ◽

Author(s):

N.P. Bidargaddi ◽

M. Chetty ◽

J. Kamruzzaman

Keyword(s):

Sequence Analysis ◽

Hidden Markov Models ◽

Protein Sequence ◽

Markov Models ◽

Hidden Markov ◽

Protein Sequence Analysis ◽

Profile Hidden Markov Models
