Modern Computational Techniques for the HMMER Sequence Analysis

2013 ◽  
Vol 2013 ◽  
pp. 1-13 ◽  
Author(s):  
Xiandong Meng ◽  
Yanqing Ji

This paper reviews recent research on modern computing architectures and on software- and hardware-accelerated algorithms for bioinformatics data analysis, with an emphasis on one of the most important sequence analysis applications: hidden Markov models (HMMs). We present a detailed performance comparison of sequence analysis tools on various computing platforms recently developed in the bioinformatics community. The data- and compute-intensive nature of sequence analysis makes it an attractive target for optimization and parallelization, using both traditional software approaches and innovative hardware acceleration technologies.

2017 ◽  
Vol 65 (6) ◽  
pp. 935-947
Author(s):  
M. Pietras ◽  
P. Klęsk

Abstract: This paper presents a programmable system-on-chip implementation for accelerating computations within hidden Markov models. High-level synthesis (HLS) and “divide-and-conquer” approaches are presented for parallelizing the Baum-Welch and Viterbi algorithms. To avoid arithmetic underflow, all computations are performed in logarithmic space. Additionally, to carry out computations efficiently (i.e. directly in an FPGA system or a processor cache), we propose reducing the floating-point representations of HMMs. We state and prove a lemma about the length of numerically unsafe sequences for such reduced-precision models. Finally, special attention is devoted to the design of a multiple logarithm and exponent approximation unit (MLEAU). Using associative mapping, this unit allows simultaneous conversion of multiple values and thereby compensates for the computational cost of logarithmic-space operations. Design evaluation reveals the absolute stall delay caused by multiple hardware conversions to logarithms and to exponents, and experimental evaluation reveals the computation boundaries of HMMs related to their probabilities and floating-point representation. The performance differences at each stage of computation are summarized in a comparison between hardware acceleration using the MLEAU and a typical software implementation on an ARM or Intel processor.
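The log-space idea mentioned in this abstract can be illustrated in software. The following is a minimal sketch of a textbook log-space Viterbi recursion, not the authors' FPGA design or their MLEAU unit; the function names `safe_log` and `viterbi_log` and the toy parameters in the usage note are assumptions made for illustration.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Return log(p), mapping p == 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF

def viterbi_log(obs, start_p, trans_p, emit_p):
    """Most likely state path, computed entirely with log-probabilities.

    obs      : list of observation indices
    start_p  : start_p[i]    = P(first state is i)
    trans_p  : trans_p[i][j] = P(next state is j | current state is i)
    emit_p   : emit_p[i][o]  = P(observation o | state i)
    """
    n = len(start_p)
    # Products of probabilities become sums of logs, so long sequences do not underflow.
    score = [safe_log(start_p[i]) + safe_log(emit_p[i][obs[0]]) for i in range(n)]
    back = []  # back[t][j] = best predecessor of state j at step t + 1
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prev[i] + safe_log(trans_p[i][j]))
            score.append(prev[best_i] + safe_log(trans_p[best_i][j]) + safe_log(emit_p[j][o]))
            ptr.append(best_i)
        back.append(ptr)
    # Trace the best path backwards from the best final state.
    last = max(range(n), key=lambda i: score[i])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, score[last]
```

For a two-state toy model and a sequence of 10,000 observations, the path probability is far too small to represent as a double, yet the returned log-score stays finite, which is the reason for working in logarithmic space.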


2012 ◽  
Vol 51 (04) ◽  
pp. 332-340 ◽  
Author(s):  
A. Paterson ◽  
M. Ashtari ◽  
D. Ribé ◽  
G. Stenbeck ◽  
A. Tucker

Summary. Background: One important aspect of cellular function, which is at the basis of tissue homeostasis, is the delivery of proteins to their correct destinations. Significant advances in live cell microscopy have allowed tracking of these pathways by following the dynamics of fluorescently labelled proteins in living cells. Objectives: This paper explores intelligent data analysis techniques to model the dynamic behavior of proteins in living cells as well as to classify different experimental conditions. Methods: We use a combination of decision tree classification and hidden Markov models. In particular, we introduce a novel approach to “align” hidden Markov models so that hidden states from different models can be cross-compared. Results: Our models capture the dynamics of two experimental conditions accurately with a stable hidden state for control data and multiple (less stable) states for the experimental data recapitulating the behaviour of particle trajectories within live cell time-lapse data. Conclusions: In addition to having successfully developed an automated framework for the classification of protein transport dynamics from live cell time-lapse data our model allows us to understand the dynamics of a complex trafficking pathway in living cells in culture.


2016 ◽  
Vol 26 (07) ◽  
pp. 1650024 ◽  
Author(s):  
Francisco J. Martinez-Murcia ◽  
Juan M. Górriz ◽  
Javier Ramírez ◽  
Andres Ortiz

The use of biomedical imaging in the diagnosis of dementia is increasingly widespread. A number of works explore the possibilities of computational techniques and algorithms in what is called computer-aided diagnosis. Our work presents an automatic parametrization of the brain structure by means of a path generation algorithm based on hidden Markov models (HMMs). The path is traced using intensity and spatial orientation information at each node, adapting to the structure of the brain. Each path is itself a useful way to characterize the distribution of the tissue inside the magnetic resonance imaging (MRI) image by, for example, extracting the intensity levels at each node or generating statistical information about the tissue distribution. Additionally, further processing based on a modification of the grey level co-occurrence matrix (GLCM) can be used to characterize the textural changes that occur along the path, yielding more meaningful values that could be associated with Alzheimer’s disease (AD), as well as providing a significant feature reduction. This methodology achieves moderate performance, with an accuracy of up to 80.3% using a single path in differential diagnosis of Alzheimer-affected subjects versus controls from the Alzheimer’s Disease Neuroimaging Initiative (ADNI).
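The abstract does not specify how the GLCM is modified, so the following is only a rough sketch of the general idea under stated assumptions: grey levels are quantized into a few bins, co-occurrences are counted between consecutive nodes of a path, and a Haralick-style contrast value is derived. The function names `path_cooccurrence` and `contrast`, the number of levels, and the 0 to 255 intensity range are all hypothetical choices, not the authors' method.

```python
def path_cooccurrence(intensities, levels=4, max_value=255):
    """Count co-occurrences of quantized grey levels at consecutive path nodes."""
    # Quantize each intensity in [0, max_value] into one of `levels` bins.
    q = [min(int(v * levels / (max_value + 1)), levels - 1) for v in intensities]
    matrix = [[0] * levels for _ in range(levels)]
    # Each pair of neighbouring nodes along the path adds one count.
    for a, b in zip(q, q[1:]):
        matrix[a][b] += 1
    return matrix

def contrast(matrix):
    """Contrast of the normalized matrix: sum of p(i, j) * (i - j)^2."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    weighted = sum(c * (i - j) ** 2
                   for i, row in enumerate(matrix)
                   for j, c in enumerate(row))
    return weighted / total
```

A path through uniform tissue gives a contrast near zero, while a path crossing many intensity boundaries gives a larger value, so a single number per path can summarize textural change.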


2018 ◽  
Vol 35 (13) ◽  
pp. 2208-2215 ◽  
Author(s):  
Ioannis A Tamposis ◽  
Konstantinos D Tsirigos ◽  
Margarita C Theodoropoulou ◽  
Panagiota I Kontou ◽  
Pantelis G Bagos

Abstract. Motivation: Hidden Markov Models (HMMs) are probabilistic models widely used in computational sequence analysis. HMMs are basically unsupervised models. However, in the most important applications, they are trained in a supervised manner: training examples accompanied by labels corresponding to different classes are given as input, and the set of parameters that maximizes the joint probability of sequences and labels is estimated. A main problem with this approach is that, in the majority of cases, labels are hard to find and thus the amount of training data is limited. On the other hand, there are plenty of unclassified (unlabeled) sequences deposited in public databases that could potentially contribute to the training procedure. Using both kinds of data is called semi-supervised learning and could be very helpful in many applications. Results: Here we propose a method for semi-supervised learning of HMMs that can incorporate labeled, unlabeled and partially labeled data in a straightforward manner. The algorithm is based on a variant of the Expectation-Maximization (EM) algorithm, in which the missing labels of the unlabeled or partially labeled data are treated as missing data. We apply the algorithm to several biological problems, namely the prediction of transmembrane protein topology for alpha-helical and beta-barrel membrane proteins and the prediction of archaeal signal peptides. The results are very promising, since the algorithms presented here can significantly improve the prediction performance of even the top-scoring classifiers. Supplementary information: Supplementary data are available at Bioinformatics online.
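One common way to treat missing labels as missing data, which may differ from the authors' EM variant, is to restrict the forward recursion at labeled positions to states that carry the observed label, and to leave unlabeled positions unrestricted. The sketch below shows only this constrained forward pass (the likelihood part of an E-step), in log-space; the names `constrained_forward`, `state_label` and the parameter layout are assumptions for illustration.

```python
import math

NEG_INF = float("-inf")

def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))

def constrained_forward(obs, labels, state_label, log_start, log_trans, log_emit):
    """Log-likelihood of obs jointly with whatever labels are known.

    labels[t]      : label at position t, or None when position t is unlabeled
    state_label[i] : label attached to hidden state i
    log_start, log_trans, log_emit : log-probabilities of the HMM parameters
    """
    n = len(state_label)

    def allowed(i, t):
        # An unlabeled position permits every state; a labeled one only matching states.
        return labels[t] is None or state_label[i] == labels[t]

    alpha = [log_start[i] + log_emit[i][obs[0]] if allowed(i, 0) else NEG_INF
             for i in range(n)]
    for t in range(1, len(obs)):
        alpha = [
            logsumexp([alpha[i] + log_trans[i][j] for i in range(n)]) + log_emit[j][obs[t]]
            if allowed(j, t) else NEG_INF
            for j in range(n)
        ]
    return logsumexp(alpha)
```

With every label set to `None` this reduces to the ordinary forward algorithm, and with every label given it yields the joint probability of sequence and labels, so labeled, unlabeled and partially labeled sequences can pass through the same routine.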

