Detection of structural variations in densely-labelled optical DNA barcodes: A hidden Markov model approach

PLoS ONE ◽  
2021 ◽  
Vol 16 (11) ◽  
pp. e0259670
Author(s):  
Albertas Dvirnas ◽  
Callum Stewart ◽  
Vilhelm Müller ◽  
Santosh Kumar Bikkarolla ◽  
Karolin Frykholm ◽  
...  

Large-scale genomic alterations play an important role in disease, gene expression, and chromosome evolution. Optical DNA mapping (ODM), commonly categorized into sparsely-labelled ODM and densely-labelled ODM, provides sequence-specific continuous intensity profiles (DNA barcodes) along single DNA molecules and is a technique well-suited for detecting such alterations. For sparsely-labelled barcodes, the possibility to detect large genomic alterations has been investigated extensively, while densely-labelled barcodes have not received as much attention. In this work, we introduce HMMSV, a hidden Markov model (HMM) based algorithm for detecting structural variations (SVs) directly in densely-labelled barcodes without access to sequence information. We evaluate our approach using simulated data-sets with 5 different types of SVs, and combinations thereof, and demonstrate that the method reaches a true positive rate greater than 80% for randomly generated barcodes with single variations of size 25 kilobases (kb). Increasing the length of the SV further leads to larger true positive rates. For a real data-set with experimental barcodes on bacterial plasmids, we successfully detect matching barcode pairs and SVs without any particular assumption of the types of SVs present. Instead, our method effectively goes through all possible combinations of SVs. Since ODM works on length scales typically not reachable with other techniques, our methodology is a promising tool for identifying arbitrary combinations of genomic alterations.
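To make the HMM idea concrete, the following is a minimal sketch of Viterbi decoding on a toy two-state model. It is not the HMMSV algorithm: the state names (`match`, `variation`), the discretized observations (`similar`, `different`), and all probabilities are assumptions chosen only for illustration.

```python
import math

def viterbi(obs, states, start, trans, emit):
    """Return the most likely hidden-state path (computed in log space)."""
    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}]
    back = []  # back[t][s]: predecessor of s on the best path at step t + 1
    for o in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[-1][p] + math.log(trans[p][s]))
            scores[s] = best[-1][prev] + math.log(trans[prev][s]) + math.log(emit[s][o])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical toy model: "sticky" states, noisy local similarity readings
states = ["match", "variation"]
start = {"match": 0.9, "variation": 0.1}
trans = {"match": {"match": 0.9, "variation": 0.1},
         "variation": {"match": 0.1, "variation": 0.9}}
emit = {"match": {"similar": 0.9, "different": 0.1},
        "variation": {"similar": 0.2, "different": 0.8}}

obs = ["similar"] * 3 + ["different"] * 3 + ["similar"] * 3
path = viterbi(obs, states, start, trans, emit)
print(path)
```

In this toy setting the decoded path labels the run of `different` observations as `variation` and the flanking regions as `match`, which is the kind of segmentation an HMM-based detector relies on.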

2016 ◽  
Author(s):  
Hong Gao ◽  
Hua Tang ◽  
Carlos Bustamante

With the rapid production of high-dimensional genetic data, one major challenge in genome-wide association studies is to develop effective and efficient statistical tools that address the low power of detecting causal SNPs with low to moderate susceptibility, whose effects are often obscured by substantial background noise. Here we present a novel method that serves as an optimal technique for reducing background noise and improving detection power in genome-wide association studies. The approach uses a hidden Markov model and its derivative, the Markov hidden Markov model, to estimate the posterior probability of a marker being in an associated state. We conducted extensive simulations based on human whole-genome genotype data from the GlaxoSmithKline-POPRES project to calibrate the sensitivity and specificity of our method, and compared it with popular approaches for detecting positive signals, including the χ^2 test for association and the Cochran-Armitage trend test. Our simulation results suggest that at very low false positive rates (<10^-6), our method reaches a power of 0.9 and is more powerful than the other approaches when the allelic effect of the causal variant is non-additive or unknown. Application of our method to the data set generated by the Wellcome Trust Case Control Consortium, using 14,000 cases and 3,000 controls, confirmed its power and efficiency in the context of large-scale genome-wide association studies.
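The posterior probability of a marker being in an associated state can be illustrated with the standard forward-backward algorithm for a plain HMM. The sketch below is an assumption-laden toy, not the authors' Markov hidden Markov model: the states (`null`, `associated`), the discretized test statistic (`low`, `high`), and every probability are hypothetical.

```python
def posterior_state_probs(obs, states, start, trans, emit):
    """Forward-backward: P(state at position t | all observations), for each t."""
    n = len(obs)
    fwd, scale, prev = [], [], None
    # Forward pass, normalised at every step to avoid numerical underflow
    for t, o in enumerate(obs):
        row = {}
        for s in states:
            prior = start[s] if t == 0 else sum(prev[p] * trans[p][s] for p in states)
            row[s] = prior * emit[s][o]
        c = sum(row.values())
        prev = {s: v / c for s, v in row.items()}
        fwd.append(prev)
        scale.append(c)
    # Backward pass, using the same scaling constants
    bwd = [None] * n
    bwd[-1] = {s: 1.0 for s in states}
    for t in range(n - 2, -1, -1):
        bwd[t] = {
            s: sum(trans[s][q] * emit[q][obs[t + 1]] * bwd[t + 1][q] for q in states)
            / scale[t + 1]
            for s in states
        }
    # Combine both passes and renormalise
    post = []
    for t in range(n):
        row = {s: fwd[t][s] * bwd[t][s] for s in states}
        z = sum(row.values())
        post.append({s: v / z for s, v in row.items()})
    return post

# Hypothetical toy parameters
states = ["null", "associated"]
start = {"null": 0.9, "associated": 0.1}
trans = {"null": {"null": 0.9, "associated": 0.1},
         "associated": {"null": 0.2, "associated": 0.8}}
emit = {"null": {"low": 0.8, "high": 0.2},
        "associated": {"low": 0.3, "high": 0.7}}

obs = ["low", "low", "high", "high", "high", "low", "low"]
post = posterior_state_probs(obs, states, start, trans, emit)
for o, p in zip(obs, post):
    print(o, round(p["associated"], 3))
```

Because the posterior at each marker pools evidence from neighbouring markers, an isolated noisy reading is weighted less than a consistent run of elevated readings, which is the noise-reduction effect the abstract describes.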


2013 ◽  
Vol 4 (1) ◽  
pp. 81-102 ◽  
Author(s):  
Arindam Kar ◽  
Debotosh Bhattacharjee ◽  
Mita Nasipuri ◽  
Dipak Kumar Basu ◽  
Mahantapas Kundu

This paper introduces a novel methodology that combines the multi-resolution feature of the Gabor wavelet transformation (GWT) with the local interactions of the facial structures expressed through the Pseudo Hidden Markov Model (PHMM). Unlike the traditional zigzag scanning method for feature extraction, a continuous scanning method is proposed for better feature selection: it runs from the top-left corner to the right, then top-down, then right to left, and so on until the bottom-right of the image (i.e., a spiral scanning technique). Unlike traditional HMMs, the proposed PHMM does not assume state-conditional independence of the visible observation sequence. This is achieved via the concept of local structures introduced by the PHMM, which is used to extract facial bands and automatically select the most informative features of a face image. Thus, the long-range dependency problem inherent to traditional HMMs is drastically reduced. In addition, using the most informative pixels rather than the whole image makes the proposed method reasonably fast for face recognition. The method has been successfully tested on frontal face images from the ORL, FRAV2D, and FERET face databases, where the images vary in pose, illumination, expression, and scale. The FERET data set contains 2200 frontal face images of 200 subjects, the FRAV2D data set consists of 1100 images of 100 subjects, and the full ORL database is considered. The reported results are far better than those of recent and widely cited systems.


Author(s):  
Drinold Mbete ◽  
Kennedy Nyongesa ◽  
Joseph Rotich

Clinical study of malaria presents a modeling challenge, as patients' disease status and progress are only partially observed and are assessed at discrete clinic visit times. Since patients initiate visits based on symptoms, intense research has focused on the identification of reliable predictors of exposure, susceptibility to infection, and development of severe malaria complications. Despite detailed literature on malaria infection and transmission, very little has been documented on malaria symptoms modeling, yet these symptoms are common. Furthermore, imperfect diagnostic tests may yield misclassification of observed symptoms. Place and Duration of Study: The main objective of this study is to develop a Bayesian Hidden Markov Model of malaria symptoms in the Masinde Muliro University of Science and Technology student population. An expression of the Hidden Markov Model is developed and the parameters are estimated through the forward-backward algorithm.


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Yanjiao Chen

Music multimedia is one of the more popular types of digital music. This article proposes an automatic classification method for music multimedia based on the hidden Markov model (HMM). The method not only analyzes the characteristics of traditional music in detail but also fully considers the important characteristics of other music. At the same time, it uses bagging to train two groups of HMMs and automatically classifies them to achieve a better classification effect. This paper optimizes the variable parameters from different aspects, such as model structure, data form, and model change, to obtain the optimal HMM parameter values. The method not only considers the prior knowledge of feature words, word frequency, and number of documents but also fuses the meaning of the feature words into the hidden Markov classification model. Finally, tests of the hidden Markov model on the music multimedia data set show that the method can effectively perform automatic classification according to the melody characteristics of music multimedia.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Shuo Shi ◽  
Shuting Xi ◽  
Sang-Bing Tsai

Accompaniment production is one of the most important elements of a musical work, and chord arrangement is the key step in accompaniment production, which usually requires considerable musical talent and profound knowledge of music theory. In this article, a machine learning model replaces the manual arrangement of accompaniment chords, providing an automatic computer-based means to complete and assist accompaniment chord arrangement. Through music feature extraction, automatic chord label construction, and model construction and training, the system acquires the ability to arrange accompaniment chords automatically for a main melody. Based on research into automatic chord label construction and the characteristics of the MIDI data format, a chord analysis method based on interval difference is proposed to construct chord labels for the whole track automatically. In this study, the hidden Markov model is constructed according to the chord types; the input features are the improved theme PCP features proposed in this paper, and the input labels are the label data set constructed by the proposed automated method. After training, the improved PCP features of the theme to be predicted are input to generate the accompaniment chords of the final arrangement. Through PCP features and a template-matching model, the system improves the matching accuracy of the generated chords compared with the traditional method.


Credit card fraud refers to the physical loss of a credit card or the loss of sensitive credit card data. Several text mining procedures can be used for detection. This investigation reviews several algorithms that can be used to classify transactions as fraudulent or genuine. This paper examines the possibility of fraudulent transactions given the prevalence of credit card usage. A credit card fraud dataset was used in the investigation. Since the dataset was highly imbalanced, SMOTE (Synthetic Minority Oversampling Technique) was applied for oversampling. In addition, features were selected, and the data set was divided into two parts, training data and test data. In this paper, the Advanced Super Gradient Boosting-based Text mining Algorithm (ASGB) is suggested to detect fraudulent transactions among credit card transactions. ASGB is a decision-tree-based ensemble text mining algorithm that utilizes a gradient boosting framework. In prediction problems involving unstructured data (images, text, etc.), artificial neural networks tend to outperform all other algorithms or frameworks. The algorithms used in the experiment were the Hidden Markov Model, Random Forest, Gradient Boosting, and Enhanced Hidden Markov Model. The experimental results show that a well-tuned ASGB classifier outperforms all of them. It achieves a precision of 99.1%, a recall of 99.8%, and an F-measure of 99.5%.
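As a quick arithmetic check on the reported metrics, the sketch below computes the F-measure from precision and recall. It assumes the abstract's F-measure is the balanced F1 score (the harmonic mean of the two), which the abstract does not state explicitly.

```python
def f_measure(precision, recall):
    """Balanced F-measure (F1): the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract, written as fractions
f1 = f_measure(0.991, 0.998)
print(round(f1, 4))
```

Under that assumption the result is about 0.9945, in line with the reported F-measure of 99.5% once the rounding of the inputs is taken into account.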


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Li Liu ◽  
Dashi Luo ◽  
Ming Liu ◽  
Jun Zhong ◽  
Ye Wei ◽  
...  

Microblogging is increasingly becoming one of the most popular online social media for people to express ideas and emotions. The amount of socially generated content from this medium is enormous. Text mining techniques have been intensively applied to discover the hidden knowledge and emotions in this huge dataset. In this paper, we propose a modified version of the hidden Markov model (HMM) classifier, called self-adaptive HMM, whose parameters are optimized by Particle Swarm Optimization algorithms. Since manually labeling a large-scale dataset is difficult, we also employ entropy to decide whether a new unlabeled tweet should be added to the training dataset after being assigned an emotion by our HMM-based approach. In the experiment, we collected about 200,000 Chinese tweets from Sina Weibo. The results show that the F-score of our approach reaches 76% on happiness and fear and 65% on anger, surprise, and sadness. In addition, the self-adaptive HMM classifier outperforms Naive Bayes and Support Vector Machine on recognition of happiness, anger, and sadness.
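One plausible reading of the entropy criterion is sketched below: a newly labeled tweet joins the training set only when the classifier's predicted emotion distribution has low Shannon entropy, that is, when the prediction is confident. The abstract does not give the exact rule, so the function names, the threshold `max_entropy`, and the example probabilities are all assumptions.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def accept_for_training(class_probs, max_entropy=1.0):
    """Hypothetical rule: keep a self-labeled tweet only if the prediction is confident."""
    return shannon_entropy(class_probs.values()) <= max_entropy

# Hypothetical classifier outputs for two unlabeled tweets
confident = {"happiness": 0.97, "anger": 0.01, "sadness": 0.01, "fear": 0.01}
uncertain = {"happiness": 0.25, "anger": 0.25, "sadness": 0.25, "fear": 0.25}

print(accept_for_training(confident))
print(accept_for_training(uncertain))
```

A uniform distribution over four emotions has an entropy of 2 bits, so it is rejected under the assumed 1-bit threshold, while the sharply peaked prediction is accepted.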

