A Hidden Markov Model for Investigating Recent Positive Selection through Haplotype Structure

2014 ◽  
Author(s):  
Hua Chen ◽  
Jody Hey ◽  
Montgomery Slatkin

Recent positive selection can increase the frequency of an advantageous mutant rapidly enough that a relatively long ancestral haplotype remains intact around it. We present a hidden Markov model (HMM) to identify such haplotype structures. Given the HMM-identified haplotype structures, a population genetic model for the extent of ancestral haplotypes is then used to infer the selection intensity and the allele age. Simulations show that this method can detect selection under a wide range of conditions and has higher power than the existing frequency spectrum-based method. In addition, it provides good estimates of the selection coefficients and allele ages for strong selection. The method analyzes large data sets in a reasonable amount of running time. The method is applied to HapMap III data in a genome scan and identifies a list of candidate regions putatively under recent positive selection. It is also applied to several genes known to be under recent positive selection, including the LCT, KITLG and TYRP1 genes in Northern Europeans, and OCA2 in East Asians, to estimate their allele ages and selection coefficients.
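To make the decoding step concrete, here is a minimal sketch of Viterbi decoding for a two-state HMM that labels each marker along a chromosome as lying on the intact ancestral haplotype or on the recombined background. The state names, the 0/1 encoding (0 = allele matches the core haplotype, 1 = mismatch), and all probabilities are illustrative assumptions, not the parameterization used by the authors.

```python
import math

# Hypothetical two-state model; every number below is an assumption for illustration.
STATES = ("ancestral", "background")
START_P = {"ancestral": 0.5, "background": 0.5}
TRANS_P = {
    "ancestral": {"ancestral": 0.9, "background": 0.1},
    "background": {"ancestral": 0.1, "background": 0.9},
}
# Observation 0 = marker matches the core haplotype, 1 = mismatch.
EMIT_P = {
    "ancestral": {0: 0.95, 1: 0.05},
    "background": {0: 0.5, 1: 0.5},
}


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a non-empty observation list."""
    # Log space avoids numerical underflow on long marker sequences.
    score = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        score, pointers = {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] + math.log(trans_p[r][s]))
            score[s] = prev[best] + math.log(trans_p[best][s]) + math.log(emit_p[s][o])
            pointers[s] = best
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

Calling `viterbi([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], STATES, START_P, TRANS_P, EMIT_P)` segments the markers into an ancestral run followed by a background run; the length of such a run is the kind of quantity a downstream population genetic model could take as input.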

2018 ◽  
Vol 8 (12) ◽  
pp. 2421 ◽  
Author(s):  
Chongya Song ◽  
Alexander Pons ◽  
Kang Yen

In the field of network intrusion, malware usually evades anomaly detection by disguising malicious behavior as legitimate access. Detecting these attacks from network traffic has therefore become a challenge in this adversarial setting. In this paper, an enhanced Hidden Markov Model, called the Anti-Adversarial Hidden Markov Model (AA-HMM), is proposed to detect evasion patterns effectively, using the Dynamic Window and Threshold techniques to achieve adaptive, anti-adversarial, and online-learning abilities. In addition, a concept called Pattern Entropy is defined and acts as the foundation of AA-HMM. We evaluate the effectiveness of our approach on two well-known benchmark data sets, NSL-KDD and CTU-13, in terms of common performance metrics and the algorithm’s adaptation and anti-adversary abilities.
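The abstract does not define Pattern Entropy or the Dynamic Window, so the following is only a generic sketch of a related idea: computing the Shannon entropy of event labels inside a fixed-size sliding window over a traffic stream. The function names and the fixed window are assumptions made for illustration and should not be read as the AA-HMM definitions.

```python
import math
from collections import Counter


def shannon_entropy(events):
    """Shannon entropy (in bits) of the empirical distribution of event labels."""
    counts = Counter(events)
    total = len(events)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def windowed_entropy(events, window):
    """Entropy of every contiguous window of the given size, in stream order."""
    return [
        shannon_entropy(events[i:i + window])
        for i in range(len(events) - window + 1)
    ]
```

A detector built on this idea might compare each window's entropy against a threshold; how that threshold and the window size adapt over time is exactly what the paper's Dynamic Window and Threshold techniques address, and it is not reproduced here.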


2018 ◽  
Vol 7 (2.32) ◽  
pp. 153
Author(s):  
N Arunachalam ◽  
P Prabavathy ◽  
S Priyatharshini

Credit card fraud detection raises unique challenges because transaction data are streaming, imbalanced, and non-stationary. It also involves an active learning step, since the labels (fraudulent or genuine) for a subset of transactions are obtained in near-real time through human investigators who contact the cardholders. In this paper, the Hidden Markov Model (HMM) algorithm is applied to sequences of credit card operations during transaction processing, so that fraud can be detected by the fraud detection model while a transaction is processed. The HMM, the fraud detection model, and image processing play an important role in detecting credit card fraud in online transactions. The most challenging aspect of fraud detection is a data problem, for two major reasons: first, the profiles of both normal and fraudulent cardholder behavior change constantly; second, credit card fraud data sets are highly imbalanced. With the fraud detection (FD) algorithm, detection performance on credit card transactions is strongly affected by the sampling approach applied to the data set and by the choice of HMM and fraud detection model. An image technique is also used with the FD algorithm. Reliable augmentation of the scarce target population of fraud cases is important given issues such as labeling cost, the choice of algorithm (HMM, fraud detection), and outliers in the streamed data source. We examine several scenarios that show the feasibility of improving detection capability, evaluated by means of receiver operating characteristic (ROC) curves and several key performance indicators (KPIs) commonly used in the financial business.
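One common way to use an HMM on transaction sequences is to score each new sequence by its likelihood under a model of the cardholder's usual behavior and to flag sequences whose likelihood is unusually low. The sketch below implements the scaled forward algorithm for that scoring step. The two hidden states, the three spending categories (0 = low, 1 = medium, 2 = high), and every probability are assumptions chosen for illustration; the abstract does not specify the authors' model.

```python
import math

# Hypothetical cardholder profile; all values are illustrative assumptions.
START_P = [0.8, 0.2]                      # initial hidden-state distribution
TRANS_P = [[0.9, 0.1], [0.3, 0.7]]        # TRANS_P[i][j] = P(next = j | current = i)
EMIT_P = [[0.7, 0.2, 0.1],                # state 0: mostly low-value purchases
          [0.2, 0.3, 0.5]]                # state 1: mostly high-value purchases


def forward_loglik(obs, start_p, trans_p, emit_p):
    """Log-likelihood of a non-empty observation list via the scaled forward algorithm."""
    n = len(start_p)
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans_p[i][j] for i in range(n)) * emit_p[j][obs[t]]
                for j in range(n)
            ]
        # Rescale so alpha sums to 1; the log of the scale factors accumulates the likelihood.
        scale = sum(alpha)
        loglik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return loglik
```

Under these assumed parameters, a run of mostly low-value purchases such as `[0, 0, 1, 0, 0]` scores a higher log-likelihood than a burst of high-value purchases `[2, 2, 2, 2, 2]`; a practical system would still need a decision threshold, which could be tuned with the ROC analysis the abstract mentions.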


Inconsistency is a major problem for information security in computer systems, and it takes two forms: data inconsistency and application inconsistency. Both arise from poorly structured program design, and they create security breaches and vulnerable entry points through which application code can be exploited. These anomalies can be discovered by designing anomaly detection system (ADS) models at the system programming (coding) level with the help of machine learning. Security vulnerabilities (anomalies) frequently occur during code execution through the exploitation or manipulation of instructions. In this paper, we specify various extensions to our work for detecting a wide range of anomalies in code exploits; a machine learning technique called the Context Sensitive-Hidden Markov Model (CS-HMM) improves the overall performance of the ADS by discovering correlations between control data instances. We use Linux OS tracing kits to collect the necessary information, such as control data instances (return addresses), from the system as part of artificial learning. The results are evaluated in practice on various programs developed for this work and on some Linux commands used for tracing, and performance is compared across all of the input data sets, which are generated live (artificially). The CS-HMM is then applied to the data sets to examine anomalies through similarity search and correlation of the program's function control data, and a classification process determines the anomalous outcomes.


2008 ◽  
Vol 06 (02) ◽  
pp. 387-401 ◽  
Author(s):  
ZOI I. LITOU ◽  
PANTELIS G. BAGOS ◽  
KONSTANTINOS D. TSIRIGOS ◽  
THEODORE D. LIAKOPOULOS ◽  
STAVROS J. HAMODRAKAS

Surface proteins in Gram-positive bacteria are frequently implicated in virulence. We have focused on a group of extracellular cell wall-attached proteins (CWPs), containing an LPXTG motif for cleavage and covalent coupling to peptidoglycan by sortase enzymes. A hidden Markov model (HMM) approach for predicting the LPXTG-anchored cell wall proteins of Gram-positive bacteria was developed and compared against existing methods. The HMM model is parsimonious in terms of the number of freely estimated parameters, and it has proved to be very sensitive and specific in a training set of 55 experimentally verified LPXTG-anchored cell wall proteins as well as in reliable data sets of globular and transmembrane proteins. In order to identify such proteins in Gram-positive bacteria, a comprehensive analysis of 94 completely sequenced genomes has been performed. We identified, in total, 860 LPXTG-anchored cell wall proteins, a number that is significantly higher compared to those obtained by other available methods. Of these proteins, 237 are hypothetical proteins according to the annotation of SwissProt, and 88 had no homologs in the SwissProt database — this might be evidence that they are members of newly identified families of CWPs. The prediction tool, the database with the proteins identified in the genomes, and supplementary material are available online at .
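As a point of reference for what the motif looks like in sequence data, the sketch below scans a protein sequence for the literal LPXTG pattern (L, P, any residue, T, G) with a regular expression. This is a plain pattern match written for illustration, with a hypothetical function name; it is far simpler than the HMM described in the abstract and does not model the rest of the sorting signal, so it would be expected to report many false positives.

```python
import re

# L, P, any single uppercase residue letter, T, G.
LPXTG = re.compile(r"LP[A-Z]TG")


def find_lpxtg(sequence):
    """Return (0-based start index, matched motif) for each non-overlapping LPXTG hit."""
    return [(m.start(), m.group()) for m in LPXTG.finditer(sequence.upper())]
```

For a toy sequence such as `"MKKLPETGAAA"`, the function reports a single hit, `LPETG`, starting at index 3.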

