A Bound on Modeling Error in Observable Operator Models and an Associated Learning Algorithm

2009 ◽  
Vol 21 (9) ◽  
pp. 2687-2712 ◽  
Author(s):  
Ming-Jie Zhao ◽  
Herbert Jaeger ◽  
Michael Thon

Observable operator models (OOMs) generalize hidden Markov models (HMMs) and can be represented in a structurally similar matrix formalism. The mathematical theory of OOMs gives rise to a family of constructive, fast, and asymptotically correct learning algorithms, whose statistical efficiency, however, depends crucially on the optimization of two auxiliary transformation matrices. This optimization task is nontrivial; indeed, even formulating computationally accessible optimality criteria is not easy. Here we derive how a bound on the modeling error of an OOM can be expressed in terms of these auxiliary matrices, which in turn yields an optimization procedure for them and finally affords us a complete learning algorithm: the error-controlling algorithm. Models learned by this algorithm have an assured error bound on their parameters. The performance of this algorithm is illuminated by comparisons with two types of HMMs trained by the expectation-maximization algorithm, with the efficiency-sharpening algorithm, another recently found learning algorithm for OOMs, and with predictive state representations (Littman & Sutton, 2001) trained by methods representing the state of the art in that field.
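As background for the OOM abstracts in this listing: an OOM assigns a probability to a sequence via products of per-symbol operator matrices, P(a_1…a_n) = 1ᵀ τ_{a_n}…τ_{a_1} w_0. The following is a minimal sketch, assuming a toy OOM constructed from a 2-state HMM; all matrix values are illustrative, not taken from either paper.

```python
import numpy as np
from itertools import product

# Toy OOM obtained from a 2-state HMM (illustrative numbers).
M = np.array([[0.7, 0.3],    # transition matrix, rows sum to 1
              [0.2, 0.8]])
O = np.array([[0.9, 0.1],    # emission matrix, rows sum to 1
              [0.4, 0.6]])
w0 = np.array([0.5, 0.5])    # initial state distribution

# Observable operators: tau_a = M^T diag(O[:, a]). Then sum_a tau_a = M^T
# and 1^T sum_a tau_a = 1^T, so sequence probabilities are consistent.
tau = [M.T @ np.diag(O[:, a]) for a in range(2)]
ones = np.ones(2)

def seq_prob(seq):
    """P(a_1..a_n) = 1^T tau_{a_n} ... tau_{a_1} w_0."""
    w = w0
    for a in seq:
        w = tau[a] @ w
    return ones @ w

# Sanity check: probabilities of all length-3 sequences sum to 1.
total = sum(seq_prob(s) for s in product(range(2), repeat=3))
```

Learning an OOM amounts to estimating the operators τ_a from data; the papers above concern how the auxiliary transformation matrices in that estimation are chosen.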

2009 ◽  
Vol 21 (12) ◽  
pp. 3460-3486
Author(s):  
Ming-Jie Zhao ◽  
Herbert Jaeger ◽  
Michael Thon

Observable operator models (OOMs) are a class of models for stochastic processes that properly subsumes the class that can be modeled by finite-dimensional hidden Markov models (HMMs). One of the main advantages of OOMs over HMMs is that they admit asymptotically correct learning algorithms. A series of learning algorithms has been developed, with increasing computational and statistical efficiency, whose recent culmination was the error-controlling (EC) algorithm developed by the first author. The EC algorithm is an iterative, asymptotically correct algorithm that yields (and minimizes) an assured upper bound on the modeling error. It runs at least one order of magnitude faster than EM-based HMM learning algorithms and yields significantly more accurate models than the latter. Here we present a significant improvement of the EC algorithm: the constructive error-controlling (CEC) algorithm. CEC inherits from EC the main idea of minimizing an upper bound on the modeling error but is constructive where EC needs iterations. As a consequence, we obtain further gains in learning speed without loss in modeling accuracy.


2020 ◽  
Vol 36 (9) ◽  
pp. 2690-2696
Author(s):  
Jarkko Toivonen ◽  
Pratyush K Das ◽  
Jussi Taipale ◽  
Esko Ukkonen

Abstract Motivation Position-specific probability matrices (PPMs, also called position-specific weight matrices) have been the dominating model for transcription factor (TF)-binding motifs in DNA. There is, however, increasing recent evidence of better performance of higher order models such as Markov models of order one, also called adjacent dinucleotide matrices (ADMs). ADMs can model dependencies between adjacent nucleotides, unlike PPMs. A modeling technique and software tool that would estimate such models simultaneously both for monomers and their dimers have been missing. Results We present an ADM-based mixture model for monomeric and dimeric TF-binding motifs and an expectation maximization algorithm MODER2 for learning such models from training data and seeds. The model is a mixture that includes monomers and dimers, built from the monomers, with a description of the dimeric structure (spacing, orientation). The technique is modular, meaning that the co-operative effect of dimerization is made explicit by evaluating the difference between expected and observed models. The model is validated using HT-SELEX and generated datasets, and by comparing to some earlier PPM and ADM techniques. The ADM models explain data slightly better than PPM models for 314 tested TFs (or their DNA-binding domains) from four families (bHLH, bZIP, ETS and Homeodomain), the ADM mixture models by MODER2 being the best on average. Availability and implementation Software implementation is available from https://github.com/jttoivon/moder2. Supplementary information Supplementary data are available at Bioinformatics online.
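To illustrate the PPM-versus-ADM distinction described above: a PPM treats each motif position independently, while an ADM conditions each nucleotide on its left neighbour. The sketch below scores toy binding sites under both models; all probability tables are randomly generated placeholders, not MODER2 output.

```python
import numpy as np
from itertools import product

BASES = "ACGT"
IDX = {b: i for i, b in enumerate(BASES)}
L = 4  # toy motif length

rng = np.random.default_rng(0)

# PPM: an independent categorical distribution per position (L x 4).
ppm = rng.dirichlet(np.ones(4), size=L)

# ADM: an initial distribution plus one 4x4 transition matrix per pair of
# adjacent positions, so dependencies between neighbouring nucleotides
# are retained.
adm_init = rng.dirichlet(np.ones(4))
adm_trans = rng.dirichlet(np.ones(4), size=(L - 1, 4))  # (pos, from, to)

def ppm_prob(site):
    p = 1.0
    for i, b in enumerate(site):
        p *= ppm[i, IDX[b]]
    return p

def adm_prob(site):
    p = adm_init[IDX[site[0]]]
    for i in range(L - 1):
        p *= adm_trans[i, IDX[site[i]], IDX[site[i + 1]]]
    return p

# Both models define proper distributions over all 4^L possible sites.
sites = ["".join(s) for s in product(BASES, repeat=L)]
ppm_total = sum(ppm_prob(s) for s in sites)
adm_total = sum(adm_prob(s) for s in sites)
```

The ADM has 4 + (L−1)·16 free parameters versus the PPM's L·4, which is the extra capacity that lets it capture adjacent-dinucleotide dependencies.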


2019 ◽  
pp. 109-115
Author(s):  
Didmanidze Ibraim ◽  
Donadze Mikheil

The article deals with the selection of the elements of an electronic circuit of a given configuration such that the requirements of the technical specification are satisfied while the selected optimality criteria reach their extreme values. The given task has been solved by the method of single-criterion optimization, in particular the center-of-gravity method. To formalize the given circuit, we compiled a mathematical optimization model that incorporates the requirements of the technical specification. The optimal design task for the presented electronic circuit was reduced to a multi-criteria optimization problem. The computational experiments produced a set of Pareto-optimal solutions, from which a compromise solution was selected that corresponds to the minimum power required by the circuit. Using the optimal resistor values, we conducted a computer analysis of the transient response of the given electronic circuit with the program Electronics Workbench.


2021 ◽  
Author(s):  
Bennasr Hichem ◽  
M’Sahli Faouzi

The multimodel approach is a research area developed for the modeling, analysis and control of complex systems. This approach presupposes the definition of a set of simple models forming a model library. The number of models and the computation of their validities are the main issues to consider in the multimodel approach. In this chapter, a new theoretical technique is developed for this purpose, based on a combination of probabilistic approaches with different objective functions. First, the model base is constructed using neural networks and fuzzy logic: the number of models is determined with the frequency-sensitive competitive learning (FSCL) algorithm, and the operating clusters are identified with the fuzzy K-means algorithm. Second, the model base is reduced: two types of validity computation for each model, together with a stochastic SVD technique, are used to evaluate each model's contribution and permit the reduction of the number of models in the base. The combination of the FSCL algorithm, K-means and the SVD technique discussed in this chapter has the potential to be applied to complex nonlinear systems with rapid dynamics. The recommended approach is implemented, evaluated and compared on an academic benchmark and a semi-batch reactor; the achieved reduction of the model base is substantial and yields good modeling performance.
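The FSCL step mentioned above can be sketched as follows. This is a generic, textbook-style frequency-sensitive competitive learning loop, not the authors' implementation; the data, initial prototypes, and hyperparameters are all illustrative assumptions.

```python
import numpy as np

def fscl(data, protos_init, lr=0.05, epochs=20, seed=0):
    """Frequency-sensitive competitive learning, minimal sketch.

    The winner is chosen by the frequency-penalised distance
    wins[i] * ||x - w_i||, so rarely winning prototypes still get a
    chance to move; only the winner is updated."""
    rng = np.random.default_rng(seed)
    protos = protos_init.astype(float).copy()
    wins = np.ones(len(protos))
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            d = np.linalg.norm(protos - x, axis=1)
            j = np.argmin(wins * d)             # frequency-sensitive winner
            protos[j] += lr * (x - protos[j])   # move winner toward sample
            wins[j] += 1
    return protos

# Two well-separated toy clusters; one prototype should settle near each.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.1, (50, 2)),
                  rng.normal(5, 0.1, (50, 2))])
protos = fscl(data, protos_init=np.array([[1.0, 1.0], [4.0, 4.0]]))
```

In the chapter's setting, the number of winning prototypes that survive such a run would indicate how many local models the library needs.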


2020 ◽  
Vol 10 (3) ◽  
pp. 794 ◽  
Author(s):  
David Gonzalez-Cuautle ◽  
Aldo Hernandez-Suarez ◽  
Gabriel Sanchez-Perez ◽  
Linda Karina Toscano-Medina ◽  
Jose Portillo-Portillo ◽  
...  

Presently, security is a hot research topic due to its impact on daily information infrastructure. Machine-learning solutions have been improving classical detection practices, but detection tasks employ irregular amounts of data since the number of instances that represent one or several malicious samples can significantly vary. In highly unbalanced data, classification models regularly have high precision with respect to the majority class, while minority classes are considered noise due to the lack of information that they provide. Well-known datasets used for malware-based analyses like botnet attacks and Intrusion Detection Systems (IDS) mainly comprise logs, records, or network-traffic captures that do not provide an ideal source of evidence as a result of obtaining raw data. As an example, the numbers of abnormal and constant connections generated by either botnets or intruders within a network are considerably smaller than those from benign applications. In most cases, inadequate dataset design may lead to the downgrade of a learning algorithm, resulting in overfitting and poor classification rates. To address these problems, we propose a resampling method, the Synthetic Minority Oversampling Technique (SMOTE), with a grid-search algorithm optimization procedure. This work demonstrates classification-result improvements for botnet and IDS datasets by merging synthetically generated balanced data and tuning different supervised-learning algorithms.
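The two ingredients of the proposed pipeline, SMOTE resampling followed by grid search, can be sketched as below. The SMOTE here is a hand-rolled simplification (production work would typically use the imbalanced-learn library), and the toy data, classifier, and parameter grid are illustrative assumptions, not the paper's setup.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

def smote(X_min, n_new, k=5, seed=0):
    """Hand-rolled SMOTE sketch: synthesise minority samples by
    interpolating between a minority point and one of its k nearest
    minority neighbours."""
    rng = np.random.default_rng(seed)
    synth = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]       # nearest neighbours, self excluded
        j = rng.choice(nbrs)
        gap = rng.random()                  # random point on the segment
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synth)

# Unbalanced toy data: 200 majority vs 20 minority samples.
rng = np.random.default_rng(0)
X_maj = rng.normal(0, 1, (200, 2))
X_min = rng.normal(2, 1, (20, 2))
X_new = smote(X_min, n_new=180)           # rebalance to 200 vs 200

X = np.vstack([X_maj, X_min, X_new])
y = np.array([0] * 200 + [1] * 200)

# Grid search over classifier hyperparameters on the balanced data.
grid = GridSearchCV(DecisionTreeClassifier(random_state=0),
                    {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5]},
                    cv=3)
grid.fit(X, y)
```

Because synthetic points are convex combinations of minority samples, they stay inside the minority class's region rather than duplicating existing instances, which is what distinguishes SMOTE from plain oversampling.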


Author(s):  
Marc Jüdes ◽  
George Tsatsaronis

The design optimization of complex energy conversion systems requires the consideration of typical operating conditions. Due to the complexity of the optimization task, conventional optimization methods normally take into account only one operating point, which is, in the majority of cases, the full-load case. To guarantee good operation at partial loads, additional operating conditions have to be taken into account during the optimization procedure. The optimization task described in this article considers altogether four different operating points of a cogeneration plant. Modelling requirements, such as the equations that describe the partial-load behavior of individual components, are described, as well as the problems that occur when nonlinear and nonconvex equations are used. For the solution of the resulting nonconvex mixed-integer nonlinear programming (MINLP) problem, the solver LaGO is used, which requires that the optimization problem be formulated in GAMS. The results of the conventional optimization approach are compared to the results of the new method. It is shown that without consideration of different operating points, flexible operation of the plant may be impossible.


1999 ◽  
Vol 11 (5) ◽  
pp. 1069-1077 ◽  
Author(s):  
Danilo P. Mandic ◽  
Jonathon A. Chambers

A relationship between the learning rate η in the learning algorithm, and the slope β in the nonlinear activation function, for a class of recurrent neural networks (RNNs) trained by the real-time recurrent learning algorithm is provided. It is shown that an arbitrary RNN can be obtained via the referent RNN, with some deterministic rules imposed on its weights and the learning rate. Such relationships reduce the number of degrees of freedom when solving the nonlinear optimization task of finding the optimal RNN parameters.
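The η–β relationship can be checked numerically in the simplest possible case, a single logistic neuron rather than a full RNN: a network with activation slope β and learning rate η takes the same gradient step as the referent slope-1 network whose weights are scaled to βw and whose rate is ηβ². All numbers below are arbitrary illustrative values.

```python
import numpy as np

def sigmoid(x, beta=1.0):
    return 1.0 / (1.0 + np.exp(-beta * x))

def step(w, x, t, eta, beta):
    """One gradient step on a single logistic neuron with squared error
    L = (y - t)^2 / 2; note sigma'_beta(s) = beta * y * (1 - y)."""
    y = sigmoid(w @ x, beta)
    grad = (y - t) * beta * y * (1 - y) * x
    return w - eta * grad

beta, eta = 2.5, 0.1
x = np.array([0.3, -0.7])
t = 0.4
w = np.array([0.5, -0.2])

w_new = step(w, x, t, eta, beta)                  # slope beta, rate eta
v_new = step(beta * w, x, t, eta * beta**2, 1.0)  # referent: slope 1,
                                                  # weights beta*w, rate eta*beta^2
# The referent step reproduces the original: v_new == beta * w_new.
```

The equivalence follows because scaling the weights by β preserves the neuron's output, while the gradient picks up a factor 1/β, which the rate ηβ² exactly compensates; the paper extends this kind of argument to full RNNs trained by real-time recurrent learning.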


2017 ◽  
Vol 2017 ◽  
pp. 1-26 ◽  
Author(s):  
Tingting Liu ◽  
Jan Lemeire

The predominant learning algorithms for hidden Markov models (HMMs) are local search heuristics, of which the Baum-Welch (BW) algorithm is the most widely used. It is an iterative learning procedure that starts with a predefined size of the state space and randomly chosen initial parameters. However, badly chosen initial parameters carry the risk of falling into a local optimum and of slow convergence. To overcome these drawbacks, we propose a more suitable model initialization approach, a Segmentation-Clustering and Transient analysis (SCT) framework, to estimate the number of states and the model parameters directly from the input data. Based on an analysis of the information flow through HMMs, we demystify the structure of models and show that high-impact states are directly identifiable from the properties of the observation sequences. States having a high impact on the log-likelihood make HMMs highly specific. Experimental results show that even though the identification accuracy drops to 87.9% when random models are considered, the SCT method is around 50 to 260 times faster than the BW algorithm, with 100% correct identification for highly specific models whose specificity is greater than 0.06.
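For context on the baseline being improved upon, one Baum-Welch re-estimation step for a discrete HMM can be sketched in a few lines. This is the standard unscaled textbook formulation (suitable only for short sequences; real implementations rescale or work in log space), not the SCT method of the paper, and all model numbers are illustrative.

```python
import numpy as np

def forward(A, B, pi, obs):
    """alpha[t, i] = P(o_1..o_t, q_t = i); unscaled."""
    alpha = np.zeros((len(obs), len(pi)))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, len(obs)):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha

def backward(A, B, obs, N):
    """beta[t, i] = P(o_{t+1}..o_T | q_t = i); unscaled."""
    beta = np.zeros((len(obs), N))
    beta[-1] = 1.0
    for t in range(len(obs) - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

def baum_welch_step(A, B, pi, obs):
    """One BW (EM) re-estimation; the data likelihood never decreases."""
    N = len(pi)
    al = forward(A, B, pi, obs)
    be = backward(A, B, obs, N)
    lik = al[-1].sum()                      # P(obs | current model)
    gamma = al * be / lik                   # P(q_t = i | obs)
    xi = np.zeros((N, N))                   # expected transition counts
    for t in range(len(obs) - 1):
        xi += al[t][:, None] * A * (B[:, obs[t + 1]] * be[t + 1])[None, :] / lik
    A_new = xi / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.array([gamma[np.array(obs) == k].sum(axis=0)
                      for k in range(B.shape[1])]).T
    B_new /= gamma.sum(axis=0)[:, None]
    return A_new, B_new, gamma[0], lik

A0 = np.array([[0.6, 0.4], [0.3, 0.7]])
B0 = np.array([[0.8, 0.2], [0.1, 0.9]])
pi0 = np.array([0.5, 0.5])
obs = [0, 1, 0, 0, 1, 1, 0]

A1, B1, pi1, l0 = baum_welch_step(A0, B0, pi0, obs)
A2, B2, pi2, l1 = baum_welch_step(A1, B1, pi1, obs)  # l1 >= l0 (EM guarantee)
```

Each such iteration requires a full forward-backward pass, which is the per-iteration cost the SCT initialization is designed to avoid paying many times over.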


2018 ◽  
Author(s):  
Matthias Heck ◽  
Alec van Herwijnen ◽  
Conny Hammer ◽  
Manuel Hobiger ◽  
Jürg Schweizer ◽  
...  

Abstract. We use a seismic monitoring system to automatically determine the avalanche activity at a remote field site near Davos, Switzerland. Using a recently developed approach based on hidden Markov models (HMMs), a machine learning algorithm, we were able to automatically identify avalanches in continuous seismic data from as little as one single training event. Furthermore, we implemented an operational method to provide near real-time classification results. For the 2016–2017 winter period, 117 events were automatically identified. Falsely classified events such as airplanes and local earthquakes were filtered using a new approach comprising two additional classification steps. In a first step, we implemented a second HMM-based classifier at a second array 14 km away to automatically identify airplanes and earthquakes. By cross-checking the results of both arrays, we reduced the number of false classifications by about 50%. In a second step, we used multiple signal classification (MUSIC), an array-processing technique, to determine the direction of the source. Although avalanche events have a moving-source character, only small changes of the source direction are common for snow avalanches, whereas false classifications showed large changes in the source direction and were therefore dismissed. Of the 117 detected events during the 4-month period, we identified 90 false classifications based on these two additional steps. The avalanche activity obtained from the remaining 27 avalanche events was in line with visual observations performed in the area of Davos.

