scholarly journals Probabilistic Deterministic Finite Automata and Recurrent Networks, Revisited

Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 90
Author(s):  
Sarah E. Marzen ◽  
James P. Crutchfield

Reservoir computers (RCs) and recurrent neural networks (RNNs) can mimic any finite-state automaton in theory, and some workers demonstrated that this can hold in practice. We test the capability of generalized linear models, RCs, and Long Short-Term Memory (LSTM) RNN architectures to predict the stochastic processes generated by a large suite of probabilistic deterministic finite-state automata (PDFA) in the small-data limit according to two metrics: predictive accuracy and distance to a predictive rate-distortion curve. The latter provides a sense of whether or not the RNN is a lossy predictive feature extractor in the information-theoretic sense. PDFAs provide an excellent performance benchmark in that they can be systematically enumerated, the randomness and correlation structure of their generated processes are exactly known, and their optimal memory-limited predictors are easily computed. With less data than is needed to make a good prediction, LSTMs surprisingly lose at predictive accuracy, but win at lossy predictive feature extraction. These results highlight the utility of causal states in understanding the capabilities of RNNs to predict.

2009 ◽  
Vol 30 (5) ◽  
pp. 1343-1369 ◽  
Author(s):  
DANNY CALEGARI ◽  
KOJI FUJIWARA

AbstractA function on a discrete group is weakly combable if its discrete derivative with respect to a combing can be calculated by a finite-state automaton. A weakly combable function is bicombable if it is Lipschitz in both the left- and right-invariant word metrics. Examples of bicombable functions on word-hyperbolic groups include:(1)homomorphisms to ℤ;(2)word length with respect to a finite generating set;(3)most known explicit constructions of quasimorphisms (e.g. the Epstein–Fujiwara counting quasimorphisms).We show that bicombable functions on word-hyperbolic groups satisfy acentral limit theorem: if$\overline {\phi }_n$is the value of ϕ on a random element of word lengthn(in a certain sense), there areEandσfor which there is convergence in the sense of distribution$n^{-1/2}(\overline {\phi }_n - nE) \to N(0,\sigma )$, whereN(0,σ) denotes the normal distribution with standard deviationσ. As a corollary, we show that ifS1andS2are any two finite generating sets forG, there is an algebraic numberλ1,2depending onS1andS2such that almost every word of lengthnin theS1metric has word lengthn⋅λ1,2in theS2metric, with error of size$O(\sqrt {n})$.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Menelaos Pavlou ◽  
Gareth Ambler ◽  
Rumana Z. Omar

Abstract Background Clustered data arise in research when patients are clustered within larger units. Generalised Estimating Equations (GEE) and Generalised Linear Models (GLMM) can be used to provide marginal and cluster-specific inference and predictions, respectively. Methods Confounding by Cluster (CBC) and Informative cluster size (ICS) are two complications that may arise when modelling clustered data. CBC can arise when the distribution of a predictor variable (termed ‘exposure’), varies between clusters causing confounding of the exposure-outcome relationship. ICS means that the cluster size conditional on covariates is not independent of the outcome. In both situations, standard GEE and GLMM may provide biased or misleading inference, and modifications have been proposed. However, both CBC and ICS are routinely overlooked in the context of risk prediction, and their impact on the predictive ability of the models has been little explored. We study the effect of CBC and ICS on the predictive ability of risk models for binary outcomes when GEE and GLMM are used. We examine whether two simple approaches to handle CBC and ICS, which involve adjusting for the cluster mean of the exposure and the cluster size, respectively, can improve the accuracy of predictions. Results Both CBC and ICS can be viewed as violations of the assumptions in the standard GLMM; the random effects are correlated with exposure for CBC and cluster size for ICS. Based on these principles, we simulated data subject to CBC/ICS. The simulation studies suggested that the predictive ability of models derived from using standard GLMM and GEE ignoring CBC/ICS was affected. Marginal predictions were found to be mis-calibrated. Adjusting for the cluster-mean of the exposure or the cluster size improved calibration, discrimination and the overall predictive accuracy of marginal predictions, by explaining part of the between cluster variability. The presence of CBC/ICS did not affect the accuracy of conditional predictions. We illustrate these concepts using real data from a multicentre study with potential CBC. Conclusion Ignoring CBC and ICS when developing prediction models for clustered data can affect the accuracy of marginal predictions. Adjusting for the cluster mean of the exposure or the cluster size can improve the predictive accuracy of marginal predictions.


2007 ◽  
Vol 18 (04) ◽  
pp. 859-871
Author(s):  
MARTIN ŠIMŮNEK ◽  
BOŘIVOJ MELICHAR

A border of a string is a prefix of the string that is simultaneously its suffix. It is one of the basic stringology keystones used as a part of many algorithms in pattern matching, molecular biology, computer-assisted music analysis and others. The paper offers the automata-theoretical description of Iliopoulos's ALL_BORDERS algorithm. The algorithm finds all borders of a string with don't care symbols. We show that ALL_BORDERS algorithm is an implementation of a finite state transducer of specific form. We describe how such a transducer can be constructed and what should be the input string like. The described transducer finds a set of lengths of all borders. Last but not least, we define approximate borders and show how to find all approximate borders of a string when we concern Hamming distance definition. Our solution of this problem is based on transducers again. This allows us to use analogy with automata-based pattern matching methods. Finally we discuss conditions under which the same principle can be used for other distance measures.


2017 ◽  
Author(s):  
Venkatesh Elango ◽  
Aashish N Patel ◽  
Kai J Miller ◽  
Vikash Gilja

AbstractA fundamental challenge in designing brain-computer interfaces (BCIs) is decoding behavior from time-varying neural oscillations. in typical applications, decoders are constructed for individual subjects and with limited data leading to restrictions on the types of models that can be utilized. currently, the best performing decoders are typically linear models capable of utilizing rigid timing constraints with limited training data. Here we demonstrate the use of Long Short-Term Memory (LSTM) networks to take advantage of the temporal information present in sequential neural data collected from subjects implanted with electrocorticographic (ECoG) electrode arrays performing a finger flexion task. our constructed models are capable of achieving accuracies that are comparable to existing techniques while also being robust to variation in sample data size. Moreover, we utilize the LSTM networks and an affine transformation layer to construct a novel architecture for transfer learning. We demonstrate that in scenarios where only the affine transform is learned for a new subject, it is possible to achieve results comparable to existing state-of-the-art techniques. The notable advantage is the increased stability of the model during training on novel subjects. Relaxing the constraint of only training the affine transformation, we establish our model as capable of exceeding performance of current models across all training data sizes. Overall, this work demonstrates that LSTMS are a versatile model that can accurately capture temporal patterns in neural data and can provide a foundation for transfer learning in neural decoding.


2018 ◽  
Vol 2 (1) ◽  
pp. 75-85
Author(s):  
Rouly Doharma Sihite ◽  
Aditya Wikan Mahastama

Transliteration is still a challenge in helping people to read or write from one to another writing systems. Korean transliteration has been a topic of research to automate the conversion between Hangul (Korean writing system) and Latin characters. Previous works have been done in transliterating Hangul to Latin, using statistical approach (72.2% accuracy) and Extended Markov Models (54.9% accuracy). This research focus on transliterating Latin (romanised) Korean words into Hangul, as many learners of Korean began using Latin first. Selected method is modeling the probable vowel and consonant forms and problable vowel and consonant sequences using Finite State Automata to avoid training. These models are then coded into rules which applied and tested to 100 random Korean words. Initial test results only 40% success rate in transliterating due to the nature that consonants have to be labeled as initial or final of a syllable, and some consonants missed the modeled rules. Additional rules are then added to catch-up and merge these consonants into existing proper syllables, which increased the success rate to 92%. This result is analysed further and it is found that certain consonants sequence caused syllabification problem if exist in a certain position. Other additional rules was inserted and yields 99% final success rate which also is the accuracy of transliterating Korean words written in Latin into Hangul characters in compund syllables.


2017 ◽  
Vol 2017 ◽  
pp. 1-33 ◽  
Author(s):  
Weijun Zhu ◽  
Changwei Feng ◽  
Huanmei Wu

As an important complex problem, the temporal logic model checking problem is still far from being fully resolved under the circumstance of DNA computing, especially Computation Tree Logic (CTL), Interval Temporal Logic (ITL), and Projection Temporal Logic (PTL), because there is still a lack of approaches for DNA model checking. To address this challenge, a model checking method is proposed for checking the basic formulas in the above three temporal logic types with DNA molecules. First, one-type single-stranded DNA molecules are employed to encode the Finite State Automaton (FSA) model of the given basic formula so that a sticker automaton is obtained. On the other hand, other single-stranded DNA molecules are employed to encode the given system model so that the input strings of the sticker automaton are obtained. Next, a series of biochemical reactions are conducted between the above two types of single-stranded DNA molecules. It can then be decided whether the system satisfies the formula or not. As a result, we have developed a DNA-based approach for checking all the basic formulas of CTL, ITL, and PTL. The simulated results demonstrate the effectiveness of the new method.


Sign in / Sign up

Export Citation Format

Share Document