Maximum likelihood training of neural networks

Author(s): H. Gish
2010, Vol. 161 (21), pp. 2795-2807
Author(s): Hsu-Kun Wu, Jer-Guang Hsieh, Yih-Lon Lin, Jyh-Horng Jeng

2020, Vol. 37 (12), pp. 3632-3641
Author(s): Alina F Leuchtenberger, Stephen M Crotty, Tamara Drucks, Heiko A Schmidt, Sebastian Burgstaller-Muehlbacher, ...

Abstract Maximum likelihood and maximum parsimony are two key methods for phylogenetic tree reconstruction. Under certain conditions, each method can perform better or worse than the other, resulting in unresolved or disputed phylogenies. We show that a neural network can distinguish between four-taxon alignments that were evolved under conditions susceptible to either long-branch attraction or long-branch repulsion. When likelihood and parsimony methods are discordant, the neural network can provide insight as to which tree reconstruction method is best suited to the alignment. When applied to the contentious case of Strepsiptera evolution, our method shows robust support for the current scientific view, that is, it places Strepsiptera with beetles, distant from flies.
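A minimal sketch of this setup (my own illustration, not the authors' code): a gap-free four-taxon DNA alignment is summarized by its site-pattern frequency spectrum, and a small feed-forward network is trained to separate alignments evolved under long-branch-attraction (Felsenstein-zone) conditions from long-branch-repulsion (Farris-zone) conditions. The network shape and the scikit-learn setup are assumptions, not the paper's architecture.

```python
# Sketch: classify four-taxon alignments by site-pattern frequencies.
from itertools import product

import numpy as np
from sklearn.neural_network import MLPClassifier

BASES = "ACGT"
# Every possible site pattern over 4 taxa: 4**4 = 256 features.
PATTERN_INDEX = {p: i for i, p in enumerate(product(BASES, repeat=4))}

def pattern_spectrum(alignment):
    """Map a 4-row alignment (four equal-length strings) to a normalized
    256-dimensional vector of site-pattern frequencies."""
    counts = np.zeros(len(PATTERN_INDEX))
    for site in zip(*alignment):          # one column of the alignment
        counts[PATTERN_INDEX[site]] += 1
    return counts / counts.sum()

# X_train: spectra of simulated alignments; y_train: 0 for alignments
# evolved under Felsenstein-zone conditions, 1 for Farris-zone conditions.
# clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500)
# clf.fit(X_train, y_train)
# clf.predict(pattern_spectrum(alignment).reshape(1, -1))
```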


1999, Vol. 11 (2), pp. 541-563
Author(s): Anders Krogh, Søren Kamaric Riis

A general framework for hybrids of hidden Markov models (HMMs) and neural networks (NNs) called hidden neural networks (HNNs) is described. The article begins by reviewing standard HMMs and estimation by conditional maximum likelihood, which is used by the HNN. In the HNN, the usual HMM probability parameters are replaced by the outputs of state-specific neural networks. As opposed to many other hybrids, the HNN is normalized globally and therefore has a valid probabilistic interpretation. All parameters in the HNN are estimated simultaneously according to the discriminative conditional maximum likelihood criterion. The HNN can be viewed as an undirected probabilistic independence network (a graphical model), where the neural networks provide a compact representation of the clique functions. An evaluation of the HNN on the task of recognizing broad phoneme classes in the TIMIT database shows clear performance gains compared to standard HMMs tested on the same task.
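To make the construction concrete, here is a minimal sketch, in my own notation rather than the authors', of the two ingredients described above: a state-specific network supplies an unnormalized match score in place of each state's emission probabilities, the model is normalized globally by the forward algorithm, and training minimizes the conditional maximum likelihood loss, -log P(states | observations). Structurally this is a neural conditional random field; the architecture and dimensions are assumptions.

```python
# Sketch of the HNN idea under the stated assumptions: per-state networks
# replace emission probabilities, normalization is global over all state
# paths, and training uses the discriminative CML criterion.
import torch
import torch.nn as nn

class HiddenNeuralNetwork(nn.Module):
    def __init__(self, n_states, obs_dim, hidden=32):
        super().__init__()
        # One small network per state yields that state's (unnormalized)
        # match score for each observation frame.
        self.state_nets = nn.ModuleList(
            nn.Sequential(nn.Linear(obs_dim, hidden), nn.Tanh(),
                          nn.Linear(hidden, 1))
            for _ in range(n_states))
        self.trans = nn.Parameter(torch.zeros(n_states, n_states))

    def frame_scores(self, obs):          # obs: (T, obs_dim)
        return torch.cat([net(obs) for net in self.state_nets], dim=1)

    def log_partition(self, emit):        # forward algorithm in log space
        alpha = emit[0]
        for t in range(1, emit.shape[0]):
            alpha = emit[t] + torch.logsumexp(
                alpha.unsqueeze(1) + self.trans, dim=0)
        return torch.logsumexp(alpha, dim=0)

    def cml_loss(self, obs, states):      # -log P(states | obs)
        emit = self.frame_scores(obs)
        T = obs.shape[0]
        path_score = (emit[torch.arange(T), states].sum()
                      + self.trans[states[:-1], states[1:]].sum())
        return self.log_partition(emit) - path_score
```

Because the normalization runs over all state paths rather than locally per state, the probabilistic interpretation survives even though the network outputs are unnormalized scores, which is the property the abstract emphasizes.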


2002
Author(s): Lisa M. Kinnard, Shih-Chung B. Lo, Paul C. Wang, Matthew T. Freedman, Mohammed F. Chouikha

2019, Vol. 490 (1), pp. 371-384
Author(s): Aristide Doussot, Evan Eames, Benoit Semelin

Abstract Within the next few years, the Square Kilometre Array (SKA) or one of its pathfinders will hopefully detect the 21-cm signal fluctuations from the Epoch of Reionization (EoR). The goal will then be to accurately constrain the underlying astrophysical parameters, which is currently done mainly with Bayesian inference. Recently, neural networks have been trained to perform inverse modelling and, ideally, predict the maximum-likelihood values of the model parameters. We build on this approach, improving the accuracy of the predictions with several supervised learning methods: neural networks, kernel regressions, and ridge regressions. Based on a large training set of 21-cm power spectra, we compare the performance of these methods. When using a noise-free signal generated by the model itself as input, we improve on previous neural network accuracy by one order of magnitude and, using a local ridge kernel regression, we gain another factor of a few. We thus reach an accuracy on the reconstructed maximum-likelihood parameter values of a few per cent of the 1σ confidence level due to SKA thermal noise (as estimated with Bayesian inference). For an input signal affected by SKA-like thermal noise but constrained to yield the same maximum-likelihood parameter values as the noise-free signal, our neural network exhibits an error within half of the 1σ confidence level due to the SKA thermal noise. This accuracy improves to 10 per cent of the 1σ level when using the local ridge kernel. We are thus reaching a performance level where supervised learning methods are a viable alternative for determining the maximum-likelihood parameter values.
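As an illustration of this inverse-modelling setup, the sketch below trains a kernel ridge regression from flattened 21-cm power spectra to parameter values. The random stand-in data, the RBF kernel, and the scikit-learn pipeline are all assumptions; they take the place of the paper's simulated training set and its local ridge kernel method.

```python
# Hypothetical sketch of supervised inverse modelling: map 21-cm power
# spectra to astrophysical parameter values with kernel ridge regression.
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.lognormal(size=(500, 64))   # stand-in for flattened P(k, z) spectra
y = rng.uniform(size=(500, 3))      # stand-in for 3 astro parameters

# Log-spectra tame the dynamic range before standardization; the kernel
# hyperparameters are selected by cross-validated grid search.
model = make_pipeline(
    StandardScaler(),
    GridSearchCV(KernelRidge(kernel="rbf"),
                 {"alpha": [1e-3, 1e-2, 1e-1],
                  "gamma": [1e-3, 1e-2, 1e-1]}))
model.fit(np.log(X), y)
predicted_params = model.predict(np.log(X[:5]))
```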


Author(s): P. A. Gutiérrez, C. Hervás, F. J. Martínez-Estudillo, M. Carbonero

Multi-class pattern recognition has a wide range of applications, including handwritten digit recognition (Chiang, 1998), speech tagging and recognition (Athanaselis, Bakamidis, Dologlou, Cowie, Douglas-Cowie & Cox, 2005), bioinformatics (Mahony, Benos, Smith & Golden, 2006) and text categorization (Massey, 2003). This chapter presents a comprehensive and competitive study of multi-class neural learning that combines different elements, such as multilogistic regression, neural networks and evolutionary algorithms. The logistic regression (LR) model has been used in statistics for many years and has recently been the object of extensive study in the machine learning community. Although logistic regression is a simple and useful procedure, it poses problems when applied to real classification problems, where we frequently cannot make the stringent assumption of additive and purely linear effects of the covariates. One technique to overcome these difficulties is to augment or replace the input vector with new variables, basis functions, which are transformations of the input variables, and then to use linear models in this new space of derived input features. Methods such as sigmoidal feed-forward neural networks (Bishop, 1995), generalized additive models (Hastie & Tibshirani, 1990), and PolyMARS (Kooperberg, Bose & Stone, 1997), a hybrid of Multivariate Adaptive Regression Splines (MARS) (Friedman, 1991) specifically designed to handle classification problems, can all be seen as different nonlinear basis function models. The major drawback of these approaches is deciding on the type and the optimal number of the corresponding basis functions.

Logistic regression models are usually fit by maximum likelihood, with the Newton-Raphson algorithm as the traditional way to compute the maximum likelihood estimates of the parameters; a sketch is given below. The algorithm typically converges, since the log-likelihood is concave, but its computation becomes prohibitive when the number of variables is large. Product unit neural networks (PUNNs), introduced by Durbin and Rumelhart (1989), are an alternative to standard sigmoidal neural networks and are based on multiplicative nodes instead of additive ones.
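For concreteness, here is a minimal sketch of the Newton-Raphson iteration, iteratively reweighted least squares, for maximum likelihood fitting of a binary logistic regression, together with a product unit for contrast with an additive node; the function names are mine, not the chapter's.

```python
# Newton-Raphson (IRLS) for maximum likelihood binary logistic regression;
# an illustrative sketch, not code from the chapter.
import numpy as np

def fit_logistic_irls(X, y, n_iter=25, tol=1e-8):
    """X: (n, d) design matrix (include a ones column for the intercept);
    y: (n,) labels in {0, 1}. Returns the ML coefficient vector."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))   # fitted probabilities
        w = p * (1.0 - p)                     # Newton weights
        grad = X.T @ (y - p)                  # score vector
        hess = X.T @ (X * w[:, None])         # observed information
        step = np.linalg.solve(hess, grad)    # one Newton step
        beta += step
        if np.abs(step).max() < tol:          # concave log-likelihood, so
            break                             # convergence is typical
    return beta

def product_unit(x, w):
    """A multiplicative node (Durbin & Rumelhart, 1989): prod_i x_i**w_i,
    in contrast to an additive node's sum_i w_i * x_i. Assumes x > 0."""
    return np.prod(np.power(x, w))
```

Each Newton step solves a d × d linear system assembled from the full design matrix, which is exactly the cost that becomes prohibitive when the number of variables is large.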

