A STUDY OF EARLY STOPPING AND MODEL SELECTION APPLIED TO THE PAPERMAKING INDUSTRY

2000 · Vol 10 (01) · pp. 9-18
Author(s): Peter J. Edwards, Alan F. Murray

This paper addresses the issues of neural network model development and maintenance in the context of a complex task taken from the papermaking industry. In particular, it describes a comparison study of early stopping techniques and model selection, both of which aim to optimise neural network models for generalisation performance. The results presented here show that early stopping via a Bayesian model evidence measure is a viable way of optimising performance while also making maximum use of all the data. In addition, they show that ten-fold cross-validation performs well both as a model selector and as an estimator of prediction accuracy. These results are important in that they show how neural network models may be optimally trained and selected for highly complex industrial tasks where the data are noisy and limited in number.
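The ten-fold cross-validation model selector described above can be illustrated with a minimal, self-contained sketch (synthetic data and two toy candidate "models" stand in for the paper's neural networks; nothing here comes from the paper itself):

```python
# Minimal sketch of k-fold cross-validation as a model selector.
# The data and candidate models are illustrative stand-ins.
import random

random.seed(0)

# Synthetic noisy data with a linear trend.
X = [i / 10.0 for i in range(100)]
Y = [2.0 * x + 1.0 + random.gauss(0, 0.5) for x in X]

def fit_mean(xs, ys):
    """Candidate 1: constant (mean) predictor."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate 2: least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return lambda x, a=my - b * mx, b=b: a + b * x

def cv_mse(fit, xs, ys, k=10):
    """Average held-out MSE over k folds."""
    idx = list(range(len(xs)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total = 0.0
    for fold in folds:
        hold = set(fold)
        tr_x = [xs[i] for i in idx if i not in hold]
        tr_y = [ys[i] for i in idx if i not in hold]
        model = fit(tr_x, tr_y)
        total += sum((model(xs[i]) - ys[i]) ** 2 for i in fold) / len(fold)
    return total / k

scores = {name: cv_mse(f, X, Y) for name, f in [("mean", fit_mean), ("line", fit_line)]}
best = min(scores, key=scores.get)
print(best)  # the linear model should win on this strongly linear data
```

The same loop doubles as an accuracy estimator: the winning model's average held-out MSE is the cross-validated estimate of its prediction error.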

Electronics · 2021 · Vol 10 (16) · pp. 1973
Author(s): Daniel S. Soper

Selecting a final machine learning (ML) model typically occurs after a process of hyperparameter optimization in which many candidate models with varying structural properties and algorithmic settings are evaluated and compared. Evaluating each candidate model commonly relies on k-fold cross validation, wherein the data are randomly subdivided into k folds, with each fold being iteratively used as a validation set for a model that has been trained using the remaining folds. While many research studies have sought to accelerate ML model selection by applying metaheuristic and other search methods to the hyperparameter space, no consideration has been given to the k-fold cross validation process itself as a means of rapidly identifying the best-performing model. The current study rectifies this oversight by introducing a greedy k-fold cross validation method and demonstrating that greedy k-fold cross validation can vastly reduce the average time required to identify the best-performing model when given a fixed computational budget and a set of candidate models. This improved search time is shown to hold across a variety of ML algorithms and real-world datasets. For scenarios without a computational budget, this paper also introduces an early stopping algorithm based on the greedy cross validation method. The greedy early stopping method is shown to outperform a competing, state-of-the-art early stopping method both in terms of search time and the quality of the ML models selected by the algorithm. Since hyperparameter optimization is among the most time-consuming, computationally intensive, and monetarily expensive tasks in the broader process of developing ML-based solutions, the ability to rapidly identify optimal machine learning models using greedy cross validation has obvious and substantial benefits to organizations and researchers alike.
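The core idea of the greedy scheme described above can be sketched as follows. This is a simplified reading, not the paper's implementation: each candidate's true validation error is a hidden constant, a single fold evaluation returns a noisy observation of it, and the fixed budget of fold evaluations is greedily spent on whichever candidate currently looks best. All names and numbers are illustrative:

```python
# Sketch of greedy k-fold cross validation under a fixed budget of
# fold evaluations. Candidates and error values are hypothetical.
import random

random.seed(1)

K = 10  # folds per candidate
true_err = {"m1": 0.30, "m2": 0.22, "m3": 0.40, "m4": 0.25}

def evaluate_fold(name, fold):
    """Stand-in for training on K-1 folds and scoring the held-out fold."""
    return true_err[name] + random.gauss(0, 0.005)

def greedy_kfold(candidates, budget):
    """Spend `budget` fold evaluations, always extending the candidate
    whose running mean error is currently lowest."""
    scores = {c: [] for c in candidates}
    for c in candidates:              # prime: one fold per candidate
        scores[c].append(evaluate_fold(c, 0))
        budget -= 1
    while budget > 0:
        open_c = [c for c in candidates if len(scores[c]) < K]
        if not open_c:
            break
        best = min(open_c, key=lambda c: sum(scores[c]) / len(scores[c]))
        scores[best].append(evaluate_fold(best, len(scores[best])))
        budget -= 1
    winner = min(candidates, key=lambda c: sum(scores[c]) / len(scores[c]))
    return winner, scores

winner, scores = greedy_kfold(list(true_err), budget=20)
print(winner)
```

The contrast with standard k-fold CV is that the budget of 20 fold evaluations is concentrated on the promising candidates instead of being split evenly (which would give each of the four candidates only five folds).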


Author(s): Mritunjay Dwivedi, Hosahalli S. Ramaswamy

Artificial neural network (ANN) models were developed for the overall heat transfer coefficient (U) and the fluid-to-particle heat transfer coefficient (hfp) in canned Newtonian fluids with and without particles, and the model performances were compared with dimensionless correlations for both free and fixed axial modes of agitation. Part of the experimental data was used for training and testing, and the remainder for cross validation. The average errors (RMS) associated with predicted hfp and U values in fixed and free axial mode were a function of the ANN variables: number of hidden layers, number of neurons in each hidden layer, learning rule, transfer function and number of learning runs. RMS values were not significantly different for one to three hidden layers, and the RMS was lowest, with a high R2 value, for one hidden layer with 8 neurons. The combination of the Delta rule and the TanH transfer function also gave the lowest RMS and the highest R2. The highest R2 was achieved with 85% of the data used for training and testing and 15% for cross validation in both modes of rotation, so this split was used for developing the neural network models. Mean relative errors (MRE) for the ANN models were much lower than those of the dimensionless correlations: 75-78% lower for hfp and 66% lower for U in fixed and free axial mode with particulates in liquid. Without particulates, the MRE for the ANN models was 37% lower in end-over-end mode and 76% lower in free axial mode than for the dimensionless correlations. Overall, the ANN models yielded much higher R2 values than the dimensionless correlations. The ANN coefficient matrix is included so that the models can be implemented in a spreadsheet.
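The mean relative error (MRE) metric used above for the ANN-versus-correlation comparison is straightforward to compute; a small sketch with hypothetical hfp values (the numbers are invented for illustration, not taken from the study):

```python
# Mean relative error as commonly defined: mean of |pred - actual| / actual,
# expressed as a percentage. Data below are hypothetical.
def mean_relative_error(pred, actual):
    return 100.0 * sum(abs(p - a) / a for p, a in zip(pred, actual)) / len(actual)

actual    = [220.0, 240.0, 260.0]   # hypothetical measured hfp values
ann_pred  = [218.0, 243.0, 257.0]   # hypothetical ANN predictions
corr_pred = [200.0, 270.0, 230.0]   # hypothetical correlation predictions

mre_ann = mean_relative_error(ann_pred, actual)
mre_corr = mean_relative_error(corr_pred, actual)
print(mre_ann, mre_corr)
```

A lower MRE means the predictions sit closer to the measured values in proportional terms, which is why the study reports the ANN improvement as a percentage reduction in MRE.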


2008 · Vol 20 (2) · pp. 504-522
Author(s): Tatiana Miazhynskaia, Sylvia Frühwirth-Schnatter, Georg Dorffner

We use neural networks (NN) as a tool for nonlinear autoregression to predict the second moment of the conditional density of return series. The NN models are compared to the popular econometric GARCH(1,1) model. We estimate the models in a Bayesian framework using Markov chain Monte Carlo posterior simulations. The interlinked aspects of the proposed Bayesian methodology are identification of NN hidden units and treatment of NN complexity based on model evidence. The empirical study includes the application of the designed strategy to market data, where we found strong support for a nonlinear multilayer perceptron model with two hidden units.
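The GARCH(1,1) baseline against which the NN models are compared has a simple variance recursion, sigma^2_t = omega + alpha * r^2_{t-1} + beta * sigma^2_{t-1}. A minimal sketch (parameter values are illustrative, not estimated from the study's market data):

```python
# GARCH(1,1) one-step-ahead conditional variance recursion:
#   sigma^2_t = omega + alpha * r_{t-1}^2 + beta * sigma^2_{t-1}
# Parameters below are illustrative placeholders.
def garch_variance(returns, omega=0.00001, alpha=0.08, beta=0.9):
    # Initialise at the unconditional variance omega / (1 - alpha - beta).
    var = [omega / (1 - alpha - beta)]
    for r in returns[:-1]:
        var.append(omega + alpha * r * r + beta * var[-1])
    return var

rets = [0.01, -0.02, 0.005, 0.015, -0.01]
sig2 = garch_variance(rets)
```

The NN autoregression in the study plays the same role as this recursion, predicting the conditional second moment, but replaces the fixed linear-in-squares form with a learned nonlinear map.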


2020 · Vol 5 · pp. 140-147
Author(s): T.N. Aleksandrova, E.K. Ushakov, A.V. Orlova, ...

A series of neural network models used in the development of an aggregated digital twin of equipment as a cyber-physical system is presented. The twins of machining accuracy, chip formation and tool wear are examined in detail. On their basis, systems for stabilising the chip-formation process during cutting and for diagnosing cutting tool wear are developed.

Keywords: cyber-physical system; neural network model of equipment; big data; digital twin of chip formation; digital twin of tool wear; digital twin of nanostructured coating choice

