A STUDY OF EARLY STOPPING AND MODEL SELECTION APPLIED TO THE PAPERMAKING INDUSTRY

2000 · Vol 10 (01) · pp. 9-18
Author(s): Peter J. Edwards, Alan F. Murray

This paper addresses the issues of neural network model development and maintenance in the context of a complex task taken from the papermaking industry. In particular, it describes a comparison study of early stopping techniques and model selection, both of which aim to optimise neural network models for generalisation performance. The results presented here show that early stopping via a Bayesian model evidence measure is a viable way of optimising performance while also making maximum use of all the data. In addition, they show that ten-fold cross-validation performs well both as a model selector and as an estimator of prediction accuracy. These results are important in that they show how neural network models may be optimally trained and selected for highly complex industrial tasks where the data are noisy and limited in number.
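The ten-fold cross-validation model selector described above can be illustrated with a minimal, self-contained sketch (synthetic data and two toy candidate "models" stand in for the paper's neural networks; nothing here comes from the paper itself):

```python
# Minimal sketch of k-fold cross-validation as a model selector.
# The data and candidate models are illustrative stand-ins.
import random

random.seed(0)

# Synthetic noisy data with a linear trend.
X = [i / 10.0 for i in range(100)]
Y = [2.0 * x + 1.0 + random.gauss(0, 0.5) for x in X]

def fit_mean(xs, ys):
    """Candidate 1: constant (mean) predictor."""
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Candidate 2: least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return lambda x, a=my - b * mx, b=b: a + b * x

def cv_mse(fit, xs, ys, k=10):
    """Average held-out MSE over k folds."""
    idx = list(range(len(xs)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    total = 0.0
    for fold in folds:
        hold = set(fold)
        tr_x = [xs[i] for i in idx if i not in hold]
        tr_y = [ys[i] for i in idx if i not in hold]
        model = fit(tr_x, tr_y)
        total += sum((model(xs[i]) - ys[i]) ** 2 for i in fold) / len(fold)
    return total / k

scores = {name: cv_mse(f, X, Y) for name, f in [("mean", fit_mean), ("line", fit_line)]}
best = min(scores, key=scores.get)
print(best)  # the linear model should win on this strongly linear data
```

The same loop doubles as an accuracy estimator: the winning model's average held-out MSE is the cross-validated estimate of its prediction error.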

Electronics · 2021 · Vol 10 (16) · pp. 1973
Author(s): Daniel S. Soper

Selecting a final machine learning (ML) model typically occurs after a process of hyperparameter optimization in which many candidate models with varying structural properties and algorithmic settings are evaluated and compared. Evaluating each candidate model commonly relies on k-fold cross validation, wherein the data are randomly subdivided into k folds, with each fold being iteratively used as a validation set for a model that has been trained using the remaining folds. While many research studies have sought to accelerate ML model selection by applying metaheuristic and other search methods to the hyperparameter space, no consideration has been given to the k-fold cross validation process itself as a means of rapidly identifying the best-performing model. The current study rectifies this oversight by introducing a greedy k-fold cross validation method and demonstrating that greedy k-fold cross validation can vastly reduce the average time required to identify the best-performing model when given a fixed computational budget and a set of candidate models. This improved search time is shown to hold across a variety of ML algorithms and real-world datasets. For scenarios without a computational budget, this paper also introduces an early stopping algorithm based on the greedy cross validation method. The greedy early stopping method is shown to outperform a competing, state-of-the-art early stopping method both in terms of search time and the quality of the ML models selected by the algorithm. Since hyperparameter optimization is among the most time-consuming, computationally intensive, and monetarily expensive tasks in the broader process of developing ML-based solutions, the ability to rapidly identify optimal machine learning models using greedy cross validation has obvious and substantial benefits to organizations and researchers alike.
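The core idea of the greedy scheme described above can be sketched as follows. This is a simplified reading, not the paper's implementation: each candidate's true validation error is a hidden constant, a single fold evaluation returns a noisy observation of it, and the fixed budget of fold evaluations is greedily spent on whichever candidate currently looks best. All names and numbers are illustrative:

```python
# Sketch of greedy k-fold cross validation under a fixed budget of
# fold evaluations. Candidates and error values are hypothetical.
import random

random.seed(1)

K = 10  # folds per candidate
true_err = {"m1": 0.30, "m2": 0.22, "m3": 0.40, "m4": 0.25}

def evaluate_fold(name, fold):
    """Stand-in for training on K-1 folds and scoring the held-out fold."""
    return true_err[name] + random.gauss(0, 0.005)

def greedy_kfold(candidates, budget):
    """Spend `budget` fold evaluations, always extending the candidate
    whose running mean error is currently lowest."""
    scores = {c: [] for c in candidates}
    for c in candidates:              # prime: one fold per candidate
        scores[c].append(evaluate_fold(c, 0))
        budget -= 1
    while budget > 0:
        open_c = [c for c in candidates if len(scores[c]) < K]
        if not open_c:
            break
        best = min(open_c, key=lambda c: sum(scores[c]) / len(scores[c]))
        scores[best].append(evaluate_fold(best, len(scores[best])))
        budget -= 1
    winner = min(candidates, key=lambda c: sum(scores[c]) / len(scores[c]))
    return winner, scores

winner, scores = greedy_kfold(list(true_err), budget=20)
print(winner)
```

The contrast with standard k-fold CV is that the budget of 20 fold evaluations is concentrated on the promising candidates instead of being split evenly (which would give each of the four candidates only five folds).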


Author(s): Mritunjay Dwivedi, Hosahalli S. Ramaswamy

Artificial neural network (ANN) models were developed for the overall heat transfer coefficient (U) and the fluid-to-particle heat transfer coefficient (hfp) in canned Newtonian fluids with and without particles, and the model performances were compared with dimensionless correlations for both free and fixed axial modes of agitation. Part of the experimental data was used for training and testing, and the remainder for cross validation. The average errors (RMS) associated with predicted hfp and U values in fixed and free axial mode were a function of the ANN variables: number of hidden layers, number of neurons in each hidden layer, learning rule, transfer function and number of learning runs. RMS values were not significantly different for one to three hidden layers, and the RMS was lowest, with a high R2 value, for one hidden layer with 8 neurons. The combination of the Delta rule and the TanH transfer function also gave the lowest RMS and the highest R2. The highest R2 was achieved with 85% of the data used for training and testing and 15% for cross validation in both modes of rotation, so this split was used for developing the neural network models. Mean relative errors (MRE) for the ANN models were much lower than those of the dimensionless correlations: 75-78% lower for hfp and 66% lower for U in fixed and free axial mode with particulates in liquid. Without particulates, the MRE for the ANN models was 37% lower in end-over-end mode and 76% lower in free axial mode than for the dimensionless correlations. Overall, the ANN models yielded much higher R2 values than the dimensionless correlations. The ANN coefficient matrix is included so that the models can be implemented in a spreadsheet.
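The mean relative error (MRE) metric used above for the ANN-versus-correlation comparison is straightforward to compute; a small sketch with hypothetical hfp values (the numbers are invented for illustration, not taken from the study):

```python
# Mean relative error as commonly defined: mean of |pred - actual| / actual,
# expressed as a percentage. Data below are hypothetical.
def mean_relative_error(pred, actual):
    return 100.0 * sum(abs(p - a) / a for p, a in zip(pred, actual)) / len(actual)

actual    = [220.0, 240.0, 260.0]   # hypothetical measured hfp values
ann_pred  = [218.0, 243.0, 257.0]   # hypothetical ANN predictions
corr_pred = [200.0, 270.0, 230.0]   # hypothetical correlation predictions

mre_ann = mean_relative_error(ann_pred, actual)
mre_corr = mean_relative_error(corr_pred, actual)
print(mre_ann, mre_corr)
```

A lower MRE means the predictions sit closer to the measured values in proportional terms, which is why the study reports the ANN improvement as a percentage reduction in MRE.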


2008 · Vol 20 (2) · pp. 504-522
Author(s): Tatiana Miazhynskaia, Sylvia Frühwirth-Schnatter, Georg Dorffner

We use neural networks (NN) as a tool for nonlinear autoregression to predict the second moment of the conditional density of return series. The NN models are compared to the popular econometric GARCH(1,1) model. We estimate the models in a Bayesian framework using Markov chain Monte Carlo posterior simulations. The interlinked aspects of the proposed Bayesian methodology are identification of NN hidden units and treatment of NN complexity based on model evidence. The empirical study includes the application of the designed strategy to market data, where we found strong support for a nonlinear multilayer perceptron model with two hidden units.
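The GARCH(1,1) baseline against which the NN models are compared has a simple variance recursion, sigma^2_t = omega + alpha * r^2_{t-1} + beta * sigma^2_{t-1}. A minimal sketch (parameter values are illustrative, not estimated from the study's market data):

```python
# GARCH(1,1) one-step-ahead conditional variance recursion:
#   sigma^2_t = omega + alpha * r_{t-1}^2 + beta * sigma^2_{t-1}
# Parameters below are illustrative placeholders.
def garch_variance(returns, omega=0.00001, alpha=0.08, beta=0.9):
    # Initialise at the unconditional variance omega / (1 - alpha - beta).
    var = [omega / (1 - alpha - beta)]
    for r in returns[:-1]:
        var.append(omega + alpha * r * r + beta * var[-1])
    return var

rets = [0.01, -0.02, 0.005, 0.015, -0.01]
sig2 = garch_variance(rets)
```

The NN autoregression in the study plays the same role as this recursion, predicting the conditional second moment, but replaces the fixed linear-in-squares form with a learned nonlinear map.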


2020 · Vol 5 · pp. 140-147
Author(s): T.N. Aleksandrova, E.K. Ushakov, A.V. Orlova, ...

A series of neural network models used in the development of an aggregated digital twin of equipment as a cyber-physical system is presented. The twins of machining accuracy, chip formation and tool wear are examined in detail. On their basis, systems for stabilising the chip-formation process during cutting and for diagnosing cutting tool wear are developed.

Keywords: cyber-physical system; neural network model of equipment; big data; digital twin of chip formation; digital twin of tool wear; digital twin of nanostructured coating choice

