generalisation ability
Recently Published Documents


Total documents: 17 (last five years: 2)
H-index: 4 (last five years: 0)

2021 ◽  
Author(s):  
Qi Chen

<p>Symbolic regression (SR) is a function identification process, the task of which is to identify and express the relationship between the input and output variables in mathematical models. SR is named to emphasise its ability to find the structure and coefficients of the model simultaneously. Genetic Programming (GP) is an attractive and powerful technique for SR, since it does not require any predefined model and has a flexible representation. However, GP-based SR generally has a poor generalisation ability, which degrades its reliability and hampers its application to science and real-world modelling. Therefore, this thesis aims to develop new GP approaches to SR that evolve/learn models exhibiting good generalisation ability.  This thesis develops a novel feature selection method in GP for high-dimensional SR. Feature selection can potentially contribute not only to improving the efficiency of learning algorithms but also to enhancing the generalisation ability. However, feature selection is seldom considered in GP for high-dimensional SR. The proposed new feature selection method utilises GP’s built-in feature selection ability and relies on permutation to detect the truly relevant features and discard irrelevant/noisy features. The results confirm the superiority of the proposed method over the other examined feature selection methods, including random forests and decision trees, in identifying the truly relevant features. Further analysis indicates that the models evolved by GP with the proposed feature selection method are more likely to contain only the truly relevant features and have better interpretability.  To address the overfitting issue of GP when learning from a relatively small number of instances, this thesis proposes a new GP approach by incorporating structural risk minimisation (SRM), which is a framework to estimate the generalisation performance of models, into GP. 
The effectiveness of SRM depends heavily on the accuracy of the Vapnik-Chervonenkis (VC) dimension used to measure model complexity. This thesis significantly extends an experimental method (instead of theoretical estimation) to measure, for the first time, the VC-dimension of a mixture of linear and nonlinear regression models in GP. The experimental method has been conducted using uniform and non-uniform settings and provides reliable VC-dimension values. The results show that our methods have an impressively better generalisation gain and evolve more compact models, which have a much smaller behavioural difference from the target models than standard GP and GP with bootstrap. The proposed method using the optimised non-uniform setting further improves on the one using the uniform setting.  This thesis employs geometric semantic GP (GSGP) to tackle the unsatisfactory generalisation performance of GP for SR when no overfitting occurs. It proposes three new angle-awareness driven geometric semantic operators (GSOs), namely selection, crossover and mutation, to further explore the geometry of the semantic space and gain a greater generalisation improvement in GP for SR. The angle-awareness brings new geometric properties to these geometric operators, which are expected to provide greater leverage for approximating the target semantics in each operation and, more importantly, to be resistant to overfitting. The results show that, compared with two kinds of state-of-the-art GSOs, the proposed new GSOs not only drive the evolutionary process to fit the target semantics more efficiently but also significantly improve the generalisation performance. A further comparison of the evolved models shows that the new method generally produces simpler models that are much smaller and contain important building blocks of the target models.</p>
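The permutation idea mentioned in this abstract can be illustrated with a generic permutation-importance check. This is only a minimal sketch and not the thesis's GP-based method: the fixed `model` function stands in for an evolved model, and the names `mse`, `permutation_importance` and the toy data are assumptions made for illustration.

```python
import random

def mse(model, X, y):
    """Mean squared error of `model` over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Average increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    base = mse(model, X, y)
    scores = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, leaving the other features untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increase += mse(model, X_perm, y) - base
        scores.append(increase / n_repeats)
    return scores

# Hypothetical data: the target depends on feature 0 only; feature 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]

def model(row):
    """Stand-in for an evolved model (assumption, not from the thesis)."""
    return 3.0 * row[0]

importance = permutation_importance(model, X, y)
```

A feature whose shuffling leaves the error unchanged (here feature 1) is a candidate for removal, while a large error increase (feature 0) suggests the feature is relevant.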



2020 ◽  
Author(s):  
Frank Imbach ◽  
Stephane Perrey ◽  
Romain Chailan ◽  
Thibaut Meline ◽  
Robin Candau

This study aims to provide a transferable methodology for sport performance modelling, with a special focus on the generalisation of models. Data were collected from seven elite short-track speed skaters over a three-month training period. To account for training load accumulation over sessions, cumulative responses to training were modelled by impulse, serial and bi-exponential response functions. The variable dose-response (DR) model was compared to elastic net (ENET), principal component regression (PCR) and random forest (RF) models, using cross-validation within a time-series framework. ENET, PCR and RF models were fitted either individually (MI) or on the whole group of athletes (MG). The root mean square error criterion was used to assess model performance. ENET and PCR models provided a significantly greater generalisation ability than the DR model (p = 0.012, p < 0.001, p = 0.005 and p < 0.001 for ENETI, ENETG, PCRI and PCRG, respectively). Only ENETI, ENETG and RFI were significantly more accurate in prediction than DR (p = 0.020, p < 0.001 and p = 0.043, respectively). In conclusion, ENET achieved greater generalisation and predictive accuracy. Thus, building and evaluating models within a generalisation-enhancing procedure is a prerequisite for any predictive modelling.
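Cross-validation within a time-series framework, scored by root mean square error, could look like the sketch below. The abstract does not specify the exact splitting scheme, so the expanding-window splitter, the mean-value predictor and the `performance` series are all assumptions made for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))

def expanding_window_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices); training data always precede test data."""
    fold_size = (n_samples - min_train) // n_folds
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield list(range(train_end)), list(range(train_end, test_end))

# Hypothetical performance series, ordered in time (not data from the study).
performance = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.8, 11.0, 10.9, 11.2]

fold_errors = []
for train_idx, test_idx in expanding_window_splits(len(performance), n_folds=3, min_train=4):
    # Placeholder model: predict the mean of the training window.
    mean_train = sum(performance[i] for i in train_idx) / len(train_idx)
    predictions = [mean_train] * len(test_idx)
    fold_errors.append(rmse([performance[i] for i in test_idx], predictions))

mean_rmse = sum(fold_errors) / len(fold_errors)
```

Because each test window lies strictly after its training window, no future observation leaks into model fitting, which is the property that distinguishes this scheme from ordinary shuffled K-fold splitting.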



2020 ◽  
Vol 32 (2) ◽  
Author(s):  
Marelie Hattingh Davel

No framework exists that can explain and predict the generalisation ability of deep neural networks in general circumstances. In fact, this question has not been answered for some of the least complicated of neural network architectures: fully-connected feedforward networks with rectified linear activations and a limited number of hidden layers. For such an architecture, we show how adding a summary layer to the network makes it more amenable to analysis, and allows us to define the conditions that are required to guarantee that a set of samples will all be classified correctly. This process does not describe the generalisation behaviour of these networks, but produces a number of metrics that are useful for probing their learning and generalisation behaviour. We support the analytical conclusions with empirical results, both to confirm that the mathematical guarantees hold in practice, and to demonstrate the use of the analysis process.



2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Giuliano Armano

Understanding the inner behaviour of multilayer perceptrons during and after training is a goal of paramount importance for many researchers worldwide. This article experimentally shows that relevant patterns emerge upon training, which are typically related to the underlying problem difficulty. The occurrence of these patterns is highlighted by means of $\langle \varphi, \delta \rangle$ diagrams, a 2D graphical tool originally devised to support the work of researchers on classifier performance evaluation and on feature assessment. On the assumption that multilayer perceptrons are powerful engines for feature encoding, hidden layers have been inspected as if they were in fact hosting new input features. Interestingly, there are problems that appear difficult if dealt with using a single hidden layer, whereas they turn out to be easier upon the addition of further layers. The experimental findings reported in this article give further support to the standpoint according to which implementing neural architectures with multiple layers may help to boost their generalisation ability. A generic training strategy inspired by some relevant recommendations of deep learning has also been devised. A basic implementation of this strategy has been used throughout the experiments aimed at identifying relevant patterns inside multilayer perceptrons. Further experiments performed in a comparative setting have shown that it could be adopted as a viable alternative to the classical backpropagation algorithm.



2020 ◽  
Vol 34 (07) ◽  
pp. 11773-11781 ◽  
Author(s):  
Karl Moritz Hermann ◽  
Mateusz Malinowski ◽  
Piotr Mirowski ◽  
Andras Banki-Horvath ◽  
Keith Anderson ◽  
...  

Navigating and understanding the real world remains a key challenge in machine learning and inspires a great variety of research in areas such as language grounding, planning, navigation and computer vision. We propose an instruction-following task that requires all of the above, and which combines the practicality of simulated environments with the challenges of ambiguous, noisy real world data. StreetNav is built on top of Google Street View and provides visually accurate environments representing real places. Agents are given driving instructions which they must learn to interpret in order to successfully navigate in this environment. Since humans equipped with driving instructions can readily navigate in previously unseen cities, we set a high bar and test our trained agents for similar cognitive capabilities. Although deep reinforcement learning (RL) methods are frequently evaluated only on data that closely follow the training distribution, our dataset extends to multiple cities and has a clean train/test separation. This allows for thorough testing of generalisation ability. This paper presents the StreetNav environment and tasks, models that establish strong baselines, and extensive analysis of the task and the trained agents.



2019 ◽  
Vol 20 (9) ◽  
pp. 2120
Author(s):  
Angelo A. D’Archivio ◽  
Andrea Giannitto

Retention in gas–liquid chromatography is mainly governed by the extent of intermolecular interactions between the solute and the stationary phase. While molecular descriptors of computational origin are commonly used to encode the effect of the solute structure in quantitative structure–retention relationship (QSRR) approaches, characterisation of stationary phases is historically based on empirical scales, the McReynolds system of phase constants being one of the most popular. In this work, poly(siloxane) stationary phases, which occupy a dominant position in modern gas–liquid chromatography, were characterised by theoretical molecular descriptors. With this aim, the first five McReynolds constants of 29 columns were modelled by multilinear regression (MLR) coupled with genetic algorithm (GA) variable selection applied to the molecular descriptors provided by the Dragon software. The generalisation ability of the established GA-MLR models, evaluated by both external prediction and repeated calibration/evaluation splitting, was better than that reported in analogous studies regarding nonpolymeric (molecular) stationary phases. Principal component analysis on the significant molecular descriptors allowed the poly(siloxanes) to be classified according to their chemical composition and partitioning properties. Development of QSRR-based models combining molecular descriptors of both solutes and stationary phases, which will be applied to transfer retention data among different columns, is in progress.



2017 ◽  
Author(s):  
Ilia Korvigo ◽  
Andrey Afanasyev ◽  
Nikolay Romashchenko ◽  
Mihail Skoblov

Many automatic classifiers have been introduced to aid inference of the phenotypical effects of uncategorised nsSNVs (nonsynonymous Single Nucleotide Variations) in theoretical and medical applications. Lately, several meta-estimators have been proposed that combine different predictors, such as PolyPhen and SIFT, to integrate more information in a single score. Although many advances have been made in feature design and in the machine learning algorithms used, the shortage of high-quality reference data, along with the bias towards intensively studied in vitro models, calls for improved generalisation ability in order to further increase classification accuracy and handle records with insufficient data. Since a meta-estimator basically combines different scoring systems with highly complicated nonlinear relationships, we investigated how deep learning (supervised and unsupervised), which is particularly efficient at discovering hierarchies of features, can improve classification performance. While it is believed that one should only use deep learning for high-dimensional input spaces and other models (logistic regression, support vector machines, Bayesian classifiers, etc.) for simpler inputs, we still believe that the ability of neural networks to discover intricate structure in highly heterogeneous datasets can aid a meta-estimator. We compare the performance with various popular predictors, many of which are recommended by the American College of Medical Genetics and Genomics (ACMG), as well as available deep learning-based predictors. Thanks to hardware acceleration, we were able to use a computationally expensive genetic algorithm to stochastically optimise hyper-parameters over many generations. Overfitting was mitigated by noise injection and dropout, limiting coadaptation of hidden units. Although we stress that this work was not conceived as a tool comparison, but rather as an exploration of the possibilities of applying deep learning to ensemble scores, our results show that even relatively simple modern neural networks can significantly improve both prediction accuracy and coverage. We provide open access to our best model at http://score.generesearch.ru.



Author(s):  
Jarosław Gocławski ◽  
Joanna Sekulska-Nalewajko ◽  
Elżbieta Kuźniak

The increased production of Reactive Oxygen Species (ROS) in plant leaf tissues is a hallmark of a plant’s reaction to various environmental stresses. This paper describes an automatic segmentation method for scanned images of cucurbit leaves stained to visualise ROS accumulation sites, which are characterised by specific colour hues and intensities. The leaves, placed separately in the scanner view field on a colour background, are extracted by thresholding in the RGB colour space, then cleaned of petioles to obtain a leaf blade mask. The second stage of the method consists in the classification of within-mask pixels in a hue-saturation plane using two classes, determined by leaf regions with and without colour products of the ROS reaction. At this stage a two-layer, hybrid artificial neural network is applied, with the first layer as a self-organising Kohonen type network and a linear perceptron output layer (counter propagation network type). The WTA-based, fast competitive learning of the first layer was improved to increase clustering reliability. Widrow–Hoff supervised training used at the output layer utilises manually labelled patterns prepared from training images. The generalisation ability of the network model has been verified by K-fold cross-validation. The method significantly accelerates the measurement of leaf regions containing the ROS reaction colour products and improves measurement accuracy.
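The Widrow–Hoff (least mean squares) rule named in this abstract updates each weight in proportion to the output error and the corresponding input. The sketch below shows the rule for a single linear output unit without a bias term. It is not the paper's implementation: the one-hot inputs merely imitate the activation of a winning Kohonen unit, and the function name, learning rate and labels are assumptions.

```python
def widrow_hoff_train(inputs, targets, lr=0.1, epochs=200):
    """Train a single linear unit with the Widrow-Hoff (LMS) rule."""
    weights = [0.0] * len(inputs[0])
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            output = sum(w * xi for w, xi in zip(weights, x))
            error = t - output
            # Delta rule: move each weight along its input, scaled by the error.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights

# Hypothetical one-hot activations of three first-layer clusters.
inputs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
# Hypothetical labels: 1.0 for the stained class, 0.0 otherwise.
targets = [1.0, 0.0, 0.0, 1.0]

weights = widrow_hoff_train(inputs, targets)
predictions = [sum(w * xi for w, xi in zip(weights, x)) for x in inputs]
```

With one-hot inputs each weight simply converges towards the label of its cluster, which is why a linear output layer suffices once the first layer has done the clustering.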



Author(s):  
Nguyen Quang Uy ◽  
Nguyen Thi Hien ◽  
Nguyen Xuan Hoai ◽  
Michael O’Neill

