Fully corrective gradient boosting with squared hinge: Fast learning rates and early stopping

2021 ◽  
Author(s):  
Jinshan Zeng ◽  
Min Zhang ◽  
Shao-Bo Lin
2021 ◽  
Author(s):  
Ryan Santoso ◽  
Xupeng He ◽  
Marwa Alsinan ◽  
Hyung Kwak ◽  
Hussein Hoteit

Abstract Automatic fracture recognition from borehole images or outcrops is applicable to the construction of fractured reservoir models. Deep learning for fracture recognition is subject to uncertainty due to sparse and imbalanced training sets and random initialization. We present a new workflow to optimize a deep learning model under uncertainty using U-Net. We consider both the epistemic and aleatoric uncertainty of the model. We propose a U-Net architecture with a dropout layer inserted after every "weighting" layer. We vary the dropout probability to investigate its impact on the uncertainty response. We build the training set and assign a uniform distribution to each training parameter, such as the number of epochs, batch size, and learning rate. We then perform uncertainty quantification by running the model multiple times for each realization, where we capture the aleatoric response. In this approach, which is based on Monte Carlo Dropout, the variance map and F1-scores are used to decide whether to craft additional augmentations or stop the process. This work demonstrates the existence of uncertainty within deep learning models caused by sparse and imbalanced training sets, which leads to unstable predictions. The overall responses are accommodated in the form of aleatoric uncertainty. Our workflow uses the uncertainty response (variance map) as a measure to craft additional augmentations of the training set. High variance in certain features indicates the need to add new augmented images containing those features, either through affine transformations (rotation, translation, and scaling) or by using similar images. The augmentation improves the accuracy of the prediction, reduces the prediction variance, and stabilizes the output. The architecture, number of epochs, batch size, and learning rate are optimized under a fixed but uncertain training set. We perform the optimization by searching for the global maximum of accuracy over multiple realizations. Besides the quality of the training set, the learning rate is the dominant factor in the optimization process: the selected learning rate controls the diffusion of information through the model, and under imbalanced conditions a fast learning rate causes the model to miss the main features. The other challenge in fracture recognition on a real outcrop is to optimally pick the parental images used to generate the initial training set. We suggest picking images from multiple sides of the outcrop that show significant variation in the features; this is needed to avoid long iterations within the workflow. We introduce a new approach to address the uncertainties associated with the training process and with the physical problem. The proposed approach is general in concept and can be applied to various deep-learning problems in geoscience.
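A minimal sketch of the Monte Carlo Dropout idea described in this abstract, using a toy convolutional segmenter rather than the authors' full U-Net; the architecture, dropout probability, and number of stochastic passes are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Toy segmenter standing in for the U-Net: a dropout layer follows each
# "weighting" (convolutional) layer, as described in the abstract.
class TinySegmenter(nn.Module):
    def __init__(self, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(p_drop),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(), nn.Dropout2d(p_drop),
            nn.Conv2d(16, 1, 1),  # per-pixel fracture logit
        )

    def forward(self, x):
        return torch.sigmoid(self.net(x))

def mc_dropout_predict(model, image, n_passes=30):
    """Repeated stochastic forward passes with dropout left active;
    returns the mean prediction and the per-pixel variance map."""
    model.train()  # keep dropout active at inference time (Monte Carlo Dropout)
    with torch.no_grad():
        preds = torch.stack([model(image) for _ in range(n_passes)])
    return preds.mean(dim=0), preds.var(dim=0)

model = TinySegmenter(p_drop=0.5)        # dropout probability is a tunable assumption
image = torch.randn(1, 1, 64, 64)        # placeholder borehole/outcrop patch
mean_map, variance_map = mc_dropout_predict(model, image)
# High values in variance_map flag features that may need extra augmented examples.
```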


2013 ◽  
Vol 108 ◽  
pp. 13-22 ◽  
Author(s):  
Shao-Gao Lv ◽  
Tie-Feng Ma ◽  
Liu Liu ◽  
Yun-Long Feng

2007 ◽  
Vol 35 (2) ◽  
pp. 608-633 ◽  
Author(s):  
Jean-Yves Audibert ◽  
Alexandre B. Tsybakov

2012 ◽  
Vol 42 (12) ◽  
pp. 1251-1262 ◽  
Author(s):  
HongZhi TONG ◽  
FengHong YANG ◽  
DiRong CHEN

2000 ◽  
Vol 12 (3) ◽  
pp. 519-529 ◽  
Author(s):  
Manuel A. Sánchez-Montañés ◽  
Paul F. M. J. Verschure ◽  
Peter König

Mechanisms influencing learning in neural networks are usually investigated on either a local or a global scale. The former relates to synaptic processes, the latter to unspecific modulatory systems. Here we study the interaction of a local learning rule that evaluates coincidences of pre- and postsynaptic action potentials with a global modulatory mechanism, such as the action of the basal forebrain on cortical neurons. The simulations demonstrate that the interaction of these mechanisms leads to a learning rule supporting fast learning rates, stability, and flexibility. Furthermore, the simulations generate two experimentally testable predictions concerning the dependence of backpropagating action potentials on basal forebrain activity and the relative timing of the activity of inhibitory and excitatory neurons in the neocortex.
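A minimal sketch of the interaction described above: a local pre-/postsynaptic coincidence (Hebbian) term gated by a global modulatory signal. The time constants, rates, and variable names are illustrative assumptions, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def update_weight(w, pre_spike, post_spike, modulation, eta=0.05, decay=0.001):
    """Local coincidence rule gated by a global modulatory signal.

    pre_spike, post_spike : 0/1 spike indicators in the current time bin
    modulation            : global scalar drive (basal-forebrain-like, in [0, 1])
    """
    coincidence = pre_spike * post_spike           # local Hebbian evidence
    dw = modulation * eta * coincidence - decay * w
    return np.clip(w + dw, 0.0, 1.0)

w = 0.2
for t in range(1000):
    pre = rng.random() < 0.05                      # presynaptic spike
    post = rng.random() < 0.05                     # postsynaptic spike
    mod = 1.0 if t < 500 else 0.1                  # strong, then weak, global modulation
    w = update_weight(w, pre, post, mod)

print(f"final weight: {w:.3f}")
```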


Author(s):  
YONG-LI XU ◽  
DI-RONG CHEN

Regularized learning algorithms are an important topic of study, and functional data analysis extends classical methods to function-valued inputs. We establish learning rates of the least squares regularized regression algorithm in a reproducing kernel Hilbert space for functional data. Using the iteration method, we obtain fast learning rates for functional data. Our result is a natural extension of results for the least squares regularized regression algorithm when the input data are finite dimensional.
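A minimal sketch of least squares regularized (kernel ridge) regression with functional inputs represented on a fixed grid; the Gaussian kernel, grid, and regularization parameter are illustrative assumptions rather than the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(0)

# Functional inputs: each sample is a curve observed on a common grid.
grid = np.linspace(0.0, 1.0, 50)
n = 100
X = np.array([np.sin(2 * np.pi * (grid + rng.uniform())) for _ in range(n)])
y = X.mean(axis=1) + 0.1 * rng.standard_normal(n)    # toy functional-to-scalar target

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian kernel on curves, using the approximate L2 distance on the grid."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).mean(axis=2)
    return np.exp(-d2 / (2 * sigma**2))

lam = 1e-3                                            # regularization parameter
K = gaussian_kernel(X, X)
alpha = np.linalg.solve(K + lam * n * np.eye(n), y)   # regularized least squares solution

X_new = np.array([np.sin(2 * np.pi * grid)])          # a new input curve
y_hat = gaussian_kernel(X_new, X) @ alpha
print(y_hat)
```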


1992 ◽  
Vol 03 (04) ◽  
pp. 323-350 ◽  
Author(s):  
JOYDEEP GHOSH ◽  
YOAN SHIN

This paper introduces a class of higher-order networks called pi-sigma networks (PSNs). PSNs are feedforward networks with a single “hidden” layer of linear summing units and with product units in the output layer. A PSN uses these product units to indirectly incorporate the capabilities of higher-order networks while greatly reducing network complexity. PSNs have only one layer of adjustable weights and exhibit fast learning. A PSN with K summing units provides a constrained Kth order approximation of a continuous function. A generalization of the PSN is presented that can uniformly approximate any continuous function defined on a compact set. The use of linear hidden units makes it possible to mathematically study the convergence properties of various LMS-type learning algorithms for PSNs. We show that it is desirable to update only a partial set of weights at a time rather than synchronously updating all the weights. Bounds for learning rates which guarantee convergence are derived. Several simulation results on pattern classification and function approximation problems highlight the capabilities of the PSN. Extensive comparisons are made with other higher-order networks and with multilayered perceptrons. The neurobiological plausibility of PSN-type networks is also discussed.
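A minimal sketch of a pi-sigma forward pass as described above: K linear summing units whose outputs are multiplied at the output unit, with only the summing-unit weights adjustable. The one-unit-at-a-time update reflects the paper's recommendation to avoid synchronously updating all weights, while the sizes, learning rate, and squashing function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class PiSigmaNetwork:
    """Pi-sigma network: K linear summing units feeding a product output unit."""

    def __init__(self, n_inputs, k_summing_units):
        self.W = rng.standard_normal((k_summing_units, n_inputs + 1)) * 0.1  # +1 for bias

    def forward(self, x):
        x1 = np.append(x, 1.0)              # input with bias term
        h = self.W @ x1                      # outputs of the linear summing units
        return np.tanh(np.prod(h)), h, x1    # product unit followed by a squashing function

    def update(self, x, target, unit, lr=0.05):
        """LMS-style step that adjusts only one summing unit per presentation."""
        y, h, x1 = self.forward(x)
        err = target - y
        others = np.prod(np.delete(h, unit))          # product of the remaining units
        grad = err * (1.0 - y**2) * others * x1       # gradient of squared error w.r.t. W[unit]
        self.W[unit] += lr * grad
        return err

net = PiSigmaNetwork(n_inputs=2, k_summing_units=3)
for step in range(3000):
    x = rng.uniform(-1, 1, size=2)
    target = np.tanh(x[0] * x[1])                     # toy target with multiplicative structure
    net.update(x, target, unit=step % 3)              # cycle through the summing units
```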


2014 ◽  
Vol 111 (7) ◽  
pp. 1444-1454 ◽  
Author(s):  
Firas Mawase ◽  
Lior Shmuelof ◽  
Simona Bar-Haim ◽  
Amir Karniel

Faster relearning of an external perturbation, known as savings, offers a behavioral linkage between motor learning and memory. To explain savings effects in reaching adaptation experiments, recent models suggested the existence of multiple learning components, each showing different learning and forgetting properties that may change following initial learning. Nevertheless, the existence of these components in rhythmic movements with other effectors, such as during locomotor adaptation, has not yet been studied. Here, we study savings in locomotor adaptation in two experiments: in the first, subjects adapted to speed perturbations during walking on a split-belt treadmill, briefly adapted to a counter-perturbation, and then readapted. In the second experiment, subjects readapted after a prolonged period of washout of the initial adaptation. In both experiments we find clear evidence for increased learning rates (savings) during readaptation. We show that the basic error-based multiple-timescales linear state space model is not sufficient to explain savings during locomotor adaptation. Instead, we show that locomotor adaptation leads to changes in learning parameters, so that learning rates are faster during readaptation. Interestingly, we find an intersubject correlation between the slow learning component in initial adaptation and the fast learning component in the readaptation phase, suggesting an underlying mechanism for savings. Together, these findings suggest that savings in locomotion and in reaching may share common computational and neuronal mechanisms; both are driven by the slow learning component and are likely to depend on cortical plasticity.
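A minimal sketch of the error-based multiple-timescales linear state space model referenced above, with one fast and one slow component; the retention and learning-rate values are illustrative assumptions, and the savings discussed in the abstract would correspond to these parameters changing after initial adaptation.

```python
import numpy as np

def simulate_two_state(perturbation, A_f=0.6, B_f=0.2, A_s=0.99, B_s=0.02):
    """Two-timescale linear state space model of adaptation.

    x_fast learns quickly but forgets quickly; x_slow learns slowly but retains.
    Both states are driven by the same trial-by-trial error, and the net
    adaptation is their sum.
    """
    x_fast, x_slow = 0.0, 0.0
    adaptation = []
    for p in perturbation:
        error = p - (x_fast + x_slow)           # performance error on this trial
        x_fast = A_f * x_fast + B_f * error     # fast process: low retention, high learning rate
        x_slow = A_s * x_slow + B_s * error     # slow process: high retention, low learning rate
        adaptation.append(x_fast + x_slow)
    return np.array(adaptation)

# Adaptation, brief counter-perturbation, then readaptation (schematic of experiment 1).
perturbation = np.concatenate([np.ones(150), -np.ones(20), np.ones(150)])
out = simulate_two_state(perturbation)
print(out[:5], out[-5:])
```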

