Proximal Gradient Methods for Machine Learning and Imaging

Author(s):  
Saverio Salzo ◽  
Silvia Villa
2020 ◽  
Author(s):  
Qing Tao

The extrapolation strategy introduced by Nesterov, which can accelerate the convergence rate of gradient descent methods by orders of magnitude for smooth convex objectives, has led to tremendous success in training machine learning models. In this paper, we theoretically study its strength for the convergence of the individual iterates of general non-smooth convex optimization problems, which we call individual convergence. We prove that Nesterov's extrapolation makes the individual convergence of projected gradient methods optimal for general convex problems, which has been a challenging problem in the machine learning community. In light of this, a simple modification of the gradient operation suffices to achieve optimal individual convergence for strongly convex problems, which can be regarded as a step towards the open question about SGD posed by Shamir (2012). Furthermore, the derived algorithms are extended to solve regularized non-smooth learning problems in stochastic settings. They can serve as an alternative to the most basic SGD, especially for machine learning problems where an individual output is needed to preserve the regularization structure while keeping an optimal rate of convergence. In particular, our method is an efficient tool for solving large-scale l1-regularized hinge-loss learning problems. Experiments on real data sets demonstrate that the derived algorithms not only achieve optimal individual convergence rates but also yield better sparsity than the averaged solution.
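As a concrete illustration of the kind of method discussed above, the sketch below combines a stochastic subgradient step for the hinge loss with a Nesterov-style extrapolation and a soft-thresholding (proximal) step for the l1 regularizer. It is a minimal sketch, not the paper's exact algorithm; the extrapolation weight, step-size schedule, and data are illustrative assumptions.

import numpy as np

def soft_threshold(v, tau):
    # proximal operator of tau * ||.||_1 (keeps the iterate sparse)
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def hinge_subgradient(w, x, y):
    # subgradient of max(0, 1 - y * <w, x>) with respect to w
    return -y * x if y * np.dot(w, x) < 1.0 else np.zeros_like(w)

def extrapolated_prox_sgd(X, y, lam=0.01, T=1000):
    n, d = X.shape
    w = np.zeros(d)                       # current iterate (the "individual" output)
    w_prev = np.zeros(d)
    for t in range(1, T + 1):
        beta = (t - 1.0) / (t + 2.0)      # Nesterov-style extrapolation weight
        v = w + beta * (w - w_prev)       # extrapolated point
        i = np.random.randint(n)          # stochastic sample
        g = hinge_subgradient(v, X[i], y[i])
        eta = 1.0 / np.sqrt(t)            # illustrative step-size schedule
        w_prev, w = w, soft_threshold(v - eta * g, eta * lam)
    return w                              # last (individual) iterate

# usage with random data (labels in {-1, +1}):
# w = extrapolated_prox_sgd(np.random.randn(200, 50), np.sign(np.random.randn(200)))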


2021 ◽  
pp. 1-26
Author(s):  
Richard C. Gerum ◽  
Achim Schilling

Up to now, modern machine learning (ML) has been based on approximating big data sets with high-dimensional functions, taking advantage of huge computational resources. We show that biologically inspired neuron models such as the leaky integrate-and-fire (LIF) neuron provide novel and efficient ways of information processing. They can be integrated into machine learning models and are a potential target for improving ML performance. To this end, we derive simple update rules for LIF units that numerically integrate their differential equations, and we apply a surrogate gradient approach to train the LIF units via backpropagation. We demonstrate that tuning the leak term of the LIF neurons can be used to run the neurons in different operating modes, such as simple signal integrators or coincidence detectors. Furthermore, we show that a constant surrogate gradient, combined with tuning the leak term of the LIF units, can reproduce the learning dynamics of more complex surrogate gradients. To validate our method, we applied it to established image data sets (the Oxford 102 flower data set, MNIST), implemented various network architectures, used several input data encodings, and demonstrated that the method achieves state-of-the-art classification performance. We provide our method, as well as further surrogate gradient methods for training spiking neural networks via backpropagation, as an open-source Keras package to make it available to the neuroscience and machine learning community. To increase the interpretability of the underlying effects, and thus take a small step toward opening the black box of machine learning, we provide interactive illustrations with the possibility of systematically monitoring the effects of parameter changes on the learning characteristics.
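The following sketch illustrates the general idea of a discrete-time LIF update trained with a constant surrogate gradient; the constants, reset rule, and function names are assumptions and may differ from the authors' Keras package.

import tensorflow as tf

@tf.custom_gradient
def spike(v_minus_threshold):
    # forward pass: Heaviside step (spike if the membrane potential exceeds threshold)
    out = tf.cast(v_minus_threshold > 0.0, tf.float32)
    def grad(dy):
        # backward pass: constant surrogate gradient instead of the true zero/undefined one
        return dy * 1.0
    return out, grad

def lif_step(v, x, leak=0.9, threshold=1.0):
    # one discrete-time update of a leaky integrate-and-fire unit:
    # the membrane potential decays by the leak factor, integrates the input,
    # emits a spike when it crosses the threshold, and is reset afterwards
    v = leak * v + x
    s = spike(v - threshold)
    v = v * (1.0 - s)
    return v, s

# usage: unroll lif_step over the time dimension of an input sequence,
# e.g. inside a custom Keras layer, and train with standard backpropagation.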


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kapil Manoharan ◽  
Mohd. Tahir Anwar ◽  
Shantanu Bhattacharya

Low-energy surface coatings have found a wide range of applications in generating hydrophobic and superhydrophobic surfaces. Most studies, however, have considered a single coating material on a single substrate or a single application technique. The degree of hydrophobicity depends strongly on the fabrication process as well as on the coated material, which warrants a systematic study, based on experimental optimization, of the parametric behavior of coatings and their application techniques. A single platform that can predict the set of parameters required to generate a hydrophobic surface of the desired character on a given substrate is also needed. This work applies machine learning algorithms (Levenberg–Marquardt, combining Gauss–Newton and gradient methods) to evaluate the various processes affecting the anti-wetting behavior of coated printable paper substrates, with the capability of predicting the most suitable coating method and materials for a desired surface contact angle. The application techniques considered in this study are dip coating, spray coating, spin coating, and inkjet printing, with silane- and sol–gel-based coating materials.
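For context, the sketch below shows the basic Levenberg–Marquardt update, which interpolates between a Gauss–Newton step (small damping) and a gradient-descent step (large damping). The contact-angle model and data here are purely hypothetical placeholders, not the study's actual model.

import numpy as np

def levenberg_marquardt(residual, jacobian, theta, n_iter=50, lam=1e-2):
    # damped least squares: small lam behaves like Gauss-Newton, large lam like gradient descent
    for _ in range(n_iter):
        r = residual(theta)
        J = jacobian(theta)
        A = J.T @ J + lam * np.eye(theta.size)
        step = np.linalg.solve(A, J.T @ r)
        theta_new = theta - step
        if np.sum(residual(theta_new) ** 2) < np.sum(r ** 2):
            theta, lam = theta_new, lam * 0.7   # accept: shift toward Gauss-Newton
        else:
            lam *= 2.0                          # reject: shift toward gradient descent
    return theta

# hypothetical linear contact-angle model: angle ~ X @ theta,
# where the columns of X would encode coating/process parameters
X = np.random.rand(100, 4)
y = X @ np.array([30.0, 20.0, 10.0, 5.0]) + 0.5 * np.random.randn(100)
theta_fit = levenberg_marquardt(lambda t: X @ t - y, lambda t: X, np.zeros(4))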


Author(s):  
Ki-Young Kwon ◽  
◽  
Keun-Woo Jung ◽  
Dong-Su Yang ◽  
Jooyoung Park ◽  
...  

Recently, reinforcement learning and evolution strategies have become major tools in the field of machine learning and have shown excellent performance on various engineering problems. In particular, the Natural Actor-Critic (NAC) approach and Natural Evolution Strategies (NES) have generated considerable interest in natural-gradient-based machine learning methods, with many successful applications. In this paper, we apply the NAC and the NES to path-tracking control problems for autonomous vehicles. Simulation results show that these methods can yield better performance than conventional PID controllers.
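As an illustration of the search-gradient idea behind NES, the sketch below implements a simplified evolution-strategies update on the mean of a Gaussian search distribution with fixed variance (in this special case the vanilla search gradient coincides, up to scaling, with the natural gradient). The quadratic cost stands in for the actual path-tracking error, and all parameters are assumptions.

import numpy as np

def nes_sketch(cost, mu, sigma=0.1, pop=50, lr=0.05, n_iter=200):
    d = mu.size
    for _ in range(n_iter):
        eps = np.random.randn(pop, d)                             # sampled search directions
        costs = np.array([cost(mu + sigma * e) for e in eps])
        shaped = (costs - costs.mean()) / (costs.std() + 1e-8)    # fitness shaping
        grad = eps.T @ shaped / (pop * sigma)                     # search-gradient estimate
        mu = mu - lr * grad                                       # descend on the expected cost
    return mu

# hypothetical quadratic "tracking error" with optimal controller gains [1.0, 0.5, 0.1]
cost = lambda k: float(np.sum((k - np.array([1.0, 0.5, 0.1])) ** 2))
gains = nes_sketch(cost, mu=np.zeros(3))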


Author(s):  
Zhijian Luo ◽  
Yuntao Qian

Stochastic optimization for large-scale machine learning problems has developed rapidly since stochastic gradient methods with variance reduction techniques were introduced. Several stochastic second-order methods, which approximate curvature information through the Hessian in the stochastic setting, have been proposed as further improvements. In this paper, we introduce a Stochastic Sub-Sampled Newton method with Variance Reduction (S2NMVR), which combines the sub-sampled Newton method with the stochastic variance-reduced gradient. For many machine learning problems, the linear-time Hessian-vector product underpins the computational efficiency of S2NMVR. We then develop two variations of S2NMVR that maintain an estimate of the Hessian inverse and reduce the computational cost of the Hessian-vector product for nonlinear problems.
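The sketch below conveys the flavor of such a method: an SVRG-style variance-reduced gradient combined with a Newton-type step whose Hessian is estimated from a small sub-sample, here for l2-regularized logistic regression. It is only a sketch under assumptions; the paper's actual S2NMVR algorithm (sampling scheme, Hessian-vector products, step sizes) differs in detail.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def s2nmvr_sketch(X, y, lam=1e-2, eta=0.5, epochs=10, inner=50, hess_batch=20):
    # X: (n, d) features, y: (n,) labels in {0, 1}
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        w_ref = w.copy()                                            # snapshot iterate
        full_grad = X.T @ (sigmoid(X @ w_ref) - y) / n + lam * w_ref
        for _ in range(inner):
            i = np.random.randint(n)
            # SVRG-style variance-reduced stochastic gradient
            g = (sigmoid(X[i] @ w) - y[i]) * X[i] + lam * w \
                - ((sigmoid(X[i] @ w_ref) - y[i]) * X[i] + lam * w_ref) \
                + full_grad
            # sub-sampled Hessian of the regularized logistic loss
            S = np.random.choice(n, hess_batch, replace=False)
            p = sigmoid(X[S] @ w)
            H = (X[S].T * (p * (1.0 - p))) @ X[S] / hess_batch + lam * np.eye(d)
            w = w - eta * np.linalg.solve(H, g)                     # Newton-type step
    return w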


2020 ◽  
Vol 43 ◽  
Author(s):  
Myrthe Faber

Gilead et al. state that abstraction supports mental travel, and that mental travel critically relies on abstraction. I propose an important addition to this theoretical framework, namely that mental travel might also support abstraction. Specifically, I argue that spontaneous mental travel (mind wandering), much like data augmentation in machine learning, provides variability in mental content and context necessary for abstraction.

