An Adaptive Stochastic Gradient-Free Approach for High-Dimensional Blackbox Optimization

2021, pp. 333-348
Author(s): Anton Dereventsov, Clayton G. Webster, Joseph Daws
Author(s): Sibylle Hess, Gianvito Pio, Michiel Hochstenbach, Michelangelo Ceci

Abstract: Matrix tri-factorization subject to binary constraints is a versatile and powerful framework for the simultaneous clustering of observations and features, also known as biclustering. Applications of biclustering encompass the clustering of high-dimensional data and exploratory data mining, where the selection of the most important features is relevant. Unfortunately, due to the lack of suitable methods for optimization subject to binary constraints, the powerful framework of biclustering is typically restricted to clusterings that partition the set of observations or features. As a result, overlap between clusters cannot be modelled, and every item, even an outlier in the data, has to be assigned to exactly one cluster. In this paper we propose Broccoli, an optimization scheme for matrix factorization subject to binary constraints, based on the theoretically well-founded method of proximal stochastic gradient descent. Thereby, we impose no restrictions on the obtained clusters. Our experimental evaluation, performed on both synthetic and real-world data and against six competitor algorithms, shows reliable and competitive performance, even in the presence of a high amount of noise in the data. Moreover, a qualitative analysis of the identified clusters shows that Broccoli may provide meaningful and interpretable clustering structures.
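The heart of such a method is a proximal stochastic gradient step: an ordinary gradient update on a relaxed, real-valued factorization D ~ U S V^T, followed by a proximal operator that pushes the indicator factors U and V toward binary values. The NumPy sketch below illustrates this idea only; the penalty behind prox_binary, the minibatching scheme, and all step sizes are assumptions for the example, not the authors' implementation.

    import numpy as np

    def prox_binary(X, lam):
        # Sketch of a proximal operator for a binarity penalty: move each
        # entry a fraction lam toward its nearest value in {0, 1}, then
        # keep it inside [0, 1].
        target = (X > 0.5).astype(float)
        return np.clip(X + lam * (target - X), 0.0, 1.0)

    def broccoli_sketch(D, r, c, steps=5000, lr=1e-2, lam=1e-2, batch=32, seed=0):
        # Proximal SGD for D ~ U S V^T with relaxed binary factors U, V;
        # loss on a minibatch is 0.5 * ||U S V^T - D||^2 (hypothetical
        # hyperparameters throughout).
        rng = np.random.default_rng(seed)
        n, m = D.shape
        U = rng.uniform(0, 1, (n, r))    # relaxed row-cluster indicators
        S = rng.standard_normal((r, c))  # real-valued core matrix
        V = rng.uniform(0, 1, (m, c))    # relaxed column-cluster indicators
        for _ in range(steps):
            rows = rng.choice(n, size=batch, replace=False)  # minibatch
            R = U[rows] @ S @ V.T - D[rows]                  # batch residual
            gU = R @ V @ S.T                                 # gradients
            gS = U[rows].T @ R @ V
            gV = R.T @ U[rows] @ S
            U[rows] -= lr * gU
            S -= lr * gS
            V -= lr * gV
            U[rows] = prox_binary(U[rows], lam)              # proximal steps
            V = prox_binary(V, lam)
        return U > 0.5, S, V > 0.5  # threshold to hard binary clusters

Because the rows of U > 0.5 and V > 0.5 need not sum to one, items may belong to several clusters, or to none, which is exactly the overlap and outlier behaviour that partition-based schemes cannot express.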


2019, Vol. 23, pp. 841-873
Author(s): Antoine Godichon-Baggioni

A common problem in statistics consists in estimating the minimizer of a convex function. When we have to deal with large samples taking values in high-dimensional spaces, stochastic gradient algorithms and their averaged versions are efficient candidates: (1) they do not require much computational effort, (2) they do not need to store all the data, which is crucial when dealing with big data, and (3) they allow the estimates to be updated simply, which is important when data arrive sequentially. The aim of this work is to give asymptotic and non-asymptotic rates of convergence of stochastic gradient estimates, as well as of their averaged versions, when the function to minimize is only locally strongly convex.
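As a concrete illustration, the sketch below estimates the geometric median, the minimizer of h -> E||X - h||, a standard example of a function that is only locally strongly convex. It processes the data one point at a time, keeps a Polyak-Ruppert running average of the iterates, and stores nothing else. The step-size schedule c / t^alpha with alpha in (1/2, 1) is an assumed, typical choice, not a prescription taken from the paper.

    import numpy as np

    def averaged_sgd_median(stream, dim, c=1.0, alpha=0.66):
        # Stochastic gradient descent for h -> E||X - h|| with
        # Polyak-Ruppert averaging; the gradient of ||x - h|| in h
        # is -(x - h) / ||x - h||.
        h = np.zeros(dim)      # current SGD iterate
        h_bar = np.zeros(dim)  # running average of all iterates
        for t, x in enumerate(stream, start=1):
            diff = x - h
            norm = np.linalg.norm(diff)
            if norm > 0:
                h = h + (c / t**alpha) * diff / norm  # descent step
            h_bar = h_bar + (h - h_bar) / t           # online averaging
        return h_bar

    # Usage: a heavy-tailed sample processed sequentially, one row at a time.
    rng = np.random.default_rng(0)
    data = rng.standard_t(df=2, size=(100_000, 10))
    print(averaged_sgd_median(iter(data), dim=10))

The averaging step costs one extra vector per iteration, yet it is what improves the rate of the raw iterates in the locally strongly convex setting the paper analyzes.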


2021, Vol. 2021 (12), pp. 124008
Author(s): Francesca Mignacco, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Abstract: We analyze in closed form the learning dynamics of stochastic gradient descent (SGD) for a single-layer neural network classifying a high-dimensional Gaussian mixture where each cluster is assigned one of two labels. This problem provides a prototype of a non-convex loss landscape with interpolating regimes and a large generalization gap. We define a particular stochastic process for which SGD can be extended to a continuous-time limit that we call stochastic gradient flow. In the full-batch limit, we recover the standard gradient flow. We apply dynamical mean-field theory from statistical physics to track the dynamics of the algorithm in the high-dimensional limit via a self-consistent stochastic process. We explore the performance of the algorithm as a function of the control parameters, shedding light on how it navigates the loss landscape.
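A minimal simulation of this setting, assuming a linear single-layer classifier trained with the logistic loss on a symmetric two-cluster mixture with labels +/-1 (the cluster geometry, loss, and hyperparameters below are illustrative choices, not the authors' code):

    import numpy as np

    def sgd_gaussian_mixture(d=1000, n=4000, batch=128, lr=0.05, epochs=50, seed=0):
        # SGD for a linear classifier w on a two-cluster Gaussian mixture:
        # x = y * mu + standard Gaussian noise, with label y in {-1, +1}.
        rng = np.random.default_rng(seed)
        mu = rng.standard_normal(d) / np.sqrt(d)            # cluster centre
        y = rng.choice([-1.0, 1.0], size=n)                 # cluster labels
        X = y[:, None] * mu + rng.standard_normal((n, d))   # mixture samples
        w = np.zeros(d)
        for _ in range(epochs):
            for start in range(0, n, batch):
                idx = slice(start, start + batch)
                margins = y[idx] * (X[idx] @ w)
                # logistic loss gradient: -y * x / (1 + exp(y * w.x))
                g = -(X[idx] * (y[idx] / (1 + np.exp(margins)))[:, None]).mean(0)
                w -= lr * g
        return w

Taking batch = n and shrinking lr toward zero recovers the full-batch gradient-flow limit the abstract mentions; varying batch and lr probes the stochastic-gradient-flow regime that the dynamical mean-field analysis tracks.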

