Optimal errors and phase transitions in high-dimensional generalized linear models

2019 · Vol. 116 (12) · pp. 5451–5460
Author(s): Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová

Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Nonrigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance and locate the associated sharp phase transitions separating learnable and nonlearnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multipurpose algorithms.
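
To make the algorithmic side concrete, below is a minimal Monte Carlo sketch of the scalar state-evolution recursion that tracks the per-iteration error of message passing, specialized to the linear channel with a soft-threshold denoiser rather than the Bayes-optimal one; the function name, the Bernoulli-Gaussian signal model, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def state_evolution(delta, eps, sigma2, theta, iters=50, mc=200_000, seed=0):
    """Iterate the scalar state-evolution recursion

        tau_{t+1}^2 = sigma2 + (1/delta) * E[(eta(X0 + tau_t * Z) - X0)^2]

    by Monte Carlo, for the linear channel y = A x0 + noise with a
    Bernoulli-Gaussian signal (X0 ~ N(0, 1) w.p. eps, else 0), a
    soft-threshold denoiser eta with threshold theta * tau_t, and
    sampling ratio delta = m / n.
    """
    rng = np.random.default_rng(seed)
    x0 = rng.standard_normal(mc) * (rng.random(mc) < eps)  # signal samples
    z = rng.standard_normal(mc)                            # effective Gaussian noise
    tau2 = sigma2 + np.mean(x0**2) / delta                 # t = 0: the estimate is zero
    for _ in range(iters):
        tau = np.sqrt(tau2)
        pseudo = x0 + tau * z                              # law of the algorithm's pseudo-data
        eta = np.sign(pseudo) * np.maximum(np.abs(pseudo) - theta * tau, 0.0)
        tau2 = sigma2 + np.mean((eta - x0)**2) / delta     # next effective noise level
    return tau2
```

Sweeping the sampling ratio delta at fixed sparsity eps and watching where the fixed point jumps from a large error to a near-zero one is one numerical way to locate the kind of sharp algorithmic phase transition the abstract describes.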

2012 · Vol. 55 (2) · pp. 327–347
Author(s): Dengke Xu, Zhongzhan Zhang, Liucang Wu

Author(s): Alfredo Braunstein, Marc Mézard

Methods and analyses from statistical physics are of use not only in studying the performance of algorithms, but also in developing efficient algorithms. Here, we consider survey propagation (SP), a new approach for solving typical instances of random constraint satisfaction problems. SP has proven successful in solving random k-satisfiability (k-SAT) and random graph q-coloring (q-COL) in the “hard SAT” region of parameter space [79, 395, 397, 412], relatively close to the SAT/UNSAT phase transition discussed in the previous chapter. In this chapter we discuss the SP equations, and suggest a theoretical framework for the method [429] that applies to a wide class of discrete constraint satisfaction problems. We propose a way of deriving the equations that sheds light on the capabilities of the algorithm, and illustrates the differences with other well-known iterative probabilistic methods. Our approach takes into account the clustered structure of the solution space described in chapter 3, and involves an additional “joker” value that variables can be assigned. Within clusters, a variable can be frozen to some value, meaning that the variable takes the same value in all solutions (satisfying assignments) within the cluster. Alternatively, it can be unfrozen, meaning that it fluctuates from solution to solution within the cluster. As we will discuss, the SP equations manage to describe these fluctuations by assigning joker values to unfrozen variables. The overall algorithmic strategy is iterative and decomposes into two elementary steps. The first step is to evaluate the marginal probabilities of frozen variables using the SP message-passing procedure. The second step, or decimation step, is to use this information to fix the values of some variables and simplify the problem. The notion of message passing will be illustrated throughout the chapter by comparing it with a simpler procedure known as belief propagation (mentioned in chapter 3 in the context of error-correcting codes), in which no assumptions are made about the structure of the solution space. The chapter is organized as follows. In section 2 we provide the general formalism, defining constraint satisfaction problems as well as the key concepts of factor graphs and cavities, using the concrete examples of satisfiability and graph coloring.
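
As a simple illustration of the message-passing step in the two-step strategy above, the following sketch implements belief propagation, the simpler procedure the chapter compares SP against, for random graph q-coloring with hard constraints; full SP would additionally track the “joker” state of unfrozen variables, which is omitted here, and all names and the damping scheme are illustrative assumptions.

```python
import random
from collections import defaultdict

def bp_coloring_marginals(edges, q, iters=100, damping=0.5):
    """Belief propagation for graph q-coloring with hard constraints.

    The directed message nu[(i, j)][c] approximates the probability that
    vertex i takes color c in the absence of edge (i, j); the update is
        nu_{i->j}(c)  proportional to  prod_{k in N(i)\\{j}} (1 - nu_{k->i}(c)).
    """
    adj = defaultdict(set)
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    # Initialize directed messages at random and normalize them.
    nu = {}
    for i in adj:
        for j in adj[i]:
            m = [random.random() for _ in range(q)]
            s = sum(m)
            nu[(i, j)] = [x / s for x in m]
    for _ in range(iters):
        for (i, j) in list(nu):
            new = []
            for c in range(q):
                p = 1.0
                for k in adj[i]:
                    if k != j:
                        p *= 1.0 - nu[(k, i)][c]  # neighbor k must avoid color c
                new.append(p)
            s = sum(new)
            if s > 0:  # skip contradictory (all-zero) messages
                nu[(i, j)] = [damping * old + (1.0 - damping) * x / s
                              for old, x in zip(nu[(i, j)], new)]
    # Single-site marginals, combining all incoming messages.
    marg = {}
    for i in adj:
        b = []
        for c in range(q):
            p = 1.0
            for k in adj[i]:
                p *= 1.0 - nu[(k, i)][c]
            b.append(p)
        s = sum(b) or 1.0
        marg[i] = [x / s for x in b]
    return marg
```

A decimation loop in the spirit of the chapter would then repeatedly fix the variable whose marginal is most biased to its most probable color, simplify the graph, and rerun the message passing, with SP marginals playing the role of these BP marginals.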


2018 · Vol. 114 (525) · pp. 358–369
Author(s): Zijian Guo, Wanjie Wang, T. Tony Cai, Hongzhe Li

2019 · Vol. 9 (1) · pp. 33–79
Author(s): Raphaël Berthier, Andrea Montanari, Phan-Minh Nguyen

Given a high-dimensional data matrix $\boldsymbol{A} \in {\mathbb{R}}^{m \times n}$, approximate message passing (AMP) algorithms construct sequences of vectors $\boldsymbol{u}^{t} \in {\mathbb{R}}^{n}$, $\boldsymbol{v}^{t} \in {\mathbb{R}}^{m}$, indexed by $t \in \{0, 1, 2, \dots\}$, by iteratively applying $\boldsymbol{A}$ or $\boldsymbol{A}^{\mathsf{T}}$ together with suitable nonlinear functions that depend on the specific application. Special instances of this approach have been developed, among other applications, for compressed sensing reconstruction, robust regression, Bayesian estimation, low-rank matrix recovery, phase retrieval, and community detection in graphs. For certain classes of random matrices $\boldsymbol{A}$, AMP admits an asymptotically exact description in the high-dimensional limit $m, n \to \infty$, which goes under the name of state evolution. Earlier work established state evolution for separable nonlinearities (under certain regularity conditions); however, empirical work demonstrated several important applications that require non-separable functions. In this paper we generalize state evolution to Lipschitz continuous non-separable nonlinearities, for Gaussian matrices $\boldsymbol{A}$. Our proof makes use of Bolthausen’s conditioning technique along with several approximation arguments. In particular, we introduce a modified algorithm (called LoAMP, for Long AMP), which is of independent interest.
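
As a concrete special case of the iteration just described, here is a short sketch of AMP for the linear model $\boldsymbol{y} = \boldsymbol{A}\boldsymbol{x}_0 + \text{noise}$ with a separable soft-threshold nonlinearity, in the style of the compressed-sensing AMP of Donoho, Maleki, and Montanari; the threshold policy and all names are illustrative assumptions. The Onsager correction term is what makes the effective noise asymptotically Gaussian, which is the mechanism behind state evolution.

```python
import numpy as np

def soft_threshold(x, t):
    """Separable denoiser eta(x; t) = sign(x) * max(|x| - t, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def amp(A, y, theta=1.0, iters=30):
    """AMP for y = A x0 + noise with a soft-threshold denoiser.

    The Onsager term (1/delta) * z * <eta'> distinguishes AMP from plain
    iterative thresholding; it is what makes the pseudo-data x + A.T @ z
    behave like x0 plus i.i.d. Gaussian noise as m, n -> inf.
    """
    m, n = A.shape
    delta = m / n
    x = np.zeros(n)
    z = y.copy()
    for _ in range(iters):
        tau = np.sqrt(np.mean(z**2))            # effective noise level estimate
        pseudo = x + A.T @ z                    # approximately x0 + tau * N(0, I)
        x = soft_threshold(pseudo, theta * tau)
        eta_prime = np.mean(np.abs(x) > 0)      # average derivative of the denoiser
        z = y - A @ x + (1.0 / delta) * z * eta_prime  # residual with Onsager term
    return x
```

With $\boldsymbol{A}$ drawn with i.i.d. $\mathcal{N}(0, 1/m)$ entries this matches the normalization under which state evolution is usually stated; replacing the separable soft threshold with a generic Lipschitz map of the whole vector is exactly the non-separable setting the paper's analysis covers.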

