Generalized orthogonal components regression for high dimensional generalized linear models

2015, Vol. 88, pp. 119-127
Author(s): Yanzhu Lin, Min Zhang, Dabao Zhang

2012, Vol. 55 (2), pp. 327-347
Author(s): Dengke Xu, Zhongzhan Zhang, Liucang Wu

2019, Vol. 116 (12), pp. 5451-5460
Author(s): Jean Barbier, Florent Krzakala, Nicolas Macris, Léo Miolane, Lenka Zdeborová

Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or “free entropy”) from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Nonrigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance and locate the associated sharp phase transitions separating learnable and nonlearnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multipurpose algorithms.
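
The setting above is easy to reproduce in simulation. The following sketch (Python with NumPy; all variable names and parameter values are illustrative assumptions, not the authors' code) draws a random-design GLM with the sign, i.e. perceptron, channel at a fixed sample ratio alpha = n/d and measures the generalization error of a naive least-squares baseline; the paper's contribution is the exact characterization of the Bayes-optimal error in this limit, which generalized approximate message-passing attains in the regions it identifies.

```python
import numpy as np

# Minimal sketch of the teacher-student setup analyzed above: a GLM with a
# random (Gaussian) data matrix and a sign (perceptron) output channel.
# All names and parameter values are illustrative assumptions.

rng = np.random.default_rng(0)

d = 500                      # signal dimension
alpha = 2.0                  # sample ratio n / d, fixed in the high-dimensional limit
n = int(alpha * d)

x_star = rng.standard_normal(d)                 # ground-truth "teacher" weights
A = rng.standard_normal((n, d)) / np.sqrt(d)    # random data matrix
y = np.sign(A @ x_star)                         # perceptron channel: y = sign(A x*)

# Naive least-squares baseline; the paper characterizes the Bayes-optimal
# error, which generalized approximate message-passing attains in the
# parameter regions the analysis identifies.
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Generalization error on fresh samples from the same teacher.
A_test = rng.standard_normal((n, d)) / np.sqrt(d)
gen_err = np.mean(np.sign(A_test @ x_hat) != np.sign(A_test @ x_star))
print(f"test misclassification rate at alpha = {alpha}: {gen_err:.3f}")
```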


Biometrika, 2021
Author(s): Emre Demirkaya, Yang Feng, Pallavi Basu, Jinchi Lv

Summary: Model selection is crucial both to high-dimensional learning and to inference for contemporary big data applications, as it pinpoints the best set of covariates among a sequence of candidate interpretable models. Most existing work implicitly assumes that the models are correctly specified or have fixed dimensionality, yet both model misspecification and high dimensionality are prevalent in practice. In this paper, we exploit the framework of model selection principles under misspecified generalized linear models presented in Lv and Liu (2014) and investigate the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecified models. With a natural choice of prior probabilities that encourages interpretability and incorporates the Kullback–Leibler divergence, we suggest the high-dimensional generalized Bayesian information criterion with prior probability for large-scale model selection with misspecification. Our new information criterion characterizes the impacts of both model misspecification and high dimensionality on model selection. We further establish the consistency of covariance contrast matrix estimation and the model selection consistency of the new information criterion in ultra-high dimensions under some mild regularity conditions. Numerical studies demonstrate that our new method enjoys improved model selection consistency over its main competitors.
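
As a concrete point of reference for the criterion described above, the sketch below (Python with NumPy and statsmodels; the data and helper names are illustrative assumptions) scores a nested sequence of candidate logistic models with the classical BIC backbone, i.e. penalty k log n. The proposed high-dimensional generalized Bayesian information criterion modifies this penalty through Kullback–Leibler-based prior probabilities to account for misspecification and high dimensionality; that adjustment is not reproduced here.

```python
import numpy as np
import statsmodels.api as sm

# Hedged sketch: scoring a nested sequence of candidate logistic (GLM) models
# with the classical BIC backbone, penalty k * log(n). The criterion proposed
# above additionally corrects for misspecification and high dimensionality via
# Kullback-Leibler-based prior probabilities; that correction is NOT
# implemented here. Data and variable names are illustrative assumptions.

rng = np.random.default_rng(1)
n, p = 400, 10
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:3] = [1.5, -2.0, 1.0]      # true model: first 3 covariates
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ beta_true)))

def bic(y, X_sub):
    """Classical BIC = -2 * log-likelihood + k * log(n) for a logistic GLM."""
    Xd = sm.add_constant(X_sub)
    res = sm.GLM(y, Xd, family=sm.families.Binomial()).fit()
    return -2.0 * res.llf + Xd.shape[1] * np.log(len(y))

# Candidate models: covariates 1..j for j = 1, ..., p.
scores = [bic(y, X[:, :j]) for j in range(1, p + 1)]
best_size = int(np.argmin(scores)) + 1
print(f"BIC-selected model size: {best_size} (true size: 3)")
```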


2017, Vol. 60 (5), pp. 1469-1486
Author(s): Mingqiu Wang, Guo-Liang Tian
