Large-scale model selection in misspecified generalized linear models

Biometrika ◽  
2021 ◽  
Author(s):  
Emre Demirkaya ◽  
Yang Feng ◽  
Pallavi Basu ◽  
Jinchi Lv

Summary Model selection is crucial both to high-dimensional learning and to inference for contemporary big data applications in pinpointing the best set of covariates among a sequence of candidate interpretable models. Most existing work assumes implicitly that the models are correctly specified or have fixed dimensionality, yet both model misspecification and high dimensionality are prevalent in practice. In this paper, we exploit the framework of model selection principles under the misspecified generalized linear models presented in Lv and Liu (2014) and investigate the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecified models. With a natural choice of prior probabilities that encourages interpretability and incorporates the Kullback–Leibler divergence, we suggest the high-dimensional generalized Bayesian information criterion with prior probability for large-scale model selection with misspecification. Our new information criterion characterizes the impacts of both model misspecification and high dimensionality on model selection. We further establish the consistency of covariance contrast matrix estimation and the model selection consistency of the new information criterion in ultra-high dimensions under some mild regularity conditions. The numerical studies demonstrate that our new method enjoys improved model selection consistency compared to its main competitors.

Entropy ◽  
2020 ◽  
Vol 22 (8) ◽  
pp. 807
Author(s):  
Xuan Cao ◽  
Kyoungjae Lee

High-dimensional variable selection is an important research topic in modern statistics. While methods using nonlocal priors have been thoroughly studied for variable selection in linear regression, the crucial high-dimensional model selection properties for nonlocal priors in generalized linear models have not been investigated. In this paper, we consider a hierarchical generalized linear regression model with the product moment nonlocal prior over coefficients and examine its properties. Under standard regularity assumptions, we establish strong model selection consistency in a high-dimensional setting, where the number of covariates is allowed to increase at a sub-exponential rate with the sample size. The Laplace approximation is implemented for computing the posterior probabilities and the shotgun stochastic search procedure is suggested for exploring the posterior space. The proposed method is validated through simulation studies and illustrated by a real data example on functional activity analysis in an fMRI study for predicting Parkinson’s disease.


2006 ◽  
Vol 21 (6) ◽  
pp. 901-917 ◽  
Author(s):  
M. D. Haunschild ◽  
S. A. Wahl ◽  
B. Freisleben ◽  
W. Wiechert

2013 ◽  
Vol 14 (2) ◽  
Author(s):  
Noor Fachrizal

Biomass such as agricultural and urban waste has enormous potential as an energy resource rather than being an environmental problem. Organic waste can be converted into energy in the form of liquid fuel, solid fuel, and syngas by pyrolysis. Pyrolysis yields a higher liquid fraction when the process is driven toward fast or flash conditions, which can be achieved with microwave heating. This research began by developing a laboratory apparatus for microwave-assisted pyrolysis of biomass and conducting preliminary experiments to show that the method can drive the process properly and safely. A commercial oven was modified into a laboratory apparatus and it works safely; initial experiments were carried out, the process yielded bio-oil and charcoal within a short time, and several parameters were obtained. Further experiments are still needed to determine the parameters in more detail. The results may be used to design a small-scale continuous model of a production system, which can then be developed into a large-scale model applicable for commercial use.


2021 ◽  
Vol 10 (6) ◽  
pp. 1211
Author(s):  
Li-Te Lin ◽  
Kuan-Hao Tsui

The relationship between serum dehydroepiandrosterone sulphate (DHEA-S) and anti-Mullerian hormone (AMH) levels has not been fully established. Therefore, we performed a large-scale cross-sectional study to investigate the association between serum DHEA-S and AMH levels. The study included a total of 2155 infertile women aged 20 to 46 years who were divided into four quartile groups (Q1 to Q4) based on serum DHEA-S levels. We found that there was a weak positive association between serum DHEA-S and AMH levels in infertile women (r = 0.190, p < 0.001). After adjusting for potential confounders, serum DHEA-S levels positively correlated with serum AMH levels in infertile women (β = 0.103, p < 0.001). Infertile women in the highest DHEA-S quartile category (Q4) showed significantly higher serum AMH levels (p < 0.001) compared with women in the lowest DHEA-S quartile category (Q1). The serum AMH levels significantly increased across increasing DHEA-S quartile categories in infertile women (p = 0.014) using generalized linear models after adjustment for potential confounders. Our data show that serum DHEA-S levels are positively associated with serum AMH levels.


1984 ◽  
Vol 106 (1) ◽  
pp. 222-228 ◽  
Author(s):  
M. L. Marziale ◽  
R. E. Mayle

An experimental investigation was conducted to examine the effect of a periodic variation in the angle of attack on heat transfer at the leading edge of a gas turbine blade. A circular cylinder was used as a large-scale model of the leading edge region. The cylinder was placed in a wind tunnel and was oscillated rotationally about its axis. The incident flow Reynolds number and the Strouhal number of oscillation were chosen to model an actual turbine condition. Incident turbulence levels up to 4.9 percent were produced by grids placed upstream of the cylinder. The transfer rate was measured using a mass transfer technique, and heat transfer rates were inferred from the results. A direct comparison of the unsteady and steady results indicates that the effect is dependent on the Strouhal number, turbulence level, and the turbulence length scale, but that the largest observed effect was only a 10 percent augmentation at the nominal stagnation position.

