Predictive Statistical Representations of Observed and Simulated Rainfall Using Generalized Linear Models

2019 ◽  
Vol 32 (11) ◽  
pp. 3409-3427
Author(s):  
Junho Yang ◽  
Mikyoung Jun ◽  
Courtney Schumacher ◽  
R. Saravanan

Abstract This study explores the feasibility of predicting subdaily variations and the climatological spatial patterns of rain in the tropical Pacific from atmospheric profiles using a set of generalized linear models: logistic regression for rain occurrence and gamma regression for rain amount. The prediction is separated into different rain types from TRMM satellite radar observations (stratiform, deep convective, and shallow convective) and CAM5 simulations (large-scale and convective). Environmental variables from MERRA-2 and CAM5 are used as predictors for TRMM and CAM5 rainfall, respectively. The statistical models are trained using environmental fields at 0000 UTC and rainfall from 0000 to 0600 UTC during 2003. The results are used to predict 2004 rain occurrence and rate for MERRA-2/TRMM and CAM5 separately. The first EOF profile of humidity and the second EOF profile of temperature contribute most to the prediction for both statistical models in each case. The logistic regression generally performs well for all rain types, but does better in the east Pacific compared to the west Pacific. The gamma regression produces reasonable geographical rain amount distributions but rain rate probability distributions are not predicted as well, suggesting the need for a different, higher-order model to predict rain rates. The results of this study suggest that statistical models applied to TRMM radar observations and MERRA-2 environmental parameters can predict the spatial patterns and amplitudes of tropical rainfall in the time-averaged sense. Comparing the observationally trained models to models that are trained using CAM5 simulations points to possible deficiencies in the convection parameterization used in CAM5.
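The two-part structure described in this abstract (logistic regression for rain occurrence, gamma regression for rain amount) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the single predictor `x` stands in for an EOF score, the log link for the gamma model is an assumption (the abstract does not state the link), and the data are synthetic, generated from made-up coefficients.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iters=25):
    """Newton's method for logit P(y=1) = b0 + b1*x (one predictor)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient of the log-likelihood
            g1 += (y - p) * x
            h00 += w               # entries of the information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def fit_gamma_log(xs, ys, iters=50):
    """Fisher scoring for a gamma GLM with log link: log E[y] = b0 + b1*x.

    With a log link the working weights are constant, so each step is an
    ordinary least-squares fit of the working response z on x.
    """
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b0, b1 = math.log(sum(ys) / n), 0.0
    for _ in range(iters):
        zs = []
        for x, y in zip(xs, ys):
            eta = b0 + b1 * x
            mu = math.exp(eta)
            zs.append(eta + (y - mu) / mu)
        zbar = sum(zs) / n
        b1 = sum((x - xbar) * (z - zbar) for x, z in zip(xs, zs)) / sxx
        b0 = zbar - b1 * xbar
    return b0, b1


def expected_rain(x, occ, amt):
    """E[rain] = P(rain occurs) * E[amount | rain occurs]."""
    return sigmoid(occ[0] + occ[1] * x) * math.exp(amt[0] + amt[1] * x)


# Synthetic data (assumed coefficients, for illustration only).
rng = random.Random(0)
xs = [rng.uniform(-2.0, 2.0) for _ in range(500)]
wet = [1 if rng.random() < sigmoid(-1.0 + 1.5 * x) else 0 for x in xs]
wet_x = [x for x, w in zip(xs, wet) if w]
wet_amt = [rng.gammavariate(2.0, math.exp(0.5 + 0.5 * x) / 2.0) for x in wet_x]

occ = fit_logistic(xs, wet)            # occurrence model, all cases
amt = fit_gamma_log(wet_x, wet_amt)    # amount model, rainy cases only
print(expected_rain(-1.0, occ, amt), expected_rain(1.0, occ, amt))
```

The amount model is fitted only on cases where rain occurred, which is why the two fits can be carried out separately and combined afterwards.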

2021 ◽  
Vol 10 (6) ◽  
pp. 1211
Author(s):  
Li-Te Lin ◽  
Kuan-Hao Tsui

The relationship between serum dehydroepiandrosterone sulphate (DHEA-S) and anti-Mullerian hormone (AMH) levels has not been fully established. Therefore, we performed a large-scale cross-sectional study to investigate the association between serum DHEA-S and AMH levels. The study included a total of 2155 infertile women aged 20 to 46 years who were divided into four quartile groups (Q1 to Q4) based on serum DHEA-S levels. We found that there was a weak positive association between serum DHEA-S and AMH levels in infertile women (r = 0.190, p < 0.001). After adjusting for potential confounders, serum DHEA-S levels positively correlated with serum AMH levels in infertile women (β = 0.103, p < 0.001). Infertile women in the highest DHEA-S quartile category (Q4) showed significantly higher serum AMH levels (p < 0.001) compared with women in the lowest DHEA-S quartile category (Q1). The serum AMH levels significantly increased across increasing DHEA-S quartile categories in infertile women (p = 0.014) using generalized linear models after adjustment for potential confounders. Our data show that serum DHEA-S levels are positively associated with serum AMH levels.
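The quartile grouping and the unadjusted correlation described above can be sketched as follows. The eight value pairs are invented for illustration and are not study data, the variable names are hypothetical, and the sketch omits the confounder adjustment that the study performed with generalized linear models.

```python
import math
import statistics

# Made-up illustrative values, not data from the study.
dheas = [80, 120, 150, 200, 230, 260, 310, 400]
amh = [1.0, 2.5, 1.8, 3.0, 2.2, 3.6, 2.9, 4.1]


def quartile_labels(values):
    """Assign each value to Q1..Q4 using the sample quartile cut points."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    labels = []
    for v in values:
        if v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels


def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


labels = quartile_labels(dheas)
group_means = {
    q: statistics.fmean(a for a, lab in zip(amh, labels) if lab == q)
    for q in ("Q1", "Q2", "Q3", "Q4")
}
r = pearson_r(dheas, amh)
print(group_means, round(r, 3))
```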


2021 ◽  
Vol 18 ◽  
pp. 163-170
Author(s):  
Lorenc Koçiu ◽  
Kledian Kodra

Using econometric models, this paper addresses the ability of Albanian Small and Medium-sized Enterprises (SMEs) to identify the risks they face. To write this paper, we studied SMEs operating in the Gjirokastra region. First, qualitative data were gathered through a questionnaire. Next, a 5-level Likert scale was used to measure them. Finally, the data were processed in the statistical software SPSS version 21 using the binary logistic regression model, which gives the probability of occurrence of an event when all independent variables are included. Logistic regression belongs to the category of statistical models called Generalized Linear Models. It is used to analyze problems in which one or more independent variables influence a dichotomous dependent variable. In such cases, the latter is treated as the random variable and depends on them. To evaluate whether Albanian SMEs can identify risks, we analyzed the factors that SMEs perceive as directly affecting the risks they face. At the end of the paper, we conclude that Albanian SMEs can identify risk


Biometrika ◽  
2021 ◽  
Author(s):  
Emre Demirkaya ◽  
Yang Feng ◽  
Pallavi Basu ◽  
Jinchi Lv

Summary Model selection is crucial both to high-dimensional learning and to inference for contemporary big data applications in pinpointing the best set of covariates among a sequence of candidate interpretable models. Most existing work assumes implicitly that the models are correctly specified or have fixed dimensionality, yet both model misspecification and high dimensionality are prevalent in practice. In this paper, we exploit the framework of model selection principles under the misspecified generalized linear models presented in Lv and Liu (2014) and investigate the asymptotic expansion of the posterior model probability in the setting of high-dimensional misspecified models. With a natural choice of prior probabilities that encourages interpretability and incorporates the Kullback–Leibler divergence, we suggest the high-dimensional generalized Bayesian information criterion with prior probability for large-scale model selection with misspecification. Our new information criterion characterizes the impacts of both model misspecification and high dimensionality on model selection. We further establish the consistency of covariance contrast matrix estimation and the model selection consistency of the new information criterion in ultra-high dimensions under some mild regularity conditions. The numerical studies demonstrate that our new method enjoys improved model selection consistency compared to its main competitors.
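For orientation, the classical Bayesian information criterion that such generalized criteria build on can be sketched as below. This is the standard Schwarz form, BIC = -2 log L + k log n, and not the criterion proposed in the paper, whose additional misspecification and prior terms are not given in the summary. The candidate names and log-likelihood values are hypothetical.

```python
import math


def bic(loglik, k, n):
    """Classical BIC: -2 * log-likelihood + (number of parameters) * log(n)."""
    return -2.0 * loglik + k * math.log(n)


def select_model(candidates, n):
    """Return the candidate name with the smallest BIC.

    candidates maps a model name to (maximized log-likelihood, parameter count).
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n))


# Hypothetical fitted models on n = 50 observations.
candidates = {
    "x1": (-120.0, 2),
    "x1+x2": (-100.0, 3),
    "x1+x2+x3": (-99.5, 4),
}
best = select_model(candidates, n=50)
print(best)
```

In this made-up example the third covariate improves the log-likelihood only slightly, so the log(n) penalty favors the two-covariate model.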


Biometrika ◽  
1986 ◽  
Vol 73 (2) ◽  
pp. 413-424 ◽  
Author(s):  
LEONARD A. STEFANSKI ◽  
RAYMOND J. CARROLL ◽  
DAVID RUPPERT

2013 ◽  
Vol 35 (1) ◽  
pp. 98
Author(s):  
Angela Radünz Lazzari

Air pollution is a risk factor for population health. Its harmful effects are observed even when atmospheric pollutants are within the limits set out in specific legislation, and they appear mainly as respiratory diseases. The aim of this study was to analyze the relationship between the concentrations of air pollutants and the incidence of respiratory diseases in the city of Porto Alegre in 2005 and 2006. Multiple linear regression, ordinal logistic regression, and generalized linear models were applied. The results show a good fit for all three techniques. The ordinal logistic regression detected only a positive influence of air temperature and relative humidity on hospitalizations for respiratory diseases. Multiple linear regression related hospitalizations negatively to meteorological variables and positively to particulate matter (PM10). The generalized linear model detected a negative influence of meteorological variables and a positive influence of the pollutants tropospheric ozone (O3) and PM10 on hospitalizations. Comparing the three statistical techniques on the same data set, it can be concluded that all of them produced a model with a good fit to the data, but the generalized linear model showed higher sensitivity in capturing the influence of pollutants than ordinal logistic regression and multiple linear regression.


2021 ◽  
Vol 10 (9) ◽  
pp. e8310917883
Author(s):  
Esttefani Duarte Brum ◽  
Gilberto Rodrigues Liska ◽  
Alisson Darós Santos

Can the time it takes a student to complete a test influence his or her performance? To answer this question, a logistic regression model was considered. In its development, evaluation was treated as a way of quantifying a student's performance, reflecting his or her degree of knowledge of a given content. For this we use records of the start and end times of each evaluation. The records of time spent were obtained from five different undergraduate classes, with subjects taught by the same teacher, with the same theoretical content, at the same university. The results confirm statistically that each additional minute the student spends taking the test implies greater odds of good performance. They also show a performance difference between female and male students which, although not statistically significant, indicates that female students have greater odds of reaching the average. According to the odds ratios, the model also confirms that student performance decreases over successive evaluations, with the best scores on the first test. From the references consulted, we understand that the difference in each student's grades is influenced by several factors that result from their own experiences.
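The "each additional minute" interpretation rests on the odds ratio of a logistic regression coefficient, which can be sketched as follows. The coefficient value 0.03 per minute and the baseline probability are hypothetical, chosen only to illustrate the arithmetic. They are not estimates from the study.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Multiplicative change in the odds for a `delta`-unit increase in x."""
    return math.exp(beta * delta)


def shifted_probability(p, beta, delta=1.0):
    """Probability after multiplying the baseline odds by the odds ratio."""
    odds = p / (1.0 - p) * odds_ratio(beta, delta)
    return odds / (1.0 + odds)


beta_minutes = 0.03   # hypothetical log-odds change per extra minute
p_base = 0.5          # hypothetical baseline probability of good performance

or_one_minute = odds_ratio(beta_minutes)
p_after_10 = shifted_probability(p_base, beta_minutes, delta=10)
print(round(or_one_minute, 4), round(p_after_10, 4))
```

Because the odds ratio acts multiplicatively on the odds, the change in probability per extra minute is not constant and depends on the baseline probability.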

