Random scaling factors in Bayesian distributional regression models with an application to real estate data

Distributional structured additive regression provides a flexible framework for modelling each parameter of a potentially complex response distribution as a function of covariates. Structured additive predictors allow for an additive decomposition of covariate effects with non-linear effects and time trends, unit- or cluster-specific heterogeneity, spatial heterogeneity and complex interactions between covariates of different types. Within this framework, we present a simultaneous estimation approach for multiplicative random effects that allow for cluster-specific heterogeneity with respect to the scaling of a covariate's effect. More specifically, a possibly non-linear function f(z) of a covariate z may be scaled by a multiplicative and possibly spatially correlated cluster-specific random effect (1 + α_c). Inference is fully Bayesian and is based on highly efficient Markov chain Monte Carlo (MCMC) algorithms. We investigate the statistical properties of our approach in extensive simulation experiments for different response distributions. Furthermore, we apply the methodology to German real estate data, where we identify significant district-specific scaling factors. According to the deviance information criterion, the models incorporating these factors perform significantly better than standard models without (spatially correlated) random scaling factors.
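The multiplicative structure described in this abstract can be written compactly. The following is only a sketch in notation assumed for illustration: the predictor symbol \eta_i, the cluster index c(i) of observation i, and the unspecified remaining additive terms do not come from the abstract itself.

```latex
\eta_i = \dots + \bigl(1 + \alpha_{c(i)}\bigr)\, f(z_i) + \dots
```

Under this reading, a cluster with \alpha_c = 0 reproduces the overall effect f(z), while \alpha_c > 0 amplifies it and -1 < \alpha_c < 0 dampens it.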


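PENERAPAN MODEL SPATIAL LOGIT-NORMAL PADA SMALL AREA ESTIMATION DENGAN METODE HIERARCHICAL BAYES [Application of the Spatial Logit-Normal Model to Small Area Estimation Using the Hierarchical Bayes Method]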

Seminar Nasional Official Statistics ◽

10.34123/semnasoffstat.v2019i1.42 ◽

2020 ◽

Vol 2019 (1) ◽

pp. 59-66

Author(s):

Taly Purwa

Keyword(s):

Monte Carlo ◽

Markov Chain ◽

Small Area ◽

Small Area Estimation ◽

Random Effect ◽

Deviance Information Criterion ◽

Information Criterion ◽

Hierarchical Bayes ◽

Area Estimation ◽

Spatial Random Effect

This study applies the Spatial Logit-normal model to Small Area Estimation (SAE) to estimate the proportion of the population with a minimum calorie intake below 1,400 kcal/capita/day at the sub-district (kecamatan) level in Bali Province in 2014. This proportion is indicator 2.1.2(A) under Goal 2 of the SDGs, and the estimates are intended to measure progress and support achievement of the SDG targets at higher levels. Three SAE models with different random effect specifications are used: a model with mutually independent random effects, a model with a spatial random effect (iCAR), and a model with both types of random effect (BYM). Including a spatial random effect is expected to improve the efficiency of the estimates. Estimation uses the Hierarchical Bayes (HB) approach with Markov Chain Monte Carlo (MCMC) via the Gibbs sampling algorithm. Parameter estimates from the three models are broadly similar: only one predictor, the proportion of agricultural households, has a significant effect, and only in the independent random effect model and the BYM model. In the iCAR model, no predictor has a significant effect. Based on the Deviance Information Criterion (DIC), the best model is the BYM model. However, adding a spatial random effect alongside the independent random effect does not significantly improve the efficiency of the estimates, owing to the low spatial dependence measured by Moran's I. Visually, a map of the estimates from the best model shows no particular spatial pattern or clustering at the sub-district level.
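Moran's I, which this abstract uses as its measure of spatial dependence, can be computed directly from area values and a neighbourhood weight matrix. The sketch below uses the standard global Moran's I formula; the function name `morans_i` and the four-area toy data are hypothetical and are not taken from the study.

```python
def morans_i(values, weights):
    """Global Moran's I for area values and a square spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Sum of all spatial weights (often written S0).
    s0 = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations between pairs of areas.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    # Sum of squared deviations.
    ss = sum(d * d for d in dev)
    return (n / s0) * (cross / ss)


# Hypothetical toy data: four areas on a line; adjacent areas are neighbours.
values = [1.0, 2.0, 3.0, 4.0]
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_stat = morans_i(values, weights)
```

Values near zero indicate little spatial dependence, which is the situation the abstract reports as the reason the spatial random effect adds little.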


Bayesian multi-scale modeling for aggregated disease mapping data

Statistical Methods in Medical Research ◽

10.1177/0962280215607546 ◽

2015 ◽

Vol 26 (6) ◽

pp. 2726-2742 ◽

Cited By ~ 7

Author(s):

Mehreteab Aregay ◽

Andrew B Lawson ◽

Christel Faes ◽

Russell S Kirby

Keyword(s):

Random Effect ◽

Disease Mapping ◽

Deviance Information Criterion ◽

Simulated Data ◽

Information Criterion ◽

Modeling Framework ◽

Multi Scale ◽

Shared Random Effects ◽

Convolution Model ◽

Resolution Level

In disease mapping, a scale effect due to an aggregation of data from a finer resolution level to a coarser level is a common phenomenon. This article addresses this issue using a hierarchical Bayesian modeling framework. We propose four different multiscale models. The first two models use a shared random effect that the finer level inherits from the coarser level. The third model assumes two independent convolution models at the finer and coarser levels. The fourth model applies a convolution model at the finer level, but the relative risk at the coarser level is obtained by aggregating the estimates at the finer level. We compare the models using the deviance information criterion (DIC) and Watanabe-Akaike information criterion (WAIC) that are applied to real and simulated data. The results indicate that the models with shared random effects outperform the other models on a range of criteria.
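The deviance information criterion used for model comparison here is commonly computed as DIC = D̄ + p_D, where D̄ is the posterior mean deviance and p_D = D̄ − D(θ̄) is the effective number of parameters. The sketch below assumes a Poisson likelihood for area counts and uses the posterior mean of the area means as the plug-in estimate; the function names and toy draws are hypothetical and do not reproduce the article's models.

```python
import math


def poisson_deviance(y, mu):
    """-2 * log-likelihood of counts y under Poisson means mu."""
    loglik = sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1) for yi, mi in zip(y, mu)
    )
    return -2.0 * loglik


def dic(y, mu_draws):
    """Return (DIC, p_D) from posterior draws of the Poisson means."""
    n_draws = len(mu_draws)
    # Posterior mean deviance, averaged over MCMC draws.
    mean_dev = sum(poisson_deviance(y, mu) for mu in mu_draws) / n_draws
    # Plug-in estimate: posterior mean of each area's Poisson mean.
    mu_bar = [sum(col) / n_draws for col in zip(*mu_draws)]
    p_d = mean_dev - poisson_deviance(y, mu_bar)
    return mean_dev + p_d, p_d


# Hypothetical counts for three areas and three posterior draws of their means.
y = [3, 5, 2]
mu_draws = [
    [2.5, 5.5, 2.0],
    [3.5, 4.5, 2.2],
    [3.0, 5.0, 1.8],
]
dic_value, p_d = dic(y, mu_draws)
```

Smaller DIC values indicate a better trade-off between fit and complexity, so competing models are compared by their DIC differences.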


Use of Bayesian Markov Chain Monte Carlo Methods to Model Kuwait Medical Genetic Center Data: An Application to Down Syndrome and Mental Retardation

Mathematics ◽

10.3390/math9030248 ◽

2021 ◽

Vol 9 (3) ◽

pp. 248

Author(s):

Reem Aljarallah ◽

Samer A Kharroubi

Keyword(s):

Monte Carlo ◽

Down Syndrome ◽

Mental Retardation ◽

Markov Chain ◽

Markov Chain Monte Carlo ◽

Univariate Analysis ◽

Deviance Information Criterion ◽

Predictive Ability ◽

Information Criterion ◽

Medical Genetic

Logit, probit and complementary log-log models are the most widely used models for binary dependent variables. Conventionally, these models have been fitted in a frequentist framework. This paper aims to demonstrate how such models can be implemented relatively quickly and easily in a Bayesian framework using Gibbs sampling Markov chain Monte Carlo simulation methods in WinBUGS. We focus on the modeling and prediction of Down syndrome (DS) and mental retardation (MR) data from an observational study at Kuwait Medical Genetic Center over a 30-year period between 1979 and 2009. Modeling algorithms were used in two distinct ways: first, using three different methods at the disease level (logistic, probit and cloglog models), and second, using bivariate logistic regression to study the association between the two diseases in question. The models are compared in terms of their predictive ability via R², adjusted R², root mean square error (RMSE) and Bayesian Deviance Information Criterion (DIC). In the univariate analysis, the logistic model performed best, with R² (0.1145), adjusted R² (0.114), RMSE (0.3074) and DIC (7435.98) for DS, and R² (0.0626), adjusted R² (0.0621), RMSE (0.4676) and DIC (23120) for MR. In the bivariate case, results revealed that 7 and 8 out of the 10 selected covariates were significantly associated with DS and MR respectively, whilst none were associated with the interaction between the two outcomes. Bayesian methods are more flexible in handling complex non-standard models, and they allow model fit and complexity to be assessed straightforwardly for non-nested hierarchical models.
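The three binary-response models named in this abstract differ only in the link between the linear predictor and the event probability. The sketch below shows the standard inverse link functions; the function names are illustrative, and the code is not the WinBUGS implementation used in the paper.

```python
import math


def inv_logit(eta):
    """Logistic model: probability from linear predictor eta."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Probit model: standard normal CDF evaluated at eta."""
    return 0.5 * (1.0 + math.erf(eta / math.sqrt(2.0)))


def inv_cloglog(eta):
    """Complementary log-log model: 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


# Compare the three links over a small grid of linear predictor values.
for eta in (-2.0, 0.0, 2.0):
    print(eta, inv_logit(eta), inv_probit(eta), inv_cloglog(eta))
```

The logit and probit links are symmetric around a probability of 0.5, whereas the complementary log-log link is asymmetric, which is one reason the three models can rank differently on criteria such as DIC.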


Response transformations for random effect and variance component models

Statistical Modelling ◽

10.1177/1471082x20966919 ◽

2020 ◽

pp. 1471082X2096691

Author(s):

Amani Almohaimeed ◽

Jochen Einbeck

Keyword(s):

Maximum Likelihood ◽

Random Effects ◽

Mixed Model ◽

Linear Mixed Model ◽

Random Effect ◽

Statistical Technique ◽

Response Distribution ◽

Level Data ◽

Variance Component Models ◽

Response Transformation

Random effect models have been a mainstream statistical technique for several decades, and the same can be said for response transformation models such as the Box–Cox transformation. The latter aims at ensuring that the assumptions of normality and of homoscedasticity of the response distribution are fulfilled, which are essential conditions for inference based on a linear model or a linear mixed model. However, methodology for response transformation with simultaneous inclusion of random effects has rarely been developed and implemented, and is so far restricted to Gaussian random effects. We develop such methodology without requiring parametric assumptions on the distribution of the random effects. This is achieved by extending the 'Nonparametric Maximum Likelihood' technique towards a 'Nonparametric profile maximum likelihood' technique, which makes it possible to deal with overdispersion as well as two-level data scenarios.
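For reference, the Box–Cox transformation mentioned in this abstract has a simple closed form. The sketch below implements only the standard transformation of a single positive response value; it does not attempt the nonparametric profile maximum likelihood estimation that the paper develops, and the function name `box_cox` is illustrative.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with power parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        # The limiting case as lam -> 0 is the natural logarithm.
        return math.log(y)
    return (y ** lam - 1.0) / lam


# lam = 1 only shifts the data; lam = 0.5 is a square-root-type transform.
transformed = [box_cox(4.0, lam) for lam in (0.0, 0.5, 1.0)]
```

In practice the power parameter is estimated from the data rather than fixed in advance.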


PSXV-9 Transgenerational epigenetic variance for production and reproduction traits in maternal-line pigs

Journal of Animal Science ◽

10.1093/jas/skab235.472 ◽

2021 ◽

Vol 99 (Supplement_3) ◽

pp. 258-259

Author(s):

Jason R Graham ◽

Jay S Johnson ◽

Andre C Araujo ◽

Jeremy T Howard ◽

Luiz F Brito

Keyword(s):

Additive Genetic Variance ◽

Phenotypic Expression ◽

Deviance Information Criterion ◽

Information Criterion ◽

Pedigree Information ◽

Relationship Matrix ◽

Maternal Line ◽

Genetic Heritability ◽

Reproduction Traits ◽

Search Approach

Modeling epigenetic factors impacting phenotypic expression of economically important traits has become a hot topic in the field of animal breeding due to the variability in genetic expression caused by environmental stressors (e.g., heat stress). This variability may be due, in part, to in-utero epigenomic remodeling, which has been reported to be passed from parent to offspring. We aimed to estimate transgenerational epigenetic variance for various production and reproduction traits measured in a maternal-line pig population, using a Bayesian approach. The phenotypes for production [n = 10,862; i.e., weaning weight (WW), birth weight (BW) and ultrasound-backfat thickness (BF)] and reproduction [n = 5,235; i.e., number of piglets born alive (NBA) and total number of piglets born (TB)] traits from a purebred Landrace population were provided by Smithfield Premium Genetics (NC, USA). The pedigree information traced back 10 generations. Single-trait genetic analyses were performed using mixed models that included additive genetic, common environmental, and epigenetic random effects. The Gibbs sampler algorithm based on Markov chain Monte Carlo was used to estimate the variance components. The epigenetic relationship matrix was constructed using a recursive parameter (λ) related to the transmissibility coefficient of epigenetic markers. A grid search approach was used to define the optimal λ value (λ values ranged from 0.1 to 0.5, with an interval of 0.1). The optimal λ value was determined based on the deviance information criterion, and it was used to estimate the additive and epigenetic variances. For instance, based on preliminary results, the optimal λ value estimated for TB was 0.3 with an additive genetic variance of 0.94 (0.19 PSD) and epigenetic variance of 0.67 (0.18 PSD). The additive genetic heritability was 0.076 (0.015 PSD) and the estimated epigenetic heritability was 0.053 (0.015 PSD). This preliminary result suggests that epigenetics contribute to the non-Mendelian variability in pigs.
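The grid search over λ described in this abstract can be sketched as a loop that fits the model at each grid value and keeps the value with the smallest DIC. In the sketch below, `select_lambda` and `toy_dic` are hypothetical names; `toy_dic` is an arbitrary placeholder curve standing in for a full Gibbs-sampler fit and does not represent any result from the study.

```python
def select_lambda(dic_for_lambda, grid):
    """Return the (lambda, DIC) pair with the smallest DIC on the grid."""
    scored = [(lam, dic_for_lambda(lam)) for lam in grid]
    return min(scored, key=lambda pair: pair[1])


# Grid from 0.1 to 0.5 in steps of 0.1, as stated in the abstract.
grid = [round(0.1 * k, 1) for k in range(1, 6)]


def toy_dic(lam):
    # Hypothetical stand-in for "fit the mixed model with this lambda and
    # return its DIC"; the shape and minimum of this curve are arbitrary.
    return 100.0 + (lam - 0.2) ** 2


best_lambda, best_dic = select_lambda(toy_dic, grid)
```

In a real analysis, each call to the DIC function would involve building the epigenetic relationship matrix for that λ and running the sampler, so the grid is kept coarse to limit computation.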


Second-hand housing batch evaluation model of Zhengzhou City based on big data and MGWR model

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210917 ◽

2021 ◽

pp. 1-20

Author(s):

Chaojie Liu ◽

Jie Lu ◽

Wenjing Fu ◽

Zhuoyi Zhou

Keyword(s):

Real Estate ◽

Geographically Weighted Regression ◽

Euclidean Distance ◽

Evaluation Model ◽

Housing Prices ◽

Information Criterion ◽

Access Point ◽

Least Square ◽

Weighted Regression ◽

Spatial Error

How to better evaluate the value of urban real estate is a major issue in the reform of the real estate tax system, so establishing an accurate and efficient batch evaluation model for housing is crucial. In this paper, second-hand housing transaction data for Zhengzhou City from 2010 to 2019 were used to model housing prices and explanatory variables with Ordinary Least Squares (OLS), the Spatial Error Model (SEM), Geographically Weighted Regression (GWR), Geographically and Temporally Weighted Regression (GTWR), and Multiscale Geographically Weighted Regression (MGWR). A correction method based on Barrier Line and Access Point (BLAAP) was constructed and compared with three previously studied correction methods: Buffer Area (BA), Euclidean Distance (ED), and Non-Euclidean Distance, Travel Distance (ND, TT). The results showed that the fitting degree of GWR, MGWR and GTWR with BLAAP was 0.03–0.07 higher than with ND. With BLAAP, MGWR had the highest fitting degree (0.883) and the smallest Akaike Information Criterion (AIC), and 88.3% of the second-hand housing data could be well interpreted by the model.


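Utilização da modelagem inteiramente bayesiana na detecção de padrões de variação de risco relativo de mortalidade infantil no Rio Grande do Sul, Brasil [Use of fully Bayesian modelling to detect patterns of variation in the relative risk of infant mortality in Rio Grande do Sul, Brazil]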

Cadernos de Saúde Pública ◽

10.1590/s0102-311x2009000700008 ◽

2009 ◽

Vol 25 (7) ◽

pp. 1501-1510 ◽

Cited By ~ 6

Author(s):

Sérgio Kakuta Kato ◽

Diego de Matos Vieira ◽

Jandyra Maria Guimarães Fachel

Keyword(s):

Deviance Information Criterion ◽

Rio Grande ◽

Information Criterion ◽

Rio Grande Do Sul ◽

Standardised Mortality Ratio ◽

Mortality Ratio

This article analyses factors possibly associated with infant mortality in the 496 municipalities of Rio Grande do Sul, Brazil, based on data accumulated from 2001 to 2004. The analysis uses regression with fully Bayesian modelling as an alternative for overcoming spatial autocorrelation and the instability of classical estimators such as the crude rate and the SMR (Standardised Mortality Ratio). Different specifications of the spatial component and of the covariates were compared; the covariates came from the blocks of the Socioeconomic Development Index of the Fundação de Economia e Estatística (IDESE/FEE-2003). The model that uses the spatial structure together with the education covariate performed best under the DIC (Deviance Information Criterion). Comparing the SMR estimates with the relative risks obtained from the fully Bayesian modelling showed a substantial gain in interpreting and detecting patterns of variation in infant mortality risk across the municipalities of Rio Grande do Sul. The Serra Gaúcha region stood out with low relative risk and very homogeneous estimates.
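The SMR referred to in this abstract is the ratio of observed to expected deaths, where the expected count comes from applying reference rates to the local population. The sketch below assumes indirect standardisation over strata; the function names and the toy numbers are hypothetical and are not data from the study.

```python
def expected_deaths(populations, reference_rates):
    """Expected deaths: stratum populations times reference death rates."""
    return sum(p * r for p, r in zip(populations, reference_rates))


def smr(observed, expected):
    """Standardised Mortality Ratio: observed over expected deaths."""
    if expected <= 0:
        raise ValueError("expected deaths must be positive")
    return observed / expected


# Hypothetical municipality with two strata and 60 observed deaths.
expected = expected_deaths([1000, 2000], [0.01, 0.02])
ratio = smr(60, expected)
```

When the expected count is small, a change of one or two deaths moves this ratio substantially, which is the instability that the Bayesian relative risk estimates are meant to smooth.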


Understanding historical summer flounder (Paralichthys dentatus) abundance patterns through the incorporation of oceanography-dependent vital rates in Bayesian hierarchical models

Canadian Journal of Fisheries and Aquatic Sciences ◽

10.1139/cjfas-2018-0092 ◽

2019 ◽

Vol 76 (8) ◽

pp. 1275-1294 ◽

Cited By ~ 3

Author(s):

Cecilia A. O’Leary ◽

Timothy J. Miller ◽

James T. Thorson ◽

Janet A. Nye

Keyword(s):

Population Dynamics ◽

Gulf Stream ◽

Stock Assessment ◽

Deviance Information Criterion ◽

Information Criterion ◽

Natural Mortality ◽

Vital Rates ◽

Summer Flounder ◽

Paralichthys Dentatus ◽

Abundance Patterns

Climate can impact fish population dynamics through changes in productivity and shifts in distribution, and both responses have been observed for many fish species. However, few studies have incorporated climate into population dynamics or stock assessment models. This study aimed to uncover how past variations in population vital rates and fishing pressure account for observed abundance variation in summer flounder (Paralichthys dentatus). The influences of the Gulf Stream Index, an index of climate variability in the Northwest Atlantic, on abundance were explored through natural mortality and stock–recruitment relationships in age-structured hierarchical Bayesian models. Posterior predictive loss and deviance information criterion indicated that out of tested models, the best estimates of summer flounder abundances resulted from the climate-dependent natural mortality model that included log-quadratic responses to the Gulf Stream Index. This climate-linked population model demonstrates the role of climate responses in observed abundance patterns and emphasizes the complexities of environmental effects on populations beyond simple correlations. This approach highlights the importance of modeling the combined effect of fishing and climate simultaneously to understand population dynamics.
