Using numerical methods to design simulations: revisiting the balancing intercept

American Journal of Epidemiology ◽

10.1093/aje/kwab264 ◽

2021 ◽

Author(s):

Sarah E Robertson ◽

Issa J Dahabreh ◽

Jon A Steingrimsson

Keyword(s):

Regression Model ◽

Logistic Model ◽

Numerical Approximation ◽

Basic Problem ◽

Random Variable ◽

Analytical Approximation ◽

Simulation Studies ◽

Linear Predictor ◽

Binary Random Variable ◽

A Value

We consider methods for generating draws of a binary random variable whose expectation conditional on covariates follows a logistic regression model with known covariate coefficients. We examine approximations for finding a “balancing intercept,” that is, a value for the intercept of the logistic model that leads to a desired marginal expectation for the binary random variable. We show that a recently proposed analytical approximation can produce inaccurate results, especially when targeting more extreme marginal expectations or when the linear predictor of the regression model has high variance. We describe and implement a numerical approximation based on Monte Carlo methods that appears to work well in practice. Our approach to the basic problem of the balancing intercept provides an example of a broadly applicable strategy for formulating and solving problems that arise in the design of simulation studies used to evaluate or teach epidemiologic methods.
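
The idea of a Monte Carlo balancing intercept can be sketched as follows. This is a minimal illustration, not the authors' implementation: the covariate distribution, the coefficients in `beta`, the target of 0.05, and the function name `balancing_intercept` are all assumptions chosen for the example. The sketch draws a large covariate sample, then uses root finding to pick the intercept at which the average predicted probability equals the target.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import expit  # logistic function 1 / (1 + exp(-x))


def balancing_intercept(linear_predictor, target, lower=-50.0, upper=50.0):
    """Find b0 such that mean(expit(b0 + linear_predictor)) equals target.

    linear_predictor holds the covariate part of the model (x @ beta)
    evaluated on a large Monte Carlo sample of covariates.
    """
    def gap(b0):
        # Difference between the Monte Carlo marginal expectation and the target
        return expit(b0 + linear_predictor).mean() - target

    # gap is increasing in b0, so a bracketing root finder is sufficient
    return brentq(gap, lower, upper)


# Hypothetical setup: three independent standard normal covariates
rng = np.random.default_rng(2021)
n_draws = 200_000
x = rng.normal(size=(n_draws, 3))
beta = np.array([1.0, -2.0, 0.5])  # assumed "known" coefficients
eta = x @ beta                     # linear predictor without the intercept

target = 0.05                      # desired marginal expectation of Y
b0 = balancing_intercept(eta, target)
achieved = expit(b0 + eta).mean()

# Generate the binary outcome with the balanced intercept
y = rng.binomial(1, expit(b0 + eta))
print(f"intercept={b0:.4f}, achieved marginal={achieved:.4f}, sample mean={y.mean():.4f}")
```

The root is exact only for the simulated covariate sample, so the accuracy of the intercept for the underlying distribution depends on the number of Monte Carlo draws.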

Inferences About the Probability of Success, Given the Value of a Covariate, Using a Nonparametric Smoother

Journal of Modern Applied Statistical Methods ◽

10.22237/jmasm/1556670240 ◽

2020 ◽

Vol 18 (1) ◽

Author(s):

Rand Wilcox

Keyword(s):

Logistic Regression ◽

Confidence Interval ◽

Regression Model ◽

Goodness Of Fit ◽

Logistic Regression Model ◽

Random Variable ◽

Goodness Of Fit Test ◽

Binary Random Variable ◽

Alternative Approach ◽

Probability Of Success

For a binary random variable Y, let p(x) = P(Y = 1 | X = x) for some covariate X. The goal of computing a confidence interval for p(x) is considered. Under the logistic regression model, even a slight departure from the model that is difficult to detect via a goodness-of-fit test can yield inaccurate results, and the accuracy of a confidence interval can deteriorate as the sample size increases. The goal is to suggest an alternative approach based on a smoother, which provides a more flexible approximation of p(x).
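
To illustrate what a smoother-based estimate of p(x) can look like, here is a minimal sketch using a Nadaraya-Watson kernel estimator with a Gaussian kernel. This is a generic technique swapped in for illustration; the abstract does not say which smoother the paper uses, and the sketch gives point estimates only, not the confidence interval procedure. The data-generating curve, the bandwidth, and the function name `kernel_smooth_probability` are assumptions.

```python
import numpy as np


def kernel_smooth_probability(x_train, y_train, x_eval, bandwidth):
    """Nadaraya-Watson estimate of P(Y = 1 | X = x) with a Gaussian kernel."""
    x_eval = np.atleast_1d(np.asarray(x_eval, dtype=float))
    # Scaled distances between evaluation and training points,
    # shape (len(x_eval), len(x_train))
    z = (x_eval[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * z**2)
    # Weighted average of the binary outcomes near each evaluation point
    return (weights * y_train).sum(axis=1) / weights.sum(axis=1)


# Hypothetical data whose true p(x) is not monotone in x, so a logistic
# model that is linear in x cannot represent it
rng = np.random.default_rng(0)
x = rng.normal(size=2000)
p_true = 1.0 / (1.0 + np.exp(-(x**2 - 1.0)))
y = rng.binomial(1, p_true)

grid = np.array([-2.0, 0.0, 2.0])
p_hat = kernel_smooth_probability(x, y, grid, bandwidth=0.3)
print(dict(zip(grid.tolist(), np.round(p_hat, 3).tolist())))
```

The bandwidth controls the trade-off between bias and variance; 0.3 is an arbitrary choice for this example.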

Modelling Assumed Metric Paired Comparison Data - Application to Learning Related Emotions

Austrian Journal of Statistics ◽

10.17713/ajs.v44i1.25 ◽

2014 ◽

Vol 44 (1) ◽

pp. 3-15 ◽

Cited By ~ 1

Author(s):

Alexandra Grand ◽

Regina Dittrich

Keyword(s):

Regression Model ◽

Beta Distribution ◽

Paired Comparison ◽

Random Variable ◽

Open Unit ◽

Student Survey ◽

Data Set ◽

Linear Predictor ◽

Data Application ◽

The Mean

In this article we suggest a beta regression model that accounts for the degree of preference in paired comparisons measured on a bounded metric paired comparison scale. The beta distribution for bounded continuous random variables assumes values in the open unit interval (0,1). However, in practice we will observe paired comparison responses that lie within a fixed or arbitrary fixed interval [-a,a] with known value of a. We therefore transform the observed responses into the interval (0,1) and assume that these transformed responses are each a realization of a random variable which follows a beta distribution. We propose a simple paired comparison regression model for beta distributed variables which allows us to model the mean of the transformed response using a linear predictor and a logit link function -- where the linear predictor is defined by the parameters of the logit-linear Bradley-Terry model. For illustration we applied the presented model to a data set obtained from a student survey of learning related emotions in mathematics.
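
The transformation step can be sketched as below. The abstract does not give the exact mapping, so this is an assumption: a linear rescaling from [-a, a] to [0, 1], followed by a commonly used "squeeze" that pulls exact 0 and 1 values into the open interval. The function name `to_unit_interval` and the sample responses are hypothetical.

```python
import numpy as np


def to_unit_interval(responses, a, squeeze=True):
    """Map responses on [-a, a] into the unit interval.

    The linear map sends -a to 0 and a to 1. Because the beta distribution
    is defined on the open interval (0, 1), the optional squeeze
    (y * (n - 1) + 0.5) / n moves boundary values strictly inside it.
    """
    y = (np.asarray(responses, dtype=float) + a) / (2.0 * a)
    if squeeze:
        n = y.size
        y = (y * (n - 1) + 0.5) / n
    return y


# Hypothetical paired comparison responses on a scale from -3 to 3
responses = np.array([-3.0, -1.5, 0.0, 2.0, 3.0])
transformed = to_unit_interval(responses, a=3.0)
print(transformed)
```

A response of 0 (no preference) maps to 0.5, so the transformed scale keeps the symmetry of the original one.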

On the Arcsecant Hyperbolic Normal Distribution. Properties, Quantile Regression Modeling and Applications

Symmetry ◽

10.3390/sym13010117 ◽

2021 ◽

Vol 13 (1) ◽

pp. 117 ◽

Cited By ~ 1

Author(s):

Mustafa Ç. Korkmaz ◽

Christophe Chesneau ◽

Zehra Sedef Korkmaz

Keyword(s):

Quantile Regression ◽

Regression Model ◽

Normal Distribution ◽

Order Statistics ◽

Random Variable ◽

Unit Interval ◽

Parametric Estimation ◽

Simulation Studies ◽

Quantile Regression Model ◽

New Distribution

This work proposes a new distribution defined on the unit interval. It is obtained by a novel transformation of a normal random variable involving the hyperbolic secant function and its inverse. The use of such a function in distribution theory has not received much attention in the literature, and may be of interest for theoretical and practical purposes. Basic statistical properties of the newly defined distribution are derived, including moments, skewness, kurtosis and order statistics. For the related model, the parametric estimation is examined through different methods. We assess the performance of the obtained estimates by two complementary simulation studies. Also, the quantile regression model based on the proposed distribution is introduced. Applications to three real datasets show that the proposed models are quite competitive in comparison to well-established models.

Modeling Proportion Data with Inflation by Using a Power-Skew-Normal/Logit Mixture Model

Mathematics ◽

10.3390/math9161989 ◽

2021 ◽

Vol 9 (16) ◽

pp. 1989

Author(s):

Guillermo Martínez-Flórez ◽

Hector W. Gomez ◽

Roger Tovar-Falón

Keyword(s):

Regression Model ◽

Logistic Model ◽

Information Matrix ◽

Link Function ◽

Skew Normal Distribution ◽

Linear Predictor ◽

Response Variable ◽

Proposed Model ◽

Flexible Model ◽

Skew Normal

Rate or proportion data are modeled by using a regression model. The considered regression model can be used for studying phenomena with a response on the (0, 1), [0, 1), (0, 1], or [0, 1] intervals. To connect the response variable with the linear predictor in the regression model, we use a logit link function, which guarantees that the obtained prediction ranges between zero and one in the cases inflated at zero or one (or both). The model is complemented with the assumption that the errors follow a power-skew-normal distribution, resulting in a very flexible model, and with a non-singular information matrix, constituting an advantage over other existing models in the literature. To explain the probability of point mass at the values zero and/or one (inflated part), we used a polytomic logistic model with covariates. The results of two illustrations showed that the proposed model is a better alternative compared to widely known models in the literature.

Estimation of Regression Parameters from Noise Multiplied Data

Journal of Privacy and Confidentiality ◽

10.29012/jpc.v4i2.622 ◽

2013 ◽

Vol 4 (2) ◽

Author(s):

Yan-Xia Lin ◽

Phillip Wise

Keyword(s):

Regression Analysis ◽

Regression Model ◽

Real Life ◽

Simulation Studies ◽

Independent Variables ◽

Multiplicative Noises ◽

Life Data ◽

Regression Parameters ◽

Data Application ◽

Real Life Data

This paper considers the scenario in which all data entries in a confidentialised unit record file are masked by multiplicative noise, regardless of whether the unit records are sensitive and regardless of whether the masked variables are dependent or independent variables in the underlying regression analysis. A technique is introduced to estimate, from the masked data, the parameters of a regression model originally fitted to the unmasked data. Several simulation studies and a real-life data application are presented.

Forecasting COVID-19 cases in Algeria using logistic growth and polynomial regression models

10.21203/rs.3.rs-223608/v1 ◽

2021 ◽

Author(s):

Mohamed LOUNIS ◽

Babu Malavika

Keyword(s):

Regression Model ◽

Logistic Model ◽

Polynomial Regression ◽

Logistic Growth ◽

The Novel ◽

Logistic Growth Model ◽

Polynomial Regression Models ◽

Using Data ◽

Novel Coronavirus ◽

Polynomial Regression Model

The novel coronavirus respiratory disease 2019 (COVID-19) has continued to spread worldwide since it emerged in Wuhan (China) in December 2019, with more than 84.4 million cases and 1.8 million deaths reported as of January 3rd, 2021. To forecast COVID-19 cases in Algeria, we used two models, the logistic growth model and the polynomial regression model, fitted to data on COVID-19 cases reported by the Algerian Ministry of Health from February 25th to December 2nd, 2020. Results showed that the polynomial regression model fitted the COVID-19 data in Algeria better than the logistic model. The first model estimated the number of cases on January 19th, 2021 at 387673 cases. This model could help the Algerian authorities in the fight against this disease.
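
The mechanics of fitting and comparing the two model types can be sketched as follows. The data here are synthetic and generated from a logistic curve with noise; they are not the Algerian case counts, so the comparison says nothing about the paper's findings. The three-parameter logistic form, the cubic degree, and all parameter values are assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import Polynomial
from scipy.optimize import curve_fit


def logistic_growth(t, capacity, rate, midpoint):
    """Three-parameter logistic growth curve for cumulative counts."""
    return capacity / (1.0 + np.exp(-rate * (t - midpoint)))


# Synthetic cumulative counts (hypothetical, for illustration only)
rng = np.random.default_rng(1)
t = np.arange(120, dtype=float)
cases = logistic_growth(t, 1000.0, 0.1, 60.0) + rng.normal(scale=5.0, size=t.size)

# Fit the logistic growth model by nonlinear least squares
start = [cases.max(), 0.05, t.mean()]
params, _ = curve_fit(logistic_growth, t, cases, p0=start)
fit_logistic = logistic_growth(t, *params)

# Fit a cubic polynomial; Polynomial.fit rescales t internally for stability
poly = Polynomial.fit(t, cases, deg=3)
fit_poly = poly(t)


def rmse(observed, fitted):
    return float(np.sqrt(np.mean((observed - fitted) ** 2)))


rmse_logistic = rmse(cases, fit_logistic)
rmse_poly = rmse(cases, fit_poly)
print(f"logistic RMSE={rmse_logistic:.2f}, cubic RMSE={rmse_poly:.2f}")
```

In-sample fit alone does not establish forecasting accuracy: a polynomial can track observed data closely and still extrapolate poorly beyond the observation window.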

Compression and Conditional Effects: A Product Term Is Essential When Using Logistic Regression to Test for Interaction

Political Science Research and Methods ◽

10.1017/psrm.2015.59 ◽

2015 ◽

Vol 4 (3) ◽

pp. 621-639 ◽

Cited By ~ 13

Author(s):

Carlisle Rainey

Keyword(s):

Logistic Regression ◽

Regression Model ◽

Logistic Regression Model ◽

Simulation Studies ◽

Political Methodology ◽

Conditional Effects ◽

Product Term

Previous research in political methodology argues that researchers do not need to include a product term in a logistic regression model to test for interaction if they suspect interaction due to compression alone. I disagree with this claim and offer analytical arguments and simulation evidence that when researchers incorrectly theorize interaction due to compression, models without a product term bias the researcher, sometimes heavily, toward finding interaction. However, simulation studies also show that models with a product term fit a broad range of non-interactive relationships surprisingly well, enabling analysts to remove most of the bias toward finding interaction by simply including a product term.
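
Compression can be seen in a small numerical sketch. The coefficients below are hypothetical and the model contains no product term, yet the effect of x on the probability scale still changes with z because the logistic curve flattens near 0 and 1. This only illustrates the phenomenon; it does not reproduce the paper's simulations.

```python
import numpy as np


def inv_logit(eta):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-eta))


# Hypothetical coefficients for logit(p) = b0 + bx * x + bz * z (no x*z term)
b0, bx, bz = -2.0, 1.0, 2.0


def effect_of_x(z):
    """Change in Pr(Y = 1) when x moves from 0 to 1, holding z fixed."""
    return inv_logit(b0 + bx * 1.0 + bz * z) - inv_logit(b0 + bx * 0.0 + bz * z)


effect_low = effect_of_x(0.0)   # baseline probability far from 0.5
effect_high = effect_of_x(1.0)  # baseline probability at 0.5
print(f"effect at z=0: {effect_low:.4f}, effect at z=1: {effect_high:.4f}")
```

The two effects differ even though the effect of x on the logit scale is the same constant bx at every value of z.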

PENGARUH LABEL HALAL TERHADAP KEPUTUSAN MASYARAKAT MEMBELI PRODUK MAKANAN DAN MINUMAN (Studi Kasus Lingkungan VI Kelurahan Nangka Binjai Utara)

AT-TAWASSUTH: Jurnal Ekonomi Islam ◽

10.30829/ajei.v5i2.8447 ◽

2020 ◽

Vol 5 (2) ◽

pp. 354

Author(s):

Raja Sakti Putra Harahap

Keyword(s):

Regression Model ◽

Quantitative Method ◽

Statistical Tests ◽

A Value ◽

Regression Formula ◽

Simple Regression Model ◽

Food And Beverage ◽

R Value ◽

Positive Effect ◽

Simple Regression

This study aims to determine the effect of the halal label on people’s decisions to buy food and beverage products. The method used is a quantitative method with a simple regression model and statistical tests run in IBM SPSS Statistics 22 for Windows. The sample consists of 70 respondents from the Neighborhood VI community of Nangka Village. The results showed that the calculated r value was 0.79, so it could be said that there was a relationship or correlation between variable X (halal label) and variable Y (the decision to buy food and beverage products). However, the t value was smaller than the t table value (0.657 < 1.668). The null hypothesis is therefore accepted and the alternative hypothesis rejected, which means that, partially, variable X does not have a significant effect on variable Y. This result was obtained using a simple regression formula, namely Y = 34.7 + 0.67X. With a regression coefficient of 0.675%, the halal label has a positive effect on decisions to buy food and beverage products.

On the Generalization Performance of a Regression Model with Imprecise Elements

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488517500313 ◽

2017 ◽

Vol 25 (05) ◽

pp. 723-740 ◽

Cited By ~ 3

Author(s):

Maria Brigida Ferraro

Keyword(s):

Linear Regression ◽

Regression Model ◽

Prediction Error ◽

Random Element ◽

Fuzzy Random Variable ◽

Random Variable ◽

Generalization Performance ◽

Bootstrap Procedure ◽

Link Functions ◽

Fuzzy Random

A linear regression model for imprecise random variables is considered. The imprecision of a random element has been formalized by means of the LR fuzzy random variable, characterized by a center, a left and a right spread. In order to avoid the non-negativity conditions the spreads are transformed by means of two invertible functions. To analyze the generalization performance of that model an appropriate prediction error is introduced, and it is estimated by means of a bootstrap procedure. Furthermore, since the choice of response transformations could affect the inferential procedures, a computational proposal is introduced for choosing from a family of parametric link functions, the Box-Cox family, the transformation parameters that minimize the prediction error of the model.

Flexible quasi-beta regression models for continuous bounded data

Statistical Modelling ◽

10.1177/1471082x18790847 ◽

2018 ◽

Vol 19 (6) ◽

pp. 617-633 ◽

Cited By ~ 1

Author(s):

Wagner H Bonat ◽

Ricardo R Petterle ◽

John Hinde ◽

Clarice GB Demétrio

Keyword(s):

Regression Model ◽

Regression Models ◽

Beta Regression ◽

Estimating Function ◽

Dispersion Parameters ◽

Linear Predictor ◽

Mean And Variance ◽

The Mean ◽

Bounded Data ◽

Beta Regression Model

We propose a flexible class of regression models for continuous bounded data based on second-moment assumptions. The mean structure is modelled by means of a link function and a linear predictor, while the mean and variance relationship has the form [Formula: see text], where [Formula: see text], [Formula: see text] and [Formula: see text] are the mean, dispersion and power parameters respectively. The models are fitted by using an estimating function approach where the quasi-score and Pearson estimating functions are employed for the estimation of the regression and dispersion parameters respectively. The flexible quasi-beta regression model can automatically adapt to the underlying bounded data distribution by the estimation of the power parameter. Furthermore, the model can easily handle data with exact zeroes and ones in a unified way and has the Bernoulli mean and variance relationship as a limiting case. The computational implementation of the proposed model is fast, relying on a simple Newton scoring algorithm. Simulation studies, using datasets generated from simplex and beta regression models show that the estimating function estimators are unbiased and consistent for the regression coefficients. We illustrate the flexibility of the quasi-beta regression model to deal with bounded data with two examples. We provide an R implementation and the datasets as supplementary materials.
