A simulation study of regression approaches for estimating risk ratios in the presence of multiple confounders

2021
Vol 18 (1)
Author(s): Kanako Fuyama, Yasuhiro Hagiwara, Yutaka Matsuyama

Abstract
Background: Risk ratio is a popular effect measure in epidemiological research. Although previous research has suggested that logistic regression may provide biased odds ratio estimates when the number of events is small and there are multiple confounders, the performance of risk ratio estimation has yet to be examined in the presence of multiple confounders.
Methods: We conducted a simulation study to evaluate the statistical performance of three regression approaches for estimating risk ratios: (1) risk ratio interpretation of logistic regression coefficients, (2) modified Poisson regression, and (3) regression standardization using logistic regression. We simulated 270 scenarios with systematically varied sample size, number of binary confounders, exposure proportion, risk ratio, and outcome proportion. Performance evaluation was based on convergence proportion, bias, standard error estimation, and confidence interval coverage.
Results: With a sample size of 2500 and an outcome proportion of 1%, both logistic regression and modified Poisson regression at times failed to converge, and the three approaches were comparably biased. As the outcome proportion or sample size increased, modified Poisson regression and regression standardization yielded unbiased risk ratio estimates with appropriate confidence intervals irrespective of the number of confounders. The risk ratio interpretation of logistic regression coefficients, by contrast, became substantially biased as the outcome proportion increased.
Conclusions: Regression approaches for estimating risk ratios should be used cautiously when the number of events is small. With an adequate number of events, risk ratios are validly estimated by modified Poisson regression and regression standardization, irrespective of the number of confounders.
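As an illustration of two of the approaches compared above, the sketch below fits a modified Poisson regression (a Poisson working model with a robust sandwich variance) and performs regression standardization from a logistic model. It is a minimal sketch using statsmodels and NumPy; the simulated data, the single binary confounder, and the coefficient values are assumptions for illustration, not the authors' simulation design.

```python
# Minimal sketch: modified Poisson regression and regression standardization
# for a risk ratio, on simulated data (not the authors' simulation design).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 5000
c = rng.binomial(1, 0.5, n)                    # one binary confounder (assumed)
x = rng.binomial(1, 0.3 + 0.2 * c)             # exposure depends on the confounder
p = 0.05 * np.exp(0.5 * x + 0.4 * c)           # true risk, log-linear in x and c
y = rng.binomial(1, p)                         # binary outcome

X = sm.add_constant(np.column_stack([x, c]))   # columns: intercept, exposure, confounder

# (2) Modified Poisson regression: Poisson working model + robust (HC0) variance.
pois = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")
rr_modified_poisson = np.exp(pois.params[1])
rr_ci = np.exp(pois.conf_int()[1])

# (3) Regression standardization from a logistic model: predict the risk for every
# subject with exposure set to 1 and to 0, then take the ratio of the mean risks.
logit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
X1 = X.copy(); X1[:, 1] = 1
X0 = X.copy(); X0[:, 1] = 0
rr_standardized = logit.predict(X1).mean() / logit.predict(X0).mean()

print(rr_modified_poisson, rr_ci, rr_standardized)
```

A confidence interval for the standardized risk ratio would additionally require the delta method or bootstrapping, which the sketch omits.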

2013
Vol 26 (5)
pp. 505
Author(s): Pedro Aguiar, Baltazar Nunes

Introduction: It is important to review the meaning of the odds ratio as a measure of effect and association, as well as the bias that arises when the odds ratio is interpreted as a risk ratio or a prevalence ratio for a frequent disease or health outcome.
Material and Methods: In a simulated cohort of 200 individuals (100 exposed and 100 non-exposed to a risk factor), we examined a first setting with a rare disease and a second setting with a more frequent disease; the risk ratios were similar in both settings. We computed the odds ratio and the relative risk by the classical approach (standard method) and, respectively, by logistic regression and Poisson regression. We then introduced a confounding variable into the cohort and computed the odds ratio and the relative risk by Mantel-Haenszel stratified analysis (standard method) and, respectively, by multiple logistic regression and multiple Poisson regression. We used 95% confidence intervals in parameter estimation, and SPSS V20 was used for the statistical analysis.
Results: For the rare disease, the odds ratio was very close to the relative risk. For the more frequent disease, the odds ratio overestimated the relative risk. In this situation, and with a confounding variable, the relative risk adjusted by Poisson regression was more valid than the odds ratio as an estimate of the risk ratio. The confidence intervals of the relative risk adjusted by Poisson regression were always wider than the Mantel-Haenszel confidence intervals.
Conclusions: The odds ratio and multiple logistic regression are valid analytic procedures in several epidemiological designs, such as case-control studies, exploratory prospective studies, and exploratory cross-sectional studies. The odds ratio should not be interpreted as a risk ratio or a prevalence ratio when the health outcome is not rare. Multiple Poisson regression should be considered as an alternative to logistic regression, especially when the aim is to estimate the effect of exposure to a specific risk factor.
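The unadjusted comparison described above can be reproduced in a few lines. The sketch below, written in Python rather than the article's SPSS workflow, simulates a cohort of 200 individuals (100 exposed, 100 unexposed) with a frequent outcome and contrasts the odds ratio from logistic regression with the risk ratio from Poisson regression with robust standard errors. The outcome probabilities are assumptions chosen to make the outcome common, not the article's values.

```python
# Sketch: odds ratio vs risk ratio in a small cohort with a frequent outcome.
# Outcome probabilities are illustrative assumptions, not the article's values.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
exposed = np.repeat([1, 0], 100)                  # 100 exposed, 100 unexposed
risk = np.where(exposed == 1, 0.60, 0.30)         # frequent outcome, true RR = 2
y = rng.binomial(1, risk)

X = sm.add_constant(exposed)

# Odds ratio from logistic regression.
or_hat = np.exp(sm.GLM(y, X, family=sm.families.Binomial()).fit().params[1])

# Risk ratio from Poisson regression with robust (sandwich) standard errors.
rr_hat = np.exp(
    sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0").params[1]
)

print(f"OR = {or_hat:.2f}, RR = {rr_hat:.2f}")    # with a common outcome, OR > RR
```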


2019
Vol 19 (1)
Author(s): Di Shu, Jessica G. Young, Sengwee Toh

Abstract
Background: Multi-center studies can generate robust and generalizable evidence, but privacy considerations and legal restrictions often make it challenging or impossible to pool individual-level data across data-contributing sites. With binary outcomes, privacy-protecting distributed algorithms to conduct logistic regression analyses have been developed. However, the risk ratio often provides a more transparent interpretation of the exposure-outcome association than the odds ratio. Modified Poisson regression has been proposed to directly estimate adjusted risk ratios and produce confidence intervals with the correct nominal coverage when individual-level data are available. There are currently no distributed regression algorithms to estimate adjusted risk ratios while avoiding pooling of individual-level data in multi-center studies.
Methods: By leveraging the Newton-Raphson procedure, we adapted the modified Poisson regression method to estimate multivariable-adjusted risk ratios using only summary-level information in multi-center studies. We developed and tested the proposed method using both simulated and real-world data examples. We compared its results with the results from the corresponding pooled individual-level data analysis.
Results: Our proposed method produced the same adjusted risk ratio estimates and standard errors as the corresponding pooled individual-level data analysis without pooling individual-level data across data-contributing sites.
Conclusions: We developed and validated a distributed modified Poisson regression algorithm for valid and privacy-protecting estimation of adjusted risk ratios and confidence intervals in multi-center studies. This method allows computation of a more interpretable measure of association for binary outcomes, along with valid construction of confidence intervals, without sharing of individual-level data.
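The sketch below illustrates the general idea of a distributed Newton-Raphson fit of the Poisson working model: each site computes only summary matrices (its score and information contributions, plus a "meat" matrix for the robust variance), which the analysis center sums to update the coefficients. This is a minimal conceptual sketch in NumPy under assumed data and function names; it is not the authors' software or their exact algorithm.

```python
# Conceptual sketch of a distributed modified Poisson fit: sites share only
# summary matrices, never individual-level data. Names and data are assumed.
import numpy as np

def site_summaries(X, y, beta):
    """Computed locally at one site: score, information, and sandwich 'meat'."""
    mu = np.exp(X @ beta)
    score = X.T @ (y - mu)                       # gradient contribution
    info = X.T @ (X * mu[:, None])               # information (Hessian) contribution
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])  # for the robust variance
    return score, info, meat

def distributed_fit(sites, n_params, n_iter=25):
    """At the analysis center: sum site summaries and take Newton-Raphson steps."""
    beta = np.zeros(n_params)
    for _ in range(n_iter):
        score = np.zeros(n_params)
        info = np.zeros((n_params, n_params))
        for X, y in sites:                       # in practice, sites transmit summaries
            s, i, _ = site_summaries(X, y, beta)
            score += s
            info += i
        beta += np.linalg.solve(info, score)     # Newton-Raphson update
    # Robust (sandwich) covariance from the final summaries.
    finals = [site_summaries(X, y, beta) for X, y in sites]
    bread = np.linalg.inv(sum(f[1] for f in finals))
    meat = sum(f[2] for f in finals)
    return beta, bread @ meat @ bread

# Toy data split across two "sites" (assumed for illustration).
rng = np.random.default_rng(2)
def make_site(n):
    x = rng.binomial(1, 0.4, n)
    c = rng.binomial(1, 0.5, n)
    X = np.column_stack([np.ones(n), x, c])
    y = rng.binomial(1, 0.05 * np.exp(0.5 * x + 0.2 * c))
    return X, y

sites = [make_site(3000), make_site(2000)]
beta, cov = distributed_fit(sites, n_params=3)
print(np.exp(beta[1]), np.sqrt(np.diag(cov)))    # adjusted RR and robust SEs
```

Because the summed score, information, and meat matrices are identical to those computed from the pooled data, the estimates and robust standard errors match the pooled individual-level analysis, which is the property the abstract reports.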


Methodology
2007
Vol 3 (3)
pp. 89-99
Author(s): A. Palmer, J.M. Losilla, J. Vives, R. Jiménez

Abstract. This simulation study compares different strategies for addressing the underestimation of standard errors in the Poisson regression model when overdispersion is present. The study analyses the importance of sample size, Poisson distribution mean, and dispersion parameter in choosing the best index or estimate. Results show that standard error (SE) estimates obtained by resampling (nonparametric bootstrap and jackknife) are the least biased, followed by the direct index based on the χ² statistic, with the so-called robust indexes in third place. Nevertheless, the inefficiency of the resampling estimates is also evident, especially in small samples.
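For concreteness, the sketch below contrasts three standard error corrections of the kind discussed above on overdispersed count data: Pearson χ²-based dispersion scaling, a robust (sandwich) variance, and a nonparametric bootstrap. It is a minimal sketch using statsmodels on simulated negative-binomial counts; the data-generating values are assumptions, and the study's own indexes and settings are not reproduced.

```python
# Sketch: three SE corrections for a Poisson model fit to overdispersed counts.
# Data-generating values are illustrative assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
X = sm.add_constant(x)
mu = np.exp(1.0 + 0.5 * x)
y = rng.negative_binomial(n=2, p=2 / (2 + mu))       # overdispersed counts with mean mu

naive = sm.GLM(y, X, family=sm.families.Poisson()).fit()
pearson = sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="X2")     # Pearson chi2 scaling
robust = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")  # sandwich variance

# Nonparametric bootstrap SE of the slope.
boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)
    boot.append(sm.GLM(y[idx], X[idx], family=sm.families.Poisson()).fit().params[1])
boot_se = np.std(boot, ddof=1)

print(naive.bse[1], pearson.bse[1], robust.bse[1], boot_se)
```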


2020
Vol 30 (Supplement_5)
Author(s): M Mahendran, G Bauer, D Lizotte, Y Zhu

Abstract
Introduction: This simulation study evaluated seven quantitative methods for their predictive accuracy for intersectionally defined subgroups. The methods were single-level regression with interaction terms, cross-classification, multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA), and four decision tree methods: classification and regression trees (CART), conditional inference trees (CTree), chi-square automatic interaction detector (CHAID), and random forest. We also evaluated how well the methods identified variables relevant to the outcome. An example analysis will be presented using data from the U.S. National Health and Nutrition Examination Survey.
Methods: The simulated datasets varied by outcome variable type (binary and continuous), input variable types, sample size, and the size and direction of effects. Predictive accuracy was evaluated using mean squared error or mean absolute percentage error. The secondary outcome was evaluated via significance and confidence interval coverage of regression terms and via variable selection by the machine learning methods.
Results: Predictive accuracy improved with increasing sample size for all methods except CART. At small sample sizes, random forest and MAIHDA generally produced the most precise predictions. Variable selection by CTree and CHAID consistently suffered from a high type 1 error rate. Although random forest performed well for prediction, its variable selection was suboptimal, as were the confidence interval coverage and power of the MAIHDA main-effects coefficients.
Discussion: From this study emerge recommendations for applying methods in quantitative intersectionality. Different methodologies are optimal for different purposes; for example, while random forest and MAIHDA performed well for prediction, they were less reliable for variable identification. In our discussion, we will work through how to select, apply, and interpret methodologies to achieve analytic goals that align with intersectionality theory.
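As a small illustration of the kind of comparison described above, the sketch below contrasts subgroup-level predictions from a single-level regression with interaction terms and a random forest, scored by mean squared error within each cross-classified (intersectionally defined) subgroup. It is a minimal sketch on simulated data using scikit-learn and pandas; the variables, effect sizes, and scoring details are assumptions and do not reproduce the study's simulation design or its NHANES example.

```python
# Sketch: subgroup-level predictive accuracy of a regression with interaction
# terms vs a random forest. Simulated data; all settings are assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(4)
n = 4000
g1 = rng.integers(0, 2, n)          # e.g., gender (binary, assumed)
g2 = rng.integers(0, 3, n)          # e.g., a 3-level social position (assumed)
g3 = rng.integers(0, 2, n)          # e.g., income group (binary, assumed)
X = np.column_stack([g1, g2, g3])
# Continuous outcome with main effects and one interaction.
y = 1.0 * g1 + 0.5 * g2 + 0.8 * g1 * g3 + rng.normal(scale=1.0, size=n)

# Single-level regression with all two-way interaction terms.
Xint = PolynomialFeatures(degree=2, interaction_only=True,
                          include_bias=False).fit_transform(X)
reg_pred = LinearRegression().fit(Xint, y).predict(Xint)

# Random forest.
rf_pred = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y).predict(X)

# Mean squared error within each cross-classified subgroup (in-sample for brevity).
df = pd.DataFrame({"g1": g1, "g2": g2, "g3": g3,
                   "reg": (y - reg_pred) ** 2, "rf": (y - rf_pred) ** 2})
print(df.groupby(["g1", "g2", "g3"])[["reg", "rf"]].mean())
```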

