Beta-binomial models for meta-analysis with binary outcomes: Variations, extensions, and additional insights from econometrics

Research Methods in Medicine & Health Sciences ◽

10.1177/2632084321996225 ◽

2021 ◽

pp. 263208432199622

Author(s):

Tim Mathes ◽

Oliver Kuss

Keyword(s):

Simulation Study ◽

Count Data ◽

Negative Binomial ◽

Meta Analysis ◽

Negative Binomial Regression ◽

Binary Outcomes ◽

Small Scale ◽

Panel Count Data ◽

Count Data Models ◽

Meta Analyses

Background: Meta-analysis of systematically reviewed studies on interventions is the cornerstone of evidence-based medicine. We introduce the common-beta beta-binomial (BB) model for meta-analysis with binary outcomes and elucidate its equivalence to panel count data models. Methods: We present a variation of the standard “common-rho” BB model (BBST) for meta-analysis, namely a “common-beta” BB model. This model has an interesting connection to fixed-effect negative binomial regression models (FE-NegBin) for panel count data. Using this equivalence, it is possible to estimate an extension of the FE-NegBin with an additional multiplicative overdispersion term (RE-NegBin) while preserving a closed-form likelihood. An advantage of the connection to econometric models is that the models can be implemented easily, because “standard” statistical software for panel count data can be used. We illustrate the methods with two real-world example datasets. Furthermore, we show the results of a small-scale simulation study that compares the new models to the BBST. The input parameters of the simulation were informed by meta-analyses that were actually performed. Results: In both example datasets, the NegBin models, in particular the RE-NegBin, showed a smaller effect and had narrower 95% confidence intervals. In our simulation study, median bias was negligible for all methods, but the upper quartile for median bias suggested that BBST is most affected by positive bias. Regarding coverage probability, BBST and the RE-NegBin model outperformed the FE-NegBin model. Conclusion: For meta-analyses with binary outcomes, the considered common-beta BB models may be valuable extensions to the family of BB models.
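
For orientation, the beta-binomial distribution that underlies BB models can be written down directly. The following is a minimal Python sketch of its probability mass function only; it is not the authors' common-beta or common-rho estimation procedure, and the names `log_beta`, `beta_binomial_pmf`, `a`, and `b` are assumptions made for this illustration.

```python
import math

def log_beta(x: float, y: float) -> float:
    """Logarithm of the beta function B(x, y), computed via log-gamma."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_binomial_pmf(k: int, n: int, a: float, b: float) -> float:
    """P(K = k) when K | p ~ Binomial(n, p) and p ~ Beta(a, b).

    a and b are hypothetical names for the two beta shape parameters.
    """
    if not 0 <= k <= n:
        return 0.0
    log_choose = math.log(math.comb(n, k))
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

# Probabilities for all possible event counts in a study arm of size 10.
probs = [beta_binomial_pmf(k, 10, 2.0, 5.0) for k in range(11)]
print(sum(probs))  # should be very close to 1
```

With `a = b = 1` the event probability is uniform on (0, 1), so every count from 0 to n is equally likely; this gives a quick sanity check of the function.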

Statistical models for analyzing count data: predictors of length of stay among HIV patients in Portugal using a multilevel model

BMC Health Services Research ◽

10.1186/s12913-021-06389-1 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Ahmed Nabil Shaaban ◽

Bárbara Peleteiro ◽

Maria Rosario O. Martins

Keyword(s):

Length Of Stay ◽

Regression Model ◽

Random Effects ◽

Count Data ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Comprehensive Approach ◽

Negative Binomial Regression Model ◽

Hiv Patients ◽

Binomial Regression

Abstract Background: This study offers a comprehensive approach to analyzing the complexly distributed length of stay among HIV admissions in Portugal. Objective: To illustrate statistical techniques for analysing count data using longitudinal predictors of length of stay among HIV hospitalizations in Portugal. Method: A total of 26,505 discharges registered in Portuguese National Health Service (NHS) facilities between January 2009 and December 2017, classified under the Major Diagnostic Category (MDC) created for patients with HIV infection and with HIV/AIDS as a main or secondary cause of admission, were used to predict length of stay among HIV hospitalizations in Portugal. Several strategies were applied to select the best-fitting count model among the Poisson regression model, the zero-inflated Poisson model, the negative binomial regression model, and the zero-inflated negative binomial regression model. A random hospital effects term was incorporated into the negative binomial model to examine the dependence between observations within the same hospital. A multivariable analysis was performed to assess the effect of covariates on length of stay. Results: The median length of stay in our study was 11 days (interquartile range: 6–22). Statistical comparisons among the count models revealed that the random-effects negative binomial models provided the best fit to the observed data. Admissions among males and admissions associated with TB infection, pneumocystis, cytomegalovirus, candidiasis, toxoplasmosis, or mycobacterium disease exhibited a highly significant increase in length of stay. Perfect trends were observed in which a higher number of diagnoses or procedures led to a significantly higher length of stay. The random-effects term included in our model, which refers to unexplained factors specific to each hospital, revealed obvious differences in quality among the hospitals included in our study. Conclusions: This study provides a comprehensive approach to addressing the unique problems associated with the prediction of length of stay among HIV patients in Portugal.

A simulation study for count data models under varying degrees of outliers and zeros

Communications in Statistics - Simulation and Computation ◽

10.1080/03610918.2018.1498886 ◽

2018 ◽

Vol 49 (4) ◽

pp. 1078-1088 ◽

Cited By ~ 2

Author(s):

Fatih Tüzen ◽

Semra Erbaş ◽

Hülya Olmuş

Keyword(s):

Simulation Study ◽

Count Data ◽

Data Models ◽

Count Data Models

Managing Inflation

Crime & Delinquency ◽

10.1177/0011128716679796 ◽

2016 ◽

Vol 63 (1) ◽

pp. 77-87 ◽

Cited By ~ 5

Author(s):

William H. Fisher ◽

Stephanie W. Hartwell ◽

Xiaogang Deng

Keyword(s):

Count Data ◽

Regression Models ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Excess Zeros ◽

Binomial Regression ◽

Negative Binomial Models ◽

Using Data ◽

Over Dispersion ◽

Binomial Models

Poisson and negative binomial regression procedures have proliferated, and now are available in virtually all statistical packages. Along with the regression procedures themselves are procedures for addressing issues related to the over-dispersion and excessive zeros commonly observed in count data. These approaches, zero-inflated Poisson and zero-inflated negative binomial models, use logit or probit models for the “excess” zeros and count regression models for the counted data. Although these models are often appropriate on statistical grounds, their interpretation may prove substantively difficult. This article explores this dilemma, using data from a study of individuals released from facilities maintained by the Massachusetts Department of Correction.
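
The two-part structure described in this abstract can be made concrete with the zero-inflated Poisson probability mass function. The sketch below is a minimal Python illustration under simplifying assumptions: `pi_zero` (the probability of an "excess" zero) and `lam` (the Poisson mean) are fixed numbers with hypothetical names, whereas in a regression they would come from a logit or probit model and a count regression model respectively.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of observing k events with mean lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def zip_pmf(k: int, lam: float, pi_zero: float) -> float:
    """Zero-inflated Poisson probability of observing k events.

    With probability pi_zero the outcome is an "excess" zero;
    otherwise it is drawn from Poisson(lam), which can also yield zero.
    """
    from_count_part = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + from_count_part if k == 0 else from_count_part

# Illustrative values only: 30% excess zeros, Poisson mean of 2.
print(zip_pmf(0, 2.0, 0.3), zip_pmf(1, 2.0, 0.3))
```

Note that an observed zero can come from either component, so the two sets of regression coefficients describe different processes.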

Analysis of Road Traffic Accidents in the Punjab by Using Panel Count Data Models

STATISTICS, COMPUTING AND INTERDISCIPLINARY RESEARCH ◽

10.52700/scir.v3i1.23 ◽

2021 ◽

Vol 3 (1) ◽

pp. 1-13

Author(s):

Muhammad Anus Hayat Khan ◽

Ijaz Hussain

Keyword(s):

Count Data ◽

Data Model ◽

Road Safety ◽

Traffic Accidents ◽

Road Traffic ◽

Data Models ◽

Panel Count Data ◽

Road Traffic Accidents ◽

Count Data Models ◽

Count Data Model

Each year more than three thousand people die or are seriously injured in traffic accidents. Count data models provide more precise tools for planners and decision makers to conduct proactive road safety planning. We conduct an exploratory analysis of Road Traffic Accidents (RTAs) and examine the factors affecting RTA frequency in 36 districts of the Punjab over a period of three years (July 1, 2013 to June 30, 2016) with monthly data, using panel count data models. Among the models considered, the random parameters Poisson panel count data model is found to fit the data best. The exploratory analysis shows that densely populated districts with large numbers of registered vehicles have more accidents than sparsely populated districts. Most of the variables used to explain the variation in RTA counts are found to play a vital role and are highly significant. The application of regression analysis and modeling of RTAs at the district level in Punjab will help to identify districts with high RTA rates, which could support more efficient road safety management in the Punjab.

Interval estimation of the overall treatment effect in random-effects meta-analyses: Recommendations from a simulation study comparing frequentist, Bayesian, and bootstrap methods

10.31219/osf.io/5zbh6 ◽

2020 ◽

Author(s):

Frank Weber ◽

Guido Knapp ◽

Anne Glass ◽

Günther Kundt ◽

Katja Ickstadt

Keyword(s):

Literature Review ◽

Random Effects ◽

Simulation Study ◽

Treatment Effect ◽

Profile Likelihood ◽

Meta Analysis ◽

Artery Bypass ◽

Interval Length ◽

Interval Estimators ◽

Meta Analyses

There exists a variety of interval estimators for the overall treatment effect in a random-effects meta-analysis. A recent literature review summarizing existing methods suggested that in most situations, the Hartung-Knapp/Sidik-Jonkman (HKSJ) method was preferable. However, a quantitative comparison of those methods in a common simulation study is still lacking. Thus, we conduct such a simulation study for continuous and binary outcomes, focusing on the medical field for application. Based on the literature review and some new theoretical considerations, a practicable number of interval estimators is selected for this comparison: the classical normal-approximation interval using the DerSimonian-Laird heterogeneity estimator, the HKSJ interval using either the Paule-Mandel or the Sidik-Jonkman heterogeneity estimator, the Skovgaard higher-order profile likelihood interval, a parametric bootstrap interval, and a Bayesian interval using different priors. We evaluate the performance measures (coverage and interval length) at specific points in the parameter space, i.e. not averaging over a prior distribution. In this sense, our study is conducted from a frequentist point of view. We confirm the main finding of the literature review, the general recommendation of the HKSJ method (here with the Sidik-Jonkman heterogeneity estimator). For meta-analyses including only 2 studies, the high length of the HKSJ interval limits its practical usage. In this case, the Bayesian interval using a weakly informative prior for the heterogeneity may help. Our recommendations are illustrated using a real-world meta-analysis dealing with the efficacy of an intramyocardial bone marrow stem cell transplantation during coronary artery bypass grafting.
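
Of the estimators listed in this abstract, the classical normal-approximation interval with the DerSimonian-Laird heterogeneity estimator is the simplest to write out. The following Python sketch shows the textbook moment-based computation; it is not the authors' simulation code, the function name `dersimonian_laird_interval` is an assumption, and the input numbers are made up for illustration. The HKSJ, profile likelihood, bootstrap, and Bayesian intervals are not implemented here.

```python
import math
from statistics import NormalDist

def dersimonian_laird_interval(effects, variances, level=0.95):
    """Return (pooled estimate, tau^2, lower, upper) for k >= 2 studies.

    effects:   study-level effect estimates
    variances: their within-study variances
    """
    k = len(effects)
    if k < 2:
        raise ValueError("at least two studies are required")
    w = [1.0 / v for v in variances]  # fixed-effect weights
    sum_w = sum(w)
    mu_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    # Cochran's Q and the moment estimator of the between-study variance.
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # Random-effects weights, pooled estimate, and standard error.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # normal quantile
    return mu, tau2, mu - z * se, mu + z * se

# Made-up effect estimates and variances for three hypothetical studies.
print(dersimonian_laird_interval([0.2, 0.5, 0.9], [0.04, 0.09, 0.05]))
```

The interval relies on a normal quantile and treats the estimated heterogeneity as known, which is the feature the alternative estimators in the abstract are designed to improve on.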

A Combined PLS and Negative Binomial Regression Model for Inferring Association Networks from Next-Generation Sequencing Count Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2017.2665495 ◽

2018 ◽

Vol 15 (3) ◽

pp. 760-773 ◽

Cited By ~ 2

Author(s):

Maiju Pesonen ◽

Jaakko Nevalainen ◽

Steven Potter ◽

Somnath Datta ◽

Susmita Datta

Keyword(s):

Next Generation Sequencing ◽

Regression Model ◽

Count Data ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Negative Binomial Regression Model ◽

Next Generation ◽

Binomial Regression ◽

Generation Sequencing

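PEMODELAN DENGAN GEOGRAPHICALLY WEIGHTED NEGATIVE BINOMIAL REGRESSION (Studi kasus: Banyaknya Penderita Kusta di Jawa Barat) [Modeling with Geographically Weighted Negative Binomial Regression (Case Study: The Number of Leprosy Patients in West Java)]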

Xplore Journal of Statistics ◽

10.29244/xplore.v10i3.833 ◽

2021 ◽

Vol 10 (3) ◽

pp. 226-236

Author(s):

Khusnul Khotimah ◽

Itasia Dina Sulvianti ◽

Pika Silvianti

Keyword(s):

Regression Model ◽

Count Data ◽

Poisson Regression ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Kernel Weight ◽

Negative Binomial Regression Model ◽

West Java ◽

Binomial Regression ◽

Spatial Heterogeneity

The number of leprosy patients in West Java is an example of count data. The analysis commonly used for count data is Poisson regression. This research determines the variables that influence the number of leprosy patients in West Java. The data used are the numbers of leprosy patients in West Java in 2019. These data exhibit overdispersion and spatial heterogeneity. To handle overdispersion, the negative binomial regression model can be employed, while spatial heterogeneity is addressed by adding an adaptive bisquare kernel weight. The resulting Geographically Weighted Negative Binomial Regression (GWNBR) model with adaptive bisquare kernel weighting classifies the regencies/cities in West Java into ten groups based on the variables that significantly influence the number of leprosy patients. In general, the percentage of households with Clean and Healthy Behavior (PHBS) has a significant effect in all regencies/cities in West Java. For Bogor Regency, Depok City, Bogor City, and Pangandaran Regency in particular, the percentage of people living in poverty does not have a significant effect on the number of leprosy patients.
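
The overdispersion mentioned in this abstract can be checked informally by comparing the sample variance of the counts with their mean, since a Poisson model implies the two are equal. The Python sketch below is a rough moment-based diagnostic only, not the GWNBR model; it assumes the common negative binomial variance function Var = mu + alpha * mu^2, and the function name and the counts are made up for illustration.

```python
from statistics import mean, variance

def dispersion_summary(counts):
    """Return (mean, sample variance, variance/mean ratio, moment estimate of alpha).

    alpha is the hypothetical overdispersion parameter in Var = mu + alpha * mu**2;
    it is truncated at zero when the data are not overdispersed.
    """
    m = float(mean(counts))
    if m <= 0:
        raise ValueError("mean count must be positive")
    v = float(variance(counts))
    ratio = v / m  # about 1 under a Poisson model
    alpha = max(0.0, (v - m) / m ** 2)
    return m, v, ratio, alpha

# Made-up counts for ten hypothetical districts (not data from the study).
counts = [0, 1, 0, 3, 12, 2, 0, 7, 1, 4]
print(dispersion_summary(counts))
```

A variance-to-mean ratio well above 1 is a common informal signal that a negative binomial model may be worth considering instead of Poisson regression.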

Semiparametric methods for regression analysis of panel count data and mixed panel count data

10.32469/10355/63796 ◽

2017 ◽

Author(s):

Guanglei Yu

Keyword(s):

Regression Analysis ◽

Simulation Study ◽

Count Data ◽

Recurrent Events ◽

Recurrent Event ◽

Mixed Data ◽

Panel Count Data ◽

Event Data ◽

Recurrent Event Data ◽

Recurrent Event Process

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Recurrent event data and panel count data are two common types of data that have been studied extensively in event history studies in the literature. By recurrent event data, we mean that subjects are observed continuously in the follow-up study and thus occurrence times of recurrent events of interest are available. For panel count data, subjects are monitored periodically at discrete observation times and thus only numbers of recurrent events between two subsequent observations are recorded. In addition, one may face mixed panel count data in practice, which are a mixture of recurrent event data and panel count data. They arise when each study subject may be observed continuously during the whole study period, continuously over some study periods and at some time points otherwise, or only at some discrete time points. That is, these mixed data provide complete or incomplete information on the recurrent event process over different time periods for different subjects. It is well known that in panel count data, the observation process may carry information on the underlying recurrent event process and the censoring may also be dependent in practice. Under such circumstances, the first part of this dissertation discusses regression analysis of panel count data with informative observations and drop-outs. For this problem, a general means model is presented that allows both additive and multiplicative effects of covariates on the underlying recurrent event process. In addition, the proportional rates model and the accelerated failure time model are employed to describe the covariate effects on the observation process and the dropout or follow-up process, respectively. For estimation of regression parameters, some estimating equation-based procedures are developed and the asymptotic properties of the proposed estimators are established.
In addition, a resampling approach is proposed for the estimation of the covariance matrix of the proposed estimator and a model checking procedure is also provided. The results from an extensive simulation study indicate that the proposed methodology works well in practical situations, and it is applied to a motivating set of real data from the Childhood Cancer Survivor Study (CCSS) given in Section 1.1.2.2. In the second part of this dissertation, we consider regression analysis of mixed panel count data. One major problem in statistical inference on the mixed data is how to combine these two different types of data structures. Since panel count data can be viewed as interval-censored recurrent event data with exact occurrence times of events of interest unobserved or missing, they may be augmented by filling in those missing data by imputation. The mixed data can then be converted to recurrent event data, on which existing statistical inference methods can be easily implemented. Motivated by this, a multiple imputation-based estimation approach is proposed. A simulation study is conducted to study the finite-sample properties of the proposed methodology, and it shows that the proposed method is more efficient than the existing method. An illustrative example from the CCSS is also provided. The third part of this dissertation still considers regression analysis of mixed panel count data but in the presence of a dependent terminal event, which precludes further occurrence of either recurrent events of interest or observations. For this problem, we present a marginal modeling approach which acknowledges the fact that there will be no more recurrent events after the terminal event and leaves the correlation structure unspecified. To estimate the parameters of interest, an estimating equation-based procedure is developed and the inverse probability of survival weighting technique is used.
Asymptotic properties of the proposed estimators are also established and finite-sample properties are assessed in a simulation study. We again apply the proposed methodology to the CCSS. In the last part of this dissertation, we discuss some directions for future research.

Mixed-Effects Negative Binomial Regression with Interval Censoring: A Simulation Study and Application to Aridity and All-Cause Mortality Among Black South Africans Over 1997–2013

10.1007/978-3-030-72437-5_17 ◽

2021 ◽

pp. 381-413

Author(s):

Christian M. Landon ◽

Robert H. Lyles ◽

Noah C. Scovronick ◽

Azar M. Abadi ◽

Rocky Bilotta ◽

...

Keyword(s):

Simulation Study ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Interval Censoring ◽

Mixed Effects ◽

South Africans ◽

Binomial Regression ◽

Black South Africans ◽

All Cause Mortality

Using Count Data Models to Predict Epiphytic Bryophyte Recruitment in Schima superba Gardn. et Champ. Plantations in Urban Forests

Forests ◽

10.3390/f11020174 ◽

2020 ◽

Vol 11 (2) ◽

pp. 174

Author(s):

Dexian Zhao ◽

Zhenkai Sun ◽

Cheng Wang ◽

Zezhou Hao ◽

Baoqiang Sun ◽

...

Keyword(s):

Count Data ◽

Human Disturbance ◽

Negative Binomial ◽

Urban Environments ◽

Tree Planting ◽

Count Data Models ◽

Urban Tree ◽

Schima Superba ◽

Hurdle Models ◽

Positive Effects

Epiphytic bryophytes are known to perform essential ecosystem functions, but their sensitivity to environmental quality and change makes their survival and development vulnerable to global changes, especially habitat loss in urban environments. Fortunately, extensive urban tree planting programs worldwide have had a positive effect on the colonization and development of epiphytic bryophytes. However, how epiphytic bryophytes occur and grow on planted trees remains poorly understood, especially in urban environments. In the present study, we surveyed the distribution of epiphytic bryophytes on tree trunks in a Schima superba Gardn. et Champ. urban plantation and then developed count data models, including tree characteristics, stand characteristics, human disturbance, terrain factors, and microclimate, to identify the drivers of epiphytic bryophyte recruitment. Different count models (Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, hurdle-Poisson, hurdle-negative binomial) were compared to account for the zero-inflated data structure. Our results show that (i) the shaded side and base of tree trunks were the preferred locations for bryophytes to colonize in urban plantations, (ii) both hurdle models performed well in modeling epiphytic bryophyte recruitment, and (iii) both hurdle models showed that tree height, diameter at breast height (DBH), leaf area index (LAI), and altitude (ALT) promoted the occurrence of epiphytic bryophytes, whereas the height under branch and the interference intensity of human activities inhibited it. Specifically, DBH and LAI had positive effects on the species richness recruitment count; similarly, DBH and ALT had positive effects on the abundance recruitment count, but slope had a negative effect. To promote the occurrence and growth of epiphytic bryophytes in urban tree planting programs, we suggest that managers regulate suitable habitats by cultivating and protecting large trees, promoting canopy closure, and controlling human disturbance.
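
The hurdle-Poisson model named in this abstract separates whether any recruitment occurs from how many recruits there are once it does. The Python sketch below writes out the standard hurdle-Poisson probability mass function with fixed parameters; it is not the authors' fitted model, and the names `p_zero` (probability of a zero count) and `lam` (mean of the untruncated Poisson) are assumptions for this illustration. In a regression, both would depend on covariates such as DBH or LAI.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of observing k events with mean lam > 0."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def hurdle_poisson_pmf(k: int, lam: float, p_zero: float) -> float:
    """Hurdle-Poisson probability of observing k events.

    All zeros come from the binary part (probability p_zero); positive
    counts follow a Poisson(lam) distribution truncated at zero.
    """
    if k == 0:
        return p_zero
    truncation = 1.0 - math.exp(-lam)  # P(Poisson count > 0)
    return (1.0 - p_zero) * poisson_pmf(k, lam) / truncation

# Illustrative values only: 60% of trunks with no bryophytes, lam = 1.5.
print([round(hurdle_poisson_pmf(k, 1.5, 0.6), 4) for k in range(5)])
```

Unlike a zero-inflated model, the count part here never produces a zero, so the occurrence and abundance components can be interpreted separately.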
