Penalized count data regression with application to hospital stay after pediatric cardiac surgery

2016 ◽  
Vol 25 (6) ◽  
pp. 2685-2703 ◽  
Author(s):  
Zhu Wang ◽  
Shuangge Ma ◽  
Michael Zappitelli ◽  
Chirag Parikh ◽  
Ching-Yun Wang ◽  
...  

Pediatric cardiac surgery may lead to poor outcomes such as acute kidney injury (AKI) and prolonged hospital length of stay (LOS). Plasma and urine biomarkers may help with early identification and prediction of these adverse clinical outcomes. In a recent multi-center study, 311 children undergoing cardiac surgery were enrolled to evaluate multiple biomarkers for diagnosis and prognosis of AKI and other clinical outcomes. LOS is often analyzed as count data, so Poisson regression and negative binomial (NB) regression are common choices for developing predictive models. With many correlated prognostic factors and biomarkers, variable selection is an important step. The present paper proposes new variable selection methods for Poisson and NB regression. We evaluated regularized regression through a penalized likelihood function. We first extend the elastic net (Enet) Poisson to two penalized Poisson regressions: Mnet, a combination of minimax concave and ridge penalties; and Snet, a combination of smoothly clipped absolute deviation (SCAD) and ridge penalties. Furthermore, we extend the above methods to penalized NB regression. For the Enet, Mnet, and Snet penalties (EMSnet), we develop a unified algorithm to estimate the parameters and conduct variable selection simultaneously. Simulation studies show that the proposed methods have advantages over some competing methods when predictors are highly correlated. Applying the proposed methods to the aforementioned data, we find that early postoperative urine biomarkers including NGAL, IL18, and KIM-1 independently predict LOS, after adjusting for risk and biomarker variables.
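The three penalty families named in this abstract can be illustrated with a minimal Python sketch. It uses the textbook forms of the lasso, minimax concave (MCP), and SCAD penalties and mixes each with a ridge term through a weight `alpha`. The function names, the default `gamma` values, and the `alpha` parameterization are assumptions made for illustration and may differ from the parameterization used in the paper.

```python
def lasso_penalty(beta, lam):
    """L1 (lasso) penalty: lam * |beta|."""
    return lam * abs(beta)


def mcp_penalty(beta, lam, gamma=3.0):
    """Minimax concave penalty (textbook form); assumes gamma > 1."""
    t = abs(beta)
    if t <= gamma * lam:
        return lam * t - t * t / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large effects are not shrunk further.
    return 0.5 * gamma * lam * lam


def scad_penalty(beta, lam, gamma=3.7):
    """Smoothly clipped absolute deviation penalty (textbook form); assumes gamma > 2."""
    t = abs(beta)
    if t <= lam:
        return lam * t
    if t <= gamma * lam:
        return (2.0 * gamma * lam * t - t * t - lam * lam) / (2.0 * (gamma - 1.0))
    return 0.5 * lam * lam * (gamma + 1.0)


def net_penalty(beta, lam, alpha, kind="enet", gamma=None):
    """Sparse penalty weighted by alpha plus a ridge penalty weighted by (1 - alpha).

    The alpha mixing used here is an assumed parameterization, not taken from the paper.
    """
    sparse_lam = alpha * lam
    if kind == "enet":
        sparse = lasso_penalty(beta, sparse_lam)
    elif kind == "mnet":
        sparse = mcp_penalty(beta, sparse_lam, 3.0 if gamma is None else gamma)
    elif kind == "snet":
        sparse = scad_penalty(beta, sparse_lam, 3.7 if gamma is None else gamma)
    else:
        raise ValueError("kind must be 'enet', 'mnet', or 'snet'")
    ridge = 0.5 * (1.0 - alpha) * lam * beta * beta
    return sparse + ridge
```

Unlike the lasso term, the MCP and SCAD terms level off for large coefficients, which is why they are often described as reducing the bias of large estimated effects.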

CAUCHY ◽  
2016 ◽  
Vol 4 (3) ◽  
pp. 115
Author(s):  
Cindy Cahyaning Astuti ◽  
Angga Dwi Mulyanto

Regression analysis is used to determine the relationship between one or more response variables (Y) and one or more predictor variables (X). A regression model relating predictor variables to a Poisson-distributed response variable is called a Poisson regression model. Poisson regression is commonly used to analyze count data. Because it requires the mean and variance to be equal, it is not appropriate under overdispersion (variance greater than the mean). Count data also often contain a large proportion of zero values in the response variable (zero inflation), which Poisson regression cannot handle. An alternative model that is better suited to overdispersed data and can handle excess zeros in the response variable is the Zero-Inflated Negative Binomial (ZINB) model. In this research, ZINB is applied to cases of Tetanus Neonatorum in East Java. The aims of this research are to examine the likelihood function, to construct an algorithm for estimating the ZINB parameters, and to apply the ZINB model to cases of Tetanus Neonatorum in East Java. Maximum Likelihood Estimation (MLE) is used to estimate the ZINB parameters, and the likelihood function is maximized using the Expectation Maximization (EM) algorithm. Tests of the ZINB regression model showed that the predictor variables with a significant partial effect in the negative binomial model are the percentage of pregnant women visits and the percentage of maternal health personnel assisted, while the predictor variable with a significant partial effect in the zero-inflation model is the percentage of neonatal visits.
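As a rough illustration of the likelihood this abstract refers to, the sketch below evaluates a ZINB probability mass function, its log-likelihood, and the E-step weight that an EM algorithm typically uses for zero observations. It assumes the common mean/size parameterization of the negative binomial (`mu` > 0, `theta` > 0) and a zero-inflation probability `pi`. All function names are hypothetical, and the full EM loop and regression links are omitted.

```python
import math


def nb_log_pmf(y, mu, theta):
    """Log pmf of a negative binomial with mean mu and size (dispersion) theta."""
    return (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
            + theta * math.log(theta / (theta + mu))
            + y * math.log(mu / (theta + mu)))


def zinb_pmf(y, mu, theta, pi):
    """ZINB pmf: a point mass at zero with probability pi, otherwise negative binomial."""
    nb = math.exp(nb_log_pmf(y, mu, theta))
    if y == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def zinb_log_likelihood(ys, mus, theta, pis):
    """Sum of log ZINB probabilities over observations with their own mu and pi."""
    return sum(math.log(zinb_pmf(y, mu, theta, pi))
               for y, mu, pi in zip(ys, mus, pis))


def e_step_weight(y, mu, theta, pi):
    """Posterior probability that an observation is a structural (inflated) zero."""
    if y != 0:
        return 0.0  # a positive count cannot come from the zero-inflation component
    nb_zero = math.exp(nb_log_pmf(0, mu, theta))
    return pi / (pi + (1.0 - pi) * nb_zero)
```

In a regression setting, `mu` would usually come from a log link and `pi` from a logit link applied to the covariates. Here they are passed in directly to keep the sketch short.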


2022 ◽  
Vol 10 (4) ◽  
pp. 488-498
Author(s):  
Yashmine Noor Islami ◽  
Dwi Ispriyanti ◽  
Puspita Kartikasari

Infant mortality (0-11 months) and maternal mortality (during pregnancy, childbirth, and postpartum) are significant indicators of the level of public health. Central Java Province, which has 35 regencies/cities, is among the five regions with the highest numbers of infant and maternal deaths in Indonesia. The numbers of infant and maternal deaths are count data, so Poisson regression can be used to analyze the factors that influence them. Poisson regression analysis relies on an assumption called equidispersion, that is, equality of the mean and variance. Frequently, the variance of count data is greater than the mean, which is known as overdispersion. In this research, bivariate negative binomial regression is used to overcome the problem of overdispersion in Poisson regression. This method produces a global model. In reality, the geographical, socio-cultural, and economic conditions differ between regions. This reflects spatial heterogeneity, so the model needs to be extended to Geographically Weighted Negative Binomial Bivariate Regression (GWNBBR). The GWNBBR model assigns weights based on the position of, or distance from, one observation area to another. Significant variables for modeling infant mortality cases were the percentage of obstetric complications treated (X1), the percentage of infants who were exclusively breastfed (X3), and the percentage of poor people (X5). The significant variable for modeling maternal mortality cases was the percentage of poor people (X5). Based on AIC, the GWNBBR model is better than the bivariate negative binomial regression model because it has a smaller AIC value.
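The equidispersion assumption mentioned in this abstract can be checked informally with a variance-to-mean ratio. The sketch below is a crude, covariate-free diagnostic and not the procedure used in the study. The function name and the example counts are hypothetical.

```python
def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest overdispersion."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("mean is zero; the index is undefined")
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


# Hypothetical counts for five regions (not data from the study).
example_counts = [0, 1, 2, 3, 14]
example_index = dispersion_index(example_counts)
```

A ratio near 1 is consistent with a Poisson model. A ratio well above 1 is one informal reason to consider a negative binomial model instead.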


2021 ◽  
Vol 10 (3) ◽  
pp. 226-236
Author(s):  
Khusnul Khotimah ◽  
Itasia Dina Sulvianti ◽  
Pika Silvianti

The number of leprosy cases in West Java is an example of count data. The analysis commonly used for count data is Poisson regression. This research determines the variables that influence the number of leprosy cases in West Java, using data on the number of leprosy cases in West Java in 2019. These data exhibit overdispersion and spatial heterogeneity. To handle overdispersion, the negative binomial regression model can be employed, while spatial heterogeneity is handled by adding adaptive bisquare kernel weights. The resulting Geographically Weighted Negative Binomial Regression (GWNBR) model with adaptive bisquare kernel weighting classifies the regencies/cities in West Java into ten groups based on the variables that significantly influence the number of leprosy cases. In general, the percentage of households with Clean and Healthy Behavior (PHBS) has a significant effect in all regencies/cities in West Java. For Bogor Regency, Depok City, Bogor City, and Pangandaran Regency specifically, the percentage of people living in poverty does not have a significant effect on the number of leprosy cases.
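The adaptive bisquare kernel named in this abstract is commonly defined as w = (1 - (d/b)^2)^2 for d < b and 0 otherwise, with the bandwidth b set per location to the distance of its k-th nearest neighbour. The sketch below follows that common definition. It assumes planar Euclidean distances and a neighbour count `k`, and the function names are hypothetical. The study's actual distance measure and bandwidth selection are not stated in the abstract.

```python
import math


def bisquare_weight(distance, bandwidth):
    """Bisquare kernel: (1 - (d/b)^2)^2 inside the bandwidth, 0 at or beyond it."""
    if distance >= bandwidth:
        return 0.0
    ratio = distance / bandwidth
    return (1.0 - ratio * ratio) ** 2


def adaptive_bisquare_weights(coords, i, k):
    """Weights of all locations relative to location i.

    The bandwidth adapts to location i: it is the distance from i to its
    k-th nearest neighbour (an assumed, commonly used choice).
    """
    xi, yi = coords[i]
    distances = [math.hypot(x - xi, y - yi) for x, y in coords]
    others = sorted(d for j, d in enumerate(distances) if j != i)
    if not 1 <= k <= len(others):
        raise ValueError("k must be between 1 and the number of other locations")
    bandwidth = others[k - 1]
    return [bisquare_weight(d, bandwidth) for d in distances]
```

Because the bandwidth is tied to the k-th neighbour, locations in dense areas get a narrow kernel and locations in sparse areas get a wide one.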


2019 ◽  
Vol 10 (6) ◽  
pp. 778-788 ◽  
Author(s):  
Joel Bierer ◽  
Roger Stanzel ◽  
Mark Henderson ◽  
Suvro Sett ◽  
David Horne

Introduction: The use of cardiopulmonary bypass in pediatric cardiac surgery is associated with significant inflammation, fluid overload, and end-organ dysfunction yielding morbidity and mortality. For decades, various intraoperative ultrafiltration techniques such as conventional ultrafiltration, modified ultrafiltration (MUF), zero-balance ultrafiltration (ZBUF), and combination techniques (ZBUF-MUF) have been used to mitigate these toxicities and promote improved postoperative outcomes. However, there is currently no consensus on the ultrafiltration technique or strategy that yields the most benefit for infants and children undergoing open heart surgery. Methods: A librarian-conducted PubMed literature search from 1990 to 2018 yielded 90 clinical studies or publications on the various forms of ultrafiltration and the impact on physiologic markers and clinical outcomes. All publications were reviewed, summarized, and conclusions synthesized. The data sets were not combined for systematic or meta-analysis due to significant heterogeneity in study protocols and patient populations. Results: Modified ultrafiltration significantly promotes improved myocardial function, reduction in fluid overload, and reduced bleeding and transfusion complications. Furthermore, ZBUF has shown a consistent reduction in inflammatory cytokines and improved pulmonary function and compliance. There is conflicting evidence that MUF, ZBUF, and ZBUF-MUF culminate in reduced ventilation time and intensive care unit stay. Conclusion: Various modes of ultrafiltration have been shown to be associated with improved physiologic function or clinical outcomes in pediatric cardiac surgery. There are some inconsistent trial results that can be explained by heterogeneity in ultrafiltration, clinical staff preferences, and institution protocols. Ultrafiltration has some essential benefit as it is ubiquitously used at pediatric heart centers; however, the optimal protocol has yet to be identified.


2015 ◽  
Vol 26 (4) ◽  
pp. 1802-1823 ◽  
Author(s):  
Elizabeth H Payne ◽  
James W Hardin ◽  
Leonard E Egede ◽  
Viswanathan Ramakrishnan ◽  
Anbesaw Selassie ◽  
...  

Overdispersion is a common problem in count data. It can occur due to extra population-heterogeneity, omission of key predictors, and outliers. Unless properly handled, this can lead to invalid inference. Our goal is to assess the differential performance of methods for dealing with overdispersion from several sources. We considered six different approaches: unadjusted Poisson regression (Poisson), deviance-scale-adjusted Poisson regression (DS-Poisson), Pearson-scale-adjusted Poisson regression (PS-Poisson), negative-binomial regression (NB), and two generalized linear mixed models (GLMM) with random intercept, log-link and Poisson (Poisson-GLMM) and negative-binomial (NB-GLMM) distributions. To rank order the preference of the models, we used Akaike's information criteria/Bayesian information criteria values, standard error, and 95% confidence-interval coverage of the parameter values. To compare these methods, we used simulated count data with overdispersion of different magnitude from three different sources. Mean of the count response was associated with three predictors. Data from two real-case studies are also analyzed. The simulation results showed that NB and NB-GLMM were preferred for dealing with overdispersion resulting from any of the sources we considered. Poisson and DS-Poisson often produced smaller standard-error estimates than expected, while PS-Poisson conversely produced larger standard-error estimates. Thus, it is good practice to compare several model options to determine the best method of modeling count data.
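The deviance-scale and Pearson-scale adjustments compared in this abstract are usually computed by dividing the Poisson deviance or the Pearson chi-square by the residual degrees of freedom, then multiplying the unadjusted standard errors by the square root of that scale. The sketch below follows this standard quasi-likelihood recipe. It assumes the fitted means `mu` are already available from a Poisson fit, and the function names are hypothetical rather than taken from the paper.

```python
import math


def pearson_chi2(y, mu):
    """Pearson chi-square for a Poisson fit: sum of (y - mu)^2 / mu."""
    return sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))


def poisson_deviance(y, mu):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with 0 * log(0) taken as 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        log_term = yi * math.log(yi / mi) if yi > 0 else 0.0
        total += 2.0 * (log_term - (yi - mi))
    return total


def scale_estimates(y, mu, n_params):
    """Return (Pearson scale, deviance scale), each divided by n - n_params."""
    df = len(y) - n_params
    if df <= 0:
        raise ValueError("need more observations than parameters")
    return pearson_chi2(y, mu) / df, poisson_deviance(y, mu) / df


def scaled_standard_error(se, scale):
    """Inflate (or deflate) an unadjusted Poisson standard error by sqrt(scale)."""
    return se * math.sqrt(scale)
```

For example, with hypothetical observations `[0, 2, 4]` and an intercept-only fit `mu = [2.0, 2.0, 2.0]`, the Pearson scale is 2, so an unadjusted standard error would be multiplied by the square root of 2.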


2006 ◽  
Vol 132 (6) ◽  
pp. 1291-1298 ◽  
Author(s):  
Glyn D. Williams ◽  
Chandra Ramamoorthy ◽  
Larry Chu ◽  
Gregory B. Hammer ◽  
Komal Kamra ◽  
...  

2017 ◽  
Vol 18 (1) ◽  
pp. 3-23 ◽  
Author(s):  
Eva Cantoni ◽  
Marie Auda

When count data exhibit excess zeros, that is, more zero counts than a simpler parametric distribution can model, the zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) models are often used. Variable selection for these models is even more challenging than for other regression situations because the availability of p covariates implies 4^p possible models. We adapt to zero-inflated models an approach for variable selection that avoids the screening of all possible models. This approach is based on a stochastic search through the space of all possible models, which generates a chain of interesting models. As an additional novelty, we propose three ways of extracting information from this rich chain and we compare them in two simulation studies, where we also contrast our approach with regularization (penalized) techniques available in the literature. The analysis of a typical dataset that has motivated our research is also presented, before concluding with some recommendations.
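The model count in this abstract follows because each covariate can enter the count part, the zero-inflation part, both, or neither, which gives four choices per covariate. The short sketch below enumerates that space for a small p to make the growth concrete. The function name and covariate labels are hypothetical, and this exhaustive enumeration is exactly what the stochastic search described above is designed to avoid.

```python
from itertools import product


def enumerate_zi_models(covariates):
    """Yield (count_part, zero_part) covariate subsets for a zero-inflated model."""
    states = ("neither", "count", "zero", "both")
    for assignment in product(states, repeat=len(covariates)):
        count_part = tuple(c for c, s in zip(covariates, assignment)
                           if s in ("count", "both"))
        zero_part = tuple(c for c, s in zip(covariates, assignment)
                          if s in ("zero", "both"))
        yield count_part, zero_part


# With 3 hypothetical covariates there are 4 ** 3 = 64 candidate models.
models = list(enumerate_zi_models(["x1", "x2", "x3"]))
```

Even 10 covariates give 4 ** 10 = 1,048,576 candidate models, which illustrates why screening all of them quickly becomes impractical.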


1996 ◽  
Vol 6 ◽  
pp. 155-173 ◽  
Author(s):  
Christopher H. Achen

The Generalized Event Count (GEC) estimator (King 1989a) is a statistical model for event counts. Its great attraction is that it provides a general likelihood function for count data, regardless of whether the data come from a Poisson, binomial, or negative binomial distribution. In consequence, it has been used in several recent statistical studies of event counts in the social sciences. Underlying the GEC, however, are unorthodox substantive assumptions about how the event counts have been generated (Amato, this volume). This paper gives some simple examples in which the GEC logic is clearly visible, and it shows how failures of the implicit assumptions can lead to erroneous GEC coefficient estimates and standard errors.

