count response
Recently Published Documents


TOTAL DOCUMENTS

98
(FIVE YEARS 26)

H-INDEX

15
(FIVE YEARS 2)

Author(s):  
Osval Antonio Montesinos López ◽  
Abelardo Montesinos López ◽  
Jose Crossa

Abstract We give a detailed description of random forest and exemplify its use with data from plant breeding and genomic selection. The motivations for using random forest in genomic-enabled prediction are explained. Then we describe the process of building decision trees, which are a key component for building random forest models. We give (1) the random forest algorithm, (2) the main hyperparameters that need to be tuned, and (3) different splitting rules that are key for implementing random forest models for continuous, binary, categorical, and count response variables. In addition, many examples are provided for training random forest models with different types of response variables with plant breeding data. The random forest algorithm for multivariate outcomes is provided and its most popular splitting rules are also explained. In this case, some examples are provided for illustrating its implementation even with mixed outcomes (continuous, binary, and categorical). Final comments about the pros and cons of random forest are provided.
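
The abstract does not say which splitting rule is used for count responses. As a rough illustration only, the sketch below assumes a Poisson-deviance criterion, which is one common choice for counts, and searches a single feature for the threshold that minimizes the summed deviance of the two child nodes. The function names and the toy data in the usage note are hypothetical.

```python
import math


def poisson_deviance(counts):
    """Poisson deviance of a node that predicts the mean of its counts."""
    if not counts:
        return 0.0
    mu = sum(counts) / len(counts)
    if mu == 0:
        return 0.0
    # Terms with y == 0 contribute nothing (y * log(y) -> 0), and the
    # (y - mu) terms cancel because mu is the node mean.
    return 2.0 * sum(y * math.log(y / mu) for y in counts if y > 0)


def best_split(x, y):
    """Return (threshold, deviance) for the best split on one feature.

    The threshold is None when no split improves on the unsplit node.
    """
    pairs = sorted(zip(x, y))
    best = (None, poisson_deviance(list(y)))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical feature values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        dev = poisson_deviance(left) + poisson_deviance(right)
        if dev < best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, dev)
    return best
```

For example, best_split([1, 2, 3, 4, 5, 6], [0, 1, 0, 9, 10, 11]) separates the low counts from the high counts. A full random forest would repeat this search over random feature subsets and bootstrap samples.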


Author(s):  
Hendrik van der Wurp ◽  
Andreas Groll

Abstract In this work, we propose an extension of the versatile joint regression framework for bivariate count responses of the package by Marra and Radice (R package version 0.2-3, 2020) by incorporating an (adaptive) LASSO-type penalty. The underlying estimation algorithm is based on a quadratic approximation of the penalty. The method enables variable selection and the corresponding estimates guarantee shrinkage and sparsity. Hence, this approach is particularly useful in high-dimensional count response settings. The proposal’s empirical performance is investigated in a simulation study and an application on FIFA World Cup football data.
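
The abstract mentions a quadratic approximation of the penalty without giving its form. The sketch below assumes the familiar local quadratic approximation of the absolute value around the current estimate, which may differ from what the authors use. The function name, the eps guard, and the optional adaptive weight are illustrative assumptions.

```python
def lasso_quadratic_approx(beta, beta_current, weight=1.0, eps=1e-8):
    """Quadratic surrogate for weight * |beta| around beta_current.

    Uses |b| ~ |b0| / 2 + b**2 / (2 * |b0|); eps guards against division
    by zero when the current estimate is exactly zero.
    """
    b0 = abs(beta_current)
    return weight * (b0 / 2.0 + beta ** 2 / (2.0 * (b0 + eps)))
```

The surrogate matches the absolute value at the current estimate (up to eps) and is differentiable in beta, which is what lets a Newton-type estimation algorithm handle the otherwise non-smooth penalty.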


Author(s):  
Na Li ◽  
Nancy M. Heddle ◽  
Ishac Nazy ◽  
John G. Kelton ◽  
Donald M. Arnold

Fluctuations in platelet count levels over time may help distinguish immune thrombocytopenia (ITP) from other causes of thrombocytopenia. We derived the platelet variability index (PVI) score to capture both the fluctuations in platelet count measurements and the severity of the thrombocytopenia over time. Raw PVI values, ranging from negative (less severe thrombocytopenia and/or low fluctuations) to positive (more severe thrombocytopenia and/or high fluctuations), were converted to an ordinal PVI score from 0 to 6. We evaluated performance characteristics of the PVI score for consecutive adults with thrombocytopenia from the McMaster ITP Registry. We defined patients with definite ITP as those who achieved a platelet count response after treatment with intravenous immune globulin or high-dose corticosteroids, and patients with possible ITP as those who never received ITP treatment or did not respond to treatment. Of 841 thrombocytopenic patients, 104 had definite ITP, 398 had possible ITP, and 339 had non-ITP thrombocytopenia. The median PVI score was 5 (interquartile range [IQR] 5, 6) for definite ITP; 3 (1, 5) for possible ITP; and 0 (0, 2) for non-ITP. A high PVI score correlated with the diagnosis of definite ITP even when calculated at the patient's initial assessment, before any treatment had been administered. Platelet count fluctuations alone contributed to the specificity of the overall PVI score. The PVI score may help clinicians diagnose ITP among patients with thrombocytopenia.


2021 ◽  
Vol 13 (2) ◽  
pp. 521-536
Author(s):  
T. Gokul ◽  
M. R. Srinivasan

Joint modeling of longitudinal data is an active area of research because it predicts the outcome using covariates that are measured repeatedly over time. However, no established methodology is available in the literature for applying the joint modeling approach to count-count response data. In addition, in many longitudinal studies complete data cannot be collected, and missingness may occur because subjects are absent at follow-up. In this paper, joint modeling for longitudinal count data is carried out in a Bayesian generalized linear mixed model framework to understand the association between the variables. Further, an imputation method is used to handle the missing entries in the data, and the efficiency of the methodology is studied using a Markov chain Monte Carlo (MCMC) technique. An application of the proposed methodology is discussed, in which suitable nutritional supplements are identified from a Bayesian perspective without eliminating the missing entries in the dataset.


2021 ◽  
Vol 5 (8) ◽  
pp. 2137-2141
Author(s):  
Flora Peyvandi ◽  
Spero Cataland ◽  
Marie Scully ◽  
Paul Coppo ◽  
Paul Knoebl ◽  
...  

Abstract The efficacy and safety of caplacizumab in individuals with acquired thrombotic thrombocytopenic purpura (aTTP) have been established in the phase 2 TITAN and phase 3 HERCULES trials. Integrated analysis of data from both trials was conducted to increase statistical power for assessing treatment differences in efficacy and safety outcomes. Caplacizumab was associated with a significant reduction in the number of deaths (0 vs 4; P < .05) and a significantly lower incidence of refractory TTP (0 vs 8; P < .05) vs placebo during the treatment period. Consistent with the individual trials, treatment with caplacizumab resulted in a faster time to platelet count response (hazard ratio, 1.65; P < .001), a 72.6% reduction in the proportion of patients with the composite end point of TTP-related death, TTP exacerbation, or occurrence of at least 1 treatment-emergent major thromboembolic event during the treatment period (13.0% vs 47.3%; P < .001), and a 33.3% reduction in the median number of therapeutic plasma exchange days (5.0 vs 7.5 days) vs placebo. No new safety signals were identified; mild mucocutaneous bleeding was the main safety finding. This integrated analysis provided new evidence that caplacizumab prevents mortality and refractory disease in acquired TTP and strengthened individual trial findings, with a confirmed favorable safety and tolerability profile. These trials were registered at www.clinicaltrials.gov as #NCT01151423 and #NCT02553317.
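
The percentage reductions quoted above are relative reductions. A minimal sketch of that arithmetic, using only the rounded figures printed in the abstract (the function name is hypothetical):

```python
def relative_reduction(control, treated):
    """Relative reduction of `treated` versus `control`, as a fraction."""
    return (control - treated) / control


# Composite end point: 13.0% (caplacizumab) vs 47.3% (placebo).
composite = relative_reduction(47.3, 13.0)

# Median therapeutic plasma exchange days: 5.0 vs 7.5.
plasma_exchange_days = relative_reduction(7.5, 5.0)
```

With the rounded inputs the composite figure comes out at roughly 72.5%, slightly below the reported 72.6%; the gap is presumably because the abstract's value was computed from unrounded proportions. The plasma exchange figure reproduces the reported 33.3%.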


2021 ◽  
pp. 096228022110028
Author(s):  
T Baghfalaki ◽  
M Ganjali

Joint modeling of zero-inflated count and time-to-event data is usually performed by applying the shared random effect model. This kind of joint model can be regarded as a latent Gaussian model. In this paper, integrated nested Laplace approximation (INLA) is used to perform approximate Bayesian inference for the joint model. We propose a zero-inflated hurdle model under a Poisson or negative binomial distributional assumption as the sub-model for the count data, and a Weibull model as the survival time sub-model. In addition to the usual joint linear model, a joint partially linear model is considered to account for the non-linear effect of time on the longitudinal count response. The performance of the method is investigated in simulation studies and compared with the usual Bayesian approach via Markov chain Monte Carlo (MCMC). We also apply the proposed method to two real data sets: the first comes from a longitudinal study of pregnancy and the second from an HIV study.
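
The abstract names a hurdle sub-model under a Poisson assumption but gives no formulas. The sketch below assumes the standard hurdle Poisson probability mass function: zeros occur with their own probability, and positive counts follow a zero-truncated Poisson. The parameter names p_zero and lam are illustrative, and the shared random effects and Weibull survival part of the joint model are not represented.

```python
import math


def hurdle_poisson_pmf(k, p_zero, lam):
    """P(Y = k) under a hurdle Poisson model.

    p_zero is the probability of a zero count; positive counts follow a
    Poisson(lam) distribution truncated at zero.
    """
    if k < 0:
        return 0.0
    if k == 0:
        return p_zero
    poisson_k = math.exp(-lam) * lam ** k / math.factorial(k)
    # Renormalize by the Poisson probability of a positive count.
    return (1.0 - p_zero) * poisson_k / (1.0 - math.exp(-lam))
```

Summing hurdle_poisson_pmf(k, 0.4, 2.5) over k = 0, 1, 2, ... gives a total close to 1, which is a quick sanity check that the truncation is normalized correctly.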


2021 ◽  
Vol 69 (1) ◽  
Author(s):  
Mahmoud M. Hodeib ◽  
Ahmed G. Ali ◽  
Nsreen M. Kamel ◽  
Shaimaa A. Senosy ◽  
Ehab M. Fahmy ◽  
...  

Abstract Background Although some investigators have confirmed the association between H. pylori and chronic ITP in adults, studies in pediatric patients are still few and have produced conflicting results. The study was carried out to detect the prevalence of H. pylori among chronic ITP children and to investigate the impact of treatment of H. pylori infection on platelet count response. Results The prevalence of H. pylori in chronic ITP children was 63%. The platelet count was statistically significantly higher among H. pylori stool antigen (HpSA)-negative children. A significant difference was reported in which platelet count increased from 70.55 ± 4.788 million/μL before H. pylori eradication therapy to 110.78 ± 15.128 million/μL after therapy. Conclusion We concluded that H. pylori eradication therapy was effective in increasing platelet count in H. pylori-positive chronic ITP patients.


2021 ◽  
Author(s):  
Aragaw Eshetie Aguade ◽  
B.Muniswamy Begari

Abstract Background The Poisson regression model is useful for analysing count data, but when the observations are correlated the Poisson estimates will be biased. Moreover, when over-dispersion and heterogeneity occur, imposing the Poisson model underestimates the standard errors and overstates the significance of the regression parameters. Therefore, the objective of this paper was to develop a test statistic to model and predict clustered count response data, using both application and simulation data. Methods This paper concentrated on the clustered count data model to take heterogeneity into account. Accordingly, we developed a score test based on the multilevel Poisson model for testing heterogeneity with the alternative Poisson regression model. For the application, we used the EDHS children's data. To evaluate the proposed model, we used both simulation and application data. Results Simulation results showed that the proposed score test has high power and can be used to control for heterogeneity between groups. Oromia, Amhara, and SNNPR are among the regions with the highest child mortality rates (Table 1). The results indicated that women married at a mean age of 16 years and gave birth to their first child at a mean age of 18 years and 8 months. Table 1 showed that 81% of all child deaths were recorded in rural areas. Of the children's families, 78% were illiterate; as a result, 75% of children did not have access to latrines and drinking water. Rivers and open-source waters were the common sources of drinking water, comprising 79% of the total water supply. Therefore, from the research findings, it is possible to conclude that most child mortality is due to scarcity of water. Conclusion The power estimates indicated that the proposed method was better than the existing models. All covariates and dummy explanatory variables have a significant effect on the deaths of children.
Hence, the multilevel Poisson model results indicated high variability among regions in child deaths. Therefore, this work suggested that applying the random-effects model provides a simple and robust means of predicting count response data.
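
The over-dispersion concern raised in the background can be checked with a quick diagnostic before fitting anything. A minimal sketch, assuming the variance-to-mean ratio as the diagnostic; this is not the score test developed in the paper, and the function name is hypothetical.

```python
import statistics


def dispersion_index(counts):
    """Sample variance divided by the mean; about 1 under a Poisson model."""
    mean = statistics.mean(counts)
    if mean == 0:
        raise ValueError("dispersion is undefined when all counts are zero")
    return statistics.variance(counts) / mean
```

A ratio well above 1, for instance on counts pooled across heterogeneous clusters, suggests that a plain Poisson model would understate the standard errors, which is the situation the multilevel model is meant to address.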


2021 ◽  
Author(s):  
Oladimeji Mudele ◽  
Alejandro Frery ◽  
Lucas FR Zanandrez ◽  
Alvaro E Eiras ◽  
Paolo Gamba

Mosquitoes propagate many human diseases, some widespread and with no vaccines. The Ae. aegypti mosquito vector transmits Zika, Chikungunya, and Dengue viruses. Effective public health interventions to control the spread of these diseases and protect the population require models that explain the core environmental drivers of the vector population. Field campaigns are expensive, and data from meteorological sites that feed models with the required environmental data often lack detail. As a consequence, we explore temporal modeling of the population of the Ae. aegypti mosquito vector species as a function of environmental conditions (temperature, moisture, precipitation, and vegetation) that have been shown to have significant effects. We use earth observation (EO) data as our source for estimating these biotic and abiotic environmental variables based on proxy features, namely: Normalized difference vegetation index, Normalized difference water index, Precipitation, and Land surface temperature. We obtained our response variable from field-collected mosquito population data measured weekly using 791 mosquito traps in Vila Velha city, Brazil, for 36 weeks in 2017 and 40 weeks in 2018. Recent similar studies have used machine learning (ML) techniques for this task. However, these techniques are neither intuitive nor explainable from an operational point of view. We therefore use a Generalized Linear Model (GLM) to model this relationship because of its suitability for modeling count response variables, its interpretability, and the ability to visualize the confidence intervals for all inferences. To improve our model, we use the Akaike Information Criterion to select the most informative environmental features. Finally, we show how to improve the quality of the model by weighting our GLM. Our resulting weighted GLM compares well in quality with ML techniques: Random Forest and Support Vector Machines.
These results provide an advancement with regard to qualitative and explainable epidemiological risk modeling in urban environments.
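
The feature selection step described above rests on comparing Akaike Information Criterion values across candidate models. A minimal sketch, assuming a Poisson GLM and fitted means that are already available; the fitting itself, the weighting, and the earth observation features are not reproduced, and the function names and toy numbers in the usage note are hypothetical.

```python
import math


def poisson_log_likelihood(y, mu):
    """Log-likelihood of counts y under Poisson means mu (all mu > 0)."""
    return sum(
        yi * math.log(mi) - mi - math.lgamma(yi + 1)
        for yi, mi in zip(y, mu)
    )


def aic(log_likelihood, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood
```

For toy weekly counts [1, 2, 4, 8], an intercept-only model (all means 3.75, one parameter) can be compared with a two-parameter model whose fitted means track the counts; the model with the lower AIC would be retained.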

