The International Journal of Biostatistics
Latest Publications


TOTAL DOCUMENTS

442
(FIVE YEARS 91)

H-INDEX

31
(FIVE YEARS 4)

Published by Walter de Gruyter GmbH

1557-4679, 2194-573X

2022 ◽  
Vol 0 (0) ◽  
Author(s):  
Asma Bahamyirou ◽  
Mireille E. Schnitzer ◽  
Edward H. Kennedy ◽  
Lucie Blais ◽  
Yi Yang

Abstract Effect modification occurs when the effect of a treatment on an outcome differs according to the level of some pre-treatment variable (the effect modifier). Assessing an effect modifier is not a straightforward task, even for a subject-matter expert. In this paper, we propose a two-stage procedure to automatically select effect-modifying variables in a Marginal Structural Model (MSM) with a single time point exposure, based on two nuisance quantities (the conditional outcome expectation and the propensity score). We highlight the performance of our proposal in a simulation study. Finally, to illustrate the tractability of our proposed methods, we apply them to analyze a set of pregnancy data. We estimate the conditional expected difference between the counterfactual birth weight if all women were exposed to inhaled corticosteroids during pregnancy and the counterfactual birth weight if all women were not, using data on asthma medication use during pregnancy.
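As a rough illustration of the two-stage idea (not the authors' implementation), the following Python sketch fits the two nuisance quantities on synthetic data, forms a doubly-robust pseudo-outcome for the individual treatment effect, and then uses a sparse regression to flag candidate effect modifiers. All data, model choices and names are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n, p = 2000, 5
X = rng.normal(size=(n, p))                               # candidate modifiers / confounders
A = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))           # treatment depends on column 0
Y = X[:, 1] + A * (1 + 2 * X[:, 2]) + rng.normal(size=n)  # column 2 modifies the effect

# Stage 1: nuisance fits (propensity score and conditional outcome expectation)
ps = LogisticRegression().fit(X, A).predict_proba(X)[:, 1]
mu1 = RandomForestRegressor(random_state=0).fit(X[A == 1], Y[A == 1]).predict(X)
mu0 = RandomForestRegressor(random_state=0).fit(X[A == 0], Y[A == 0]).predict(X)

# Stage 2: AIPW-style pseudo-outcome for the treatment effect, then sparse selection
pseudo = (A / ps - (1 - A) / (1 - ps)) * (Y - np.where(A == 1, mu1, mu0)) + (mu1 - mu0)
sel = LassoCV(cv=5).fit(X, pseudo)
print("selected modifier columns:", np.flatnonzero(sel.coef_))  # ideally includes column 2
```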


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Alejandro Schuler ◽  
David Walsh ◽  
Diana Hall ◽  
Jon Walsh ◽  
Charles Fisher

Abstract Estimating causal effects from randomized experiments is central to clinical research. Reducing the statistical uncertainty in these analyses is an important objective for statisticians. Registries, prior trials, and health records constitute a growing compendium of historical data on patients under standard-of-care that may be exploitable to this end. However, most methods for historical borrowing achieve reductions in variance by sacrificing strict type-I error rate control. Here, we propose a use of historical data that exploits linear covariate adjustment to improve the efficiency of trial analyses without incurring bias. Specifically, we train a prognostic model on the historical data, then estimate the treatment effect using a linear regression while adjusting for the trial subjects’ predicted outcomes (their prognostic scores). We prove that, under certain conditions, this prognostic covariate adjustment procedure attains the minimum variance possible among a large class of estimators. When those conditions are not met, prognostic covariate adjustment is still more efficient than raw covariate adjustment and the gain in efficiency is proportional to a measure of the predictive accuracy of the prognostic model above and beyond the linear relationship with the raw covariates. We demonstrate the approach using simulations and a reanalysis of an Alzheimer’s disease clinical trial and observe meaningful reductions in mean-squared error and the estimated variance. Lastly, we provide a simplified formula for asymptotic variance that enables power calculations that account for these gains. Sample size reductions between 10% and 30% are attainable when using prognostic models that explain a clinically realistic percentage of the outcome variance.
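The mechanics of prognostic covariate adjustment are simple enough to sketch. Below is a minimal, hypothetical Python example on synthetic data: a prognostic model is trained on historical standard-of-care records, its predictions for the trial subjects are added as a covariate in a linear regression of the trial outcome on treatment, and the treatment coefficient is reported. It is an illustration of the general recipe, not the authors' software.

```python
import numpy as np
import statsmodels.api as sm
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Historical (standard-of-care) data: used only to learn a prognostic model
X_hist = rng.normal(size=(5000, 4))
y_hist = X_hist @ [1.0, 0.5, -0.5, 0.2] + np.sin(X_hist[:, 0]) + rng.normal(size=5000)
prog = GradientBoostingRegressor(random_state=0).fit(X_hist, y_hist)

# Randomized trial data with a true treatment effect of 0.8
n = 400
X = rng.normal(size=(n, 4))
A = rng.binomial(1, 0.5, size=n)                          # randomized treatment
y = X @ [1.0, 0.5, -0.5, 0.2] + np.sin(X[:, 0]) + 0.8 * A + rng.normal(size=n)

score = prog.predict(X)                                   # prognostic scores for trial subjects
design = sm.add_constant(np.column_stack([A, score, X]))  # adjust for score and raw covariates
fit = sm.OLS(y, design).fit(cov_type="HC3")
print(fit.params[1], fit.bse[1])                          # treatment estimate and robust SE
```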


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Tristan Mary-Huard ◽  
Vittorio Perduca ◽  
Marie-Laure Martin-Magniette ◽  
Gilles Blanchard

Abstract In the context of finite mixture models, one considers the problem of classifying as many observations as possible into the classes of interest while controlling the classification error rate in these same classes. Similar to what is done in the framework of statistical test theory, different type I and type II-like classification error rates can be defined, along with their associated optimal rules, where optimality is defined as minimizing the type II error rate while controlling the type I error rate at some nominal level. It is first shown that finding an optimal classification rule boils down to searching for an optimal region in the observation space in which to apply the classical Maximum A Posteriori (MAP) rule. Depending on the misclassification rate to be controlled, the shape of the optimal region is provided, along with a heuristic to compute the optimal classification rule in practice. In particular, a multiclass FDR-like optimal rule is defined and compared to the thresholded MAP rule that is used in most applications. It is shown on both simulated and real datasets that the FDR-like optimal rule may be significantly less conservative than the thresholded MAP rule.
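To make the contrast concrete, the following hypothetical Python sketch compares a thresholded MAP rule with a simple FDR-like step-up rule (classify the largest set of observations whose average MAP misclassification probability stays below a nominal level) on a toy Gaussian mixture. This is an illustrative construction in the same spirit as the abstract, not the paper's exact optimal region.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(-2, 1, 300), rng.normal(0, 1, 300), rng.normal(3, 1, 300)])
gm = GaussianMixture(n_components=3, random_state=0).fit(x.reshape(-1, 1))
post = gm.predict_proba(x.reshape(-1, 1))
map_prob = post.max(axis=1)                      # posterior probability of the MAP class

# Thresholded MAP: classify only when the MAP posterior clears a fixed cut-off
thresholded = map_prob >= 0.95

# FDR-like rule: classify the largest set whose average misclassification
# probability (1 - MAP posterior) stays below alpha (a step-up construction)
alpha = 0.05
order = np.argsort(-map_prob)
cum_err = np.cumsum(1 - map_prob[order]) / np.arange(1, len(x) + 1)
k = np.max(np.nonzero(cum_err <= alpha)[0]) + 1 if np.any(cum_err <= alpha) else 0
fdr_like = np.zeros(len(x), dtype=bool)
fdr_like[order[:k]] = True
print(thresholded.sum(), fdr_like.sum())         # the FDR-like rule typically classifies more points
```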


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Lola Étiévant ◽  
Vivian Viallon

Abstract Many causal models of interest in epidemiology involve longitudinal exposures, confounders and mediators. However, repeated measurements are not always available or used in practice, leading analysts to overlook the time-varying nature of exposures and work under over-simplified causal models. Our objective is to assess whether – and how – causal effects identified under such misspecified causal models relate to true causal effects of interest. We derive sufficient conditions ensuring that the quantities estimated in practice under over-simplified causal models can be expressed as weighted averages of longitudinal causal effects of interest. Unsurprisingly, these sufficient conditions are very restrictive, and our results state that the quantities estimated in practice should generally be interpreted with caution, as they usually do not relate to any longitudinal causal effect of interest. Our simulations further illustrate that the bias between the quantities estimated in practice and the weighted averages of longitudinal causal effects of interest can be substantial. Overall, our results confirm the need for repeated measurements to conduct proper analyses, and/or for the development of sensitivity analyses when they are not available.
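The general phenomenon can be illustrated with a very small simulation (this is a generic toy example, not the authors' simulation design): when a persistent exposure is measured only at the later time point, the cross-sectional estimate need not equal any weighted average of the true time-specific effects.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
n = 200_000

# Exposure measured at two times; only the later measurement enters the analysis
A0 = rng.binomial(1, 0.5, n)                               # early exposure (ignored by the analyst)
A1 = rng.binomial(1, 1 / (1 + np.exp(-(2 * A0 - 1))))      # later exposure, persistent over time
Y = 1.0 * A0 + 2.0 * A1 + rng.normal(size=n)               # true time-specific effects: 1.0 and 2.0

naive = sm.OLS(Y, sm.add_constant(A1)).fit()
print(naive.params[1])   # about 2.46: outside [1, 2], so not a weighted average of the true effects
```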


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Jun Ma ◽  
Dominique-Laurent Couturier ◽  
Stephane Heritier ◽  
Ian C. Marschner

Abstract This paper considers the problem of semi-parametric proportional hazards model fitting where observed survival times contain event times as well as interval-, left- and right-censoring times. Although this is not a new topic, many existing methods suffer from poor computational performance. In this paper, we adopt a more versatile penalized likelihood method to estimate the baseline hazard and the regression coefficients simultaneously. The baseline hazard is approximated using basis functions such as M-splines. A penalty is introduced to regularize the baseline hazard estimate and also to ease the dependence of the estimates on the knots of the basis functions. We propose a Newton–MI (multiplicative iterative) algorithm to fit this model. We also present novel asymptotic properties of our estimates, allowing for the possibility that some parameters of the approximate baseline hazard may lie on the parameter space boundary. Comparisons of our method against other similar approaches are made through an intensive simulation study. Results demonstrate that our method is very stable and encounters virtually no numerical issues. A real data application involving melanoma recurrence is presented, and an R package 'survivalMPL' implementing the method is available on CRAN.
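The structure of the penalized likelihood can be sketched in a few lines. The hypothetical Python example below uses a crude piecewise-constant baseline hazard (a stand-in for the M-spline basis), right-censored data only, a roughness penalty on the baseline pieces, and a generic quasi-Newton optimizer rather than the authors' Newton–MI algorithm; it is a simplified illustration, not the 'survivalMPL' method.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=(n, 2))
beta_true = np.array([0.5, -0.5])
t = rng.exponential(1 / np.exp(x @ beta_true))         # baseline hazard equal to 1
c = rng.exponential(2.0, size=n)                       # right-censoring times
time, event = np.minimum(t, c), (t <= c).astype(float)

# Piecewise-constant baseline hazard on a quantile grid (crude stand-in for M-splines)
knots = np.quantile(time, np.linspace(0, 1, 9))
knots[-1] += 1e-8
bins = np.clip(np.searchsorted(knots, time, side="right") - 1, 0, len(knots) - 2)

def cum_hazard(theta, tt):
    widths = np.clip(tt[:, None] - knots[None, :-1], 0, np.diff(knots))
    return widths @ theta

def neg_penalised_loglik(par, lam=1.0):
    beta, log_theta = par[:2], par[2:]
    theta = np.exp(log_theta)                          # keeps hazard pieces positive
    lp = x @ beta
    loglik = np.sum(event * (log_theta[bins] + lp) - cum_hazard(theta, time) * np.exp(lp))
    penalty = lam * np.sum(np.diff(theta) ** 2)        # roughness penalty on the baseline
    return -(loglik - penalty)

fit = minimize(neg_penalised_loglik, np.zeros(2 + len(knots) - 1), method="L-BFGS-B")
print(fit.x[:2])                                       # regression coefficients, roughly beta_true
```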


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Hermine Biermé ◽  
Camille Constant ◽  
Anne Duittoz ◽  
Christine Georgelin

Abstract We present in this paper a global methodology for spike detection in a biological context: fluorescence recordings of the calcium activity of GnRH neurons. For this purpose, we first propose a simple stochastic model that mimics the experimental time series, namely an autoregressive AR(1) process with a linear trend and specific innovations involving spiking times. Estimators of the parameters, with asymptotic normality, are established and used to set up a statistical test on the estimated innovations in order to detect spikes. We compare several procedures and illustrate the performance of our procedure on biological data.
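A toy version of the idea can be written in a few lines of Python: simulate an AR(1) process with a linear trend and occasional jumps in the innovations, estimate the trend and autoregressive parameter by least squares, and flag residuals that are extreme on a robust scale. The simulation settings and the simple robust-z test below are illustrative assumptions, not the authors' estimators or asymptotic test.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
T, phi = 1000, 0.7
spikes = rng.random(T) < 0.01                          # rare spiking times
innov = rng.normal(0, 0.2, T) + 3.0 * spikes           # innovations with occasional jumps
x = np.empty(T)
x[0] = 0.5 + innov[0]
for t in range(1, T):
    x[t] = 0.5 + 0.001 * t + phi * (x[t - 1] - 0.5 - 0.001 * (t - 1)) + innov[t]

# Estimate trend + AR(1) by regressing x_t on (1, t, x_{t-1}), then test the residuals
X = sm.add_constant(np.column_stack([np.arange(1, T), x[:-1]]))
fit = sm.OLS(x[1:], X).fit()
resid = fit.resid
med = np.median(resid)
mad = np.median(np.abs(resid - med))
z = 0.6745 * (resid - med) / mad                       # robust standardisation of residuals
detected = np.flatnonzero(z > 4) + 1                   # flagged spiking times
print(detected[:10], np.flatnonzero(spikes)[:10])
```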


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Yijun Wang ◽  
Weiwei Wang ◽  
Yincai Tang

Abstract The accelerated failure time mixture cure (AFTMC) model is widely used for survival data when a portion of patients can be cured. In this paper, a Bayesian semiparametric method is proposed to estimate the parameters and densities for both the cure probability and the survival distribution of the uncured patients in the AFTMC model. Specifically, the baseline error distribution of the uncured patients is modeled nonparametrically by a Dirichlet process mixture. Based on the stick-breaking formulation of the Dirichlet process and the techniques of retrospective and slice sampling, an efficient and easy-to-implement Gibbs sampler is developed for the posterior computation. The proposed approach can be easily implemented in commonly used statistical software, and its performance is comparable to that of fully parametric methods, as shown in comprehensive simulation studies. Finally, the proposed approach is applied to the analysis of data from a colorectal cancer clinical trial.
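Only one ingredient of the approach is easy to show compactly: the stick-breaking construction of a (truncated) Dirichlet process draw, which underlies the mixture model for the baseline error distribution. The Python sketch below shows that ingredient alone on assumed settings; it is not the retrospective/slice Gibbs sampler described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(6)

def stick_breaking(alpha, K):
    """Truncated stick-breaking weights for a Dirichlet process with concentration alpha."""
    v = rng.beta(1.0, alpha, size=K)
    return v * np.concatenate([[1.0], np.cumprod(1 - v[:-1])])

K, alpha = 50, 2.0
w = stick_breaking(alpha, K)
atoms = rng.normal(0.0, 1.0, size=K)      # atoms drawn from a normal base measure
# A truncated draw of the random baseline error distribution: sum_k w_k * delta_{atoms_k}
print(w.sum(), w[:5])
```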


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Maria Iannario ◽  
Anna Clara Monti ◽  
Pietro Scalera

Abstract The choice of the number m of response categories is a crucial issue in the categorization of a continuous response. The paper exploits the property of proportional odds models that allows ordinal responses with different numbers of categories to be generated from the same underlying variable. It investigates the asymptotic efficiency of the estimators of the regression coefficients and the accuracy of the derived inferential procedures as m varies. The analysis is based on models with closed-form information matrices, so that the asymptotic efficiency can be evaluated analytically without the need for simulations. The paper proves that a finer categorization augments the information content of the data and consequently shows that the asymptotic efficiency and the power of the tests on the regression coefficients increase with m. The impact on the efficiency of the estimators of the loss of information produced by merging categories is also considered, highlighting its risks, especially when performed in its extreme form of dichotomization. Furthermore, the appropriate value of m for various sample sizes is explored, pointing out that a large number of categories can offset the limited amount of information in a small sample through better data quality. Finally, two case studies on the quality of life of chemotherapy patients and on the perception of pain, based on discretized continuous scales, illustrate the main findings of the paper.
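The mechanism can be seen empirically by discretizing the same latent variable into different numbers of ordered categories and fitting a proportional odds model to each version. The Python sketch below (synthetic data, statsmodels' OrderedModel) is an illustrative simulation of this effect, whereas the paper's own analysis is analytic and simulation-free.

```python
import numpy as np
import pandas as pd
from statsmodels.miscmodels.ordinal_model import OrderedModel

rng = np.random.default_rng(7)
n = 2000
x = rng.normal(size=n)
latent = 1.0 * x + rng.logistic(size=n)        # latent response consistent with a proportional odds model

for m in (3, 5, 9):
    # Categorise the same latent variable into m ordered categories
    y = pd.cut(pd.Series(latent), bins=m)      # ordered categorical outcome
    fit = OrderedModel(y, x[:, None], distr="logit").fit(method="bfgs", disp=False)
    slope, se = np.asarray(fit.params)[0], np.asarray(fit.bse)[0]
    print(m, slope, se)                        # the slope SE shrinks as m grows
```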


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Joycelyne Ewusie ◽  
Joseph Beyene ◽  
Lehana Thabane ◽  
Sharon E. Straus ◽  
Jemila S. Hamid

Abstract Interrupted time series (ITS) designs are commonly used to evaluate the impact of interventions in healthcare settings. Segmented regression (SR) is the most commonly used statistical method and has been shown to be useful in practical applications involving ITS designs. Nevertheless, SR is prone to aggregation bias, which leads to imprecision and loss of power to detect clinically meaningful differences. The objective of this article is to present a weighted SR method, where variability across patients within the healthcare facility and across time points is incorporated through weights. We present the methodological framework, provide optimal weights associated with the data at each time point and discuss relevant statistical inference. We conduct extensive simulations to evaluate the performance of our method and provide a comparative analysis with traditional SR using established performance criteria such as bias, mean square error and statistical power. Illustrations using real data are also provided. In most simulation scenarios considered, the weighted SR method produced estimators that are uniformly more precise and relatively less biased compared to traditional SR. The weighted approach was also associated with higher statistical power in the scenarios considered. The performance difference is much larger for data with high variability across patients within healthcare facilities. The weighted method proposed here allows us to account for heterogeneity in the patient population, leading to increased accuracy and power across all scenarios. We recommend that researchers carefully design their studies and determine their sample size by incorporating heterogeneity in the patient population.
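A minimal sketch of the idea in Python: aggregate simulated patient-level outcomes to monthly means, fit the standard segmented regression by OLS, and refit by weighted least squares with weights equal to the number of patients contributing to each monthly mean. The weights used here are a simple inverse-variance choice for illustration and need not coincide with the optimal weights derived in the paper.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(8)
months = np.arange(1, 25)
interruption = 12
n_pat = rng.integers(20, 200, size=months.size)        # patients per month varies widely

# Simulate patient-level outcomes, then aggregate to monthly means
rows = []
for t, n in zip(months, n_pat):
    post_t = int(t > interruption)
    mu = 10 + 0.1 * t + 2.0 * post_t + 0.2 * post_t * (t - interruption)
    rows.append((t, n, rng.normal(mu, 3.0, size=n).mean()))
df = pd.DataFrame(rows, columns=["t", "n", "ybar"])

post = (df["t"] > interruption).astype(float)
X = sm.add_constant(np.column_stack([df["t"], post, post * (df["t"] - interruption)]))
y = df["ybar"].to_numpy()
ols = sm.OLS(y, X).fit()                               # traditional segmented regression
wls = sm.WLS(y, X, weights=df["n"].to_numpy()).fit()   # weights ~ patients per time point
print(ols.bse[2], wls.bse[2])                          # SE of the level change; WLS is usually smaller
```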


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Pablo Martínez-Camblor ◽  
Todd A. MacKenzie ◽  
A. James O’Malley

Abstract Hazard ratios (HRs) associated with the well-known proportional hazards Cox regression model are routinely used for measuring the impact of a factor of interest on a time-to-event outcome. However, if the underlying real model does not satisfy the theoretical requirements, the interpretation of those HRs is not clear. We propose a new index, gHR, which generalizes the HR beyond the underlying survival model. We consider the case in which the study factor is a binary variable and we are interested in both the unadjusted and adjusted effect of this factor on a time-to-event variable, potentially observed in a right-censored scenario. We propose non-parametric estimators for the unadjusted gHR and semi-parametric regression-induced techniques for the adjusted case. The behavior of these estimators is studied in both large and finite sample situations. Monte Carlo simulations reveal that both estimators provide good approximations of their respective inferential targets. Data from the Health and Lifestyle Study are used to study the relationship between tobacco use and age at death and to illustrate the practical application of the proposed technique. gHR is a promising index which can facilitate a better understanding of the association of a study factor with a time-dependent outcome.
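For intuition only, the Python sketch below computes one simple non-parametric hazard-ratio-type contrast between two groups (the ratio of Nelson-Aalen cumulative hazards at a fixed horizon) on simulated right-censored data. This is an illustrative construction in the same spirit as an unadjusted generalized hazard ratio; it is not the paper's definition or estimator of gHR.

```python
import numpy as np

def nelson_aalen(time, event, t0):
    """Nelson-Aalen cumulative hazard evaluated at a single time point t0."""
    order = np.argsort(time)
    time, event = time[order], event[order]
    at_risk = np.arange(len(time), 0, -1)
    cumhaz = np.cumsum(event / at_risk)
    idx = np.searchsorted(time, t0, side="right") - 1
    return cumhaz[idx] if idx >= 0 else 0.0

rng = np.random.default_rng(9)
n = 4000
A = rng.binomial(1, 0.5, n)                            # binary study factor
t = rng.exponential(1 / np.exp(0.5 * A))               # true hazard ratio exp(0.5) between groups
c = rng.exponential(2.0, size=n)                       # right censoring
time, event = np.minimum(t, c), (t <= c).astype(float)

tau = np.quantile(time, 0.75)                          # evaluation horizon
ratio = nelson_aalen(time[A == 1], event[A == 1], tau) / nelson_aalen(time[A == 0], event[A == 0], tau)
print(ratio)   # close to exp(0.5) ~ 1.65 when the hazards really are proportional
```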

