Sample size calculation for recurrent event data with additive rates models

2021 ◽  
Author(s):  
Liang Zhu ◽  
Yimei Li ◽  
Yongqiang Tang ◽  
Liji Shen ◽  
Arzu Onar‐Thomas ◽  
...  
2013 ◽  
Vol 32 (30) ◽  
pp. 5448-5457 ◽  
Author(s):  
S. Schneider ◽  
H. Schmidli ◽  
T. Friede

2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Rhonda J. Rosychuk ◽  
Jeff W.N. Bachman ◽  
Anqi Chen ◽  
X. Joan Hu

Abstract
Background: Administrative databases offer vast amounts of data that provide opportunities for cost-effective insights. They simultaneously pose significant challenges to statistical analysis, such as the redaction of data because of privacy policies and the provision of data that may not be at the level of detail required. For example, having ages in years rather than birthdates available at event dates can pose challenges to the analysis of recurrent event data.
Methods: Hu and Rosychuk provided a strategy for estimating age-varying effects in a marginal regression analysis of recurrent event times when birthdates are all missing. They analyzed emergency department (ED) visits made by children and youth, for which privacy rules prevented the release of any birthdates, and justified their approach via a simulation and asymptotic study. With recent changes in data access rules, we requested a new extract of data for April 2010 to March 2017 that includes patient birthdates. This allows us to compare the estimates from the Hu and Rosychuk (HR) approach for coarsened ages with estimates under the true, known ages, and so to examine their approach further numerically. The performance of the HR approach is considered under five scenarios: uniform distribution for missing birthdates, uniform distribution for missing birthdates with supplementary data on age, empirical distribution for missing birthdates, smaller sample size, and an additional year of data.
Results: Data from 33,299 subjects provided 58,166 ED visits. About 67% of subjects had one ED visit, and fewer than 9% of subjects made more than three visits during the study period. Most visits (84.0%) were made by teenagers between 13 and 17 years old. Compared with estimates based on the known birthdates, the uniform distribution and the HR modeling approach capture the main trends of the estimates over age. Boys had higher ED visit frequencies than girls at the younger ages, whereas girls had higher ED visit frequencies than boys at the older ages. Including additional age data based on age at the end of the fiscal year did not narrow the widths of potential birthdate intervals enough to influence estimates. The empirical distribution of the known birthdates was close to a uniform distribution, and therefore use of the empirical distribution did not change the estimates obtained by assuming a uniform distribution for the missing birthdates. The HR approach performed well for a smaller sample size, although estimates were less smooth when there were very few ED visits at some younger ages. When an additional year of data was added, the estimates improved at these younger ages.
Conclusions: Overall, the Hu and Rosychuk approach for coarsened ages performed well and captured the key features of the relationships between ED visit frequency and covariates.
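To make the notion of "potential birthdate intervals" concrete, the following is a minimal sketch, not the authors' implementation. It assumes each visit record carries a visit date and the age in completed years on that date, intersects the birthdate ranges implied by each visit, and draws a birthdate uniformly from the result. The function names (`minus_years`, `birthdate_interval`, `draw_birthdate`) and the example records are hypothetical.

```python
import random
from datetime import date, timedelta


def minus_years(d, n):
    """Shift date d back by n calendar years (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - n)
    except ValueError:
        return d.replace(year=d.year - n, day=28)


def birthdate_interval(visits):
    """Return the (earliest, latest) birthdate consistent with all visits.

    visits: list of (visit_date, age_in_completed_years) pairs.
    A person aged `a` on date `d` was born after d - (a + 1) years
    and on or before d - a years.
    """
    lo, hi = date.min, date.max
    for d, age in visits:
        lo = max(lo, minus_years(d, age + 1) + timedelta(days=1))
        hi = min(hi, minus_years(d, age))
    if lo > hi:
        raise ValueError("visit records imply inconsistent ages")
    return lo, hi


def draw_birthdate(visits, rng):
    """Draw one birthdate uniformly (by day) from the feasible interval."""
    lo, hi = birthdate_interval(visits)
    return lo + timedelta(days=rng.randrange((hi - lo).days + 1))


# Hypothetical subject with two visits, aged 14 at both.
visits = [(date(2012, 6, 15), 14), (date(2013, 3, 1), 14)]
interval = birthdate_interval(visits)
sampled = draw_birthdate(visits, random.Random(1))
```

In this sketch a single visit leaves a one-year window, and each further visit can only narrow it, which is why supplementary age information matters only when it shrinks the interval appreciably.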


Biometrics ◽  
2019 ◽  
Vol 76 (2) ◽  
pp. 448-459 ◽  
Author(s):  
Lili Wang ◽  
Kevin He ◽  
Douglas E. Schaubel

2017 ◽  
Author(s):  
Guanglei Yu

[ACCESS RESTRICTED TO THE UNIVERSITY OF MISSOURI AT AUTHOR'S REQUEST.] Recurrent event data and panel count data are two common types of data that have been studied extensively in the event history literature. With recurrent event data, subjects are observed continuously during follow-up, so the occurrence times of the recurrent events of interest are available. With panel count data, subjects are monitored periodically at discrete observation times, so only the numbers of recurrent events between successive observations are recorded. In practice one may also face mixed panel count data, which are a mixture of recurrent event data and panel count data. They arise when each study subject may be observed continuously during the whole study period, continuously over some study periods and at discrete time points otherwise, or only at some discrete time points. That is, mixed data provide complete or incomplete information on the recurrent event process over different time periods for different subjects.
It is well known that in panel count data the observation process may carry information on the underlying recurrent event process, and that the censoring may also be dependent in practice. For this situation, the first part of this dissertation discusses regression analysis of panel count data with informative observations and drop-outs. A general means model is presented that allows both additive and multiplicative effects of covariates on the underlying recurrent event process. In addition, the proportional rates model and the accelerated failure time model are employed to describe the covariate effects on the observation process and on the dropout or follow-up process, respectively. For estimation of the regression parameters, estimating equation-based procedures are developed and the asymptotic properties of the proposed estimators are established. A resampling approach is proposed for estimating the covariance matrix of the proposed estimator, and a model checking procedure is also provided. The results of an extensive simulation study indicate that the proposed methodology works well in practical situations. It is applied to a motivating set of real data from the Childhood Cancer Survivor Study (CCSS), given in Section 1.1.2.2.
The second part of this dissertation considers regression analysis of mixed panel count data. One major problem in statistical inference on mixed data is how to combine the two different types of data structure. Since panel count data can be viewed as interval-censored recurrent event data in which the exact occurrence times of the events of interest are unobserved or missing, they may be augmented by filling in the missing times by imputation. The mixed data can then be converted to recurrent event data, to which existing statistical inference methods can be readily applied. Motivated by this, a multiple imputation-based estimation approach is proposed. A simulation study is conducted to examine the finite-sample properties of the proposed methodology, and it shows that the proposed method is more efficient than the existing method. An illustrative example from the CCSS is also provided.
The third part of this dissertation again considers regression analysis of mixed panel count data, but in the presence of a dependent terminal event, which precludes further occurrence of either the recurrent events of interest or observations. For this problem, we present a marginal modeling approach that acknowledges the fact that there will be no more recurrent events after the terminal event and leaves the correlation structure unspecified. To estimate the parameters of interest, an estimating equation-based procedure is developed and the inverse probability of survival weighting technique is used. Asymptotic properties of the proposed estimators are also established, and finite-sample properties are assessed in a simulation study. We again apply the proposed methodology to the CCSS.
The last part of this dissertation discusses some directions for future research.
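The imputation idea described for mixed panel count data can be sketched as follows. This is an illustrative simplification, not the dissertation's procedure: it assumes follow-up starts at time 0 and that the unobserved event times are drawn uniformly within each observation interval, whereas the abstract does not state the imputation distribution actually used. The function names (`impute_event_times`, `multiply_impute`) and the example data are hypothetical.

```python
import random


def impute_event_times(obs_times, counts, rng):
    """Fill in event times for one subject's panel count record.

    obs_times: increasing observation times t_1 < ... < t_K (follow-up
               is assumed to start at time 0).
    counts:    counts[k] is the number of events between the previous
               observation time and obs_times[k].
    Assumption: event times are uniform within each interval.
    """
    times = []
    prev = 0.0
    for t, n in zip(obs_times, counts):
        times.extend(rng.uniform(prev, t) for _ in range(n))
        prev = t
    return sorted(times)


def multiply_impute(obs_times, counts, m, seed=0):
    """Create m imputed recurrent-event histories for one subject."""
    rng = random.Random(seed)
    return [impute_event_times(obs_times, counts, rng) for _ in range(m)]


# Hypothetical subject observed at times 1.0, 2.5 and 4.0,
# with 2, 0 and 1 events in the three intervals.
histories = multiply_impute([1.0, 2.5, 4.0], [2, 0, 1], m=5)
```

Each imputed history can then be analysed with a method for recurrent event data, and the resulting estimates combined across the `m` imputations.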

