Cornish-Fisher Expansions for Functionals of the Weighted Partial Sum Empirical Distribution

Author(s):  
Christopher S. Withers ◽  
Saralees Nadarajah
2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Rhonda J. Rosychuk ◽  
Jeff W.N. Bachman ◽  
Anqi Chen ◽  
X. Joan Hu

Background: Administrative databases offer vast amounts of data that provide opportunities for cost-effective insights. They simultaneously pose significant challenges to statistical analysis, such as the redaction of data because of privacy policies and the provision of data that may not be at the level of detail required. For example, having ages in years rather than birthdates available at event dates can pose challenges to the analysis of recurrent event data.

Methods: Hu and Rosychuk provided a strategy for estimating age-varying effects in a marginal regression analysis of recurrent event times when birthdates are all missing. They analyzed emergency department (ED) visits made by children and youth in a setting where privacy rules prevented the release of any birthdates, and they justified their approach via a simulation and asymptotic study. With recent changes in data access rules, we requested a new extract of data for April 2010 to March 2017 that includes patient birthdates. This allows us to compare the estimates using the Hu and Rosychuk (HR) approach for coarsened ages with estimates under the true, known ages to further examine their approach numerically. The performance of the HR approach is considered under five scenarios: uniform distribution for missing birthdates, uniform distribution for missing birthdates with supplementary data on age, empirical distribution for missing birthdates, smaller sample size, and an additional year of data.

Results: Data from 33,299 subjects provided 58,166 ED visits. About 67% of subjects had one ED visit and less than 9% of subjects made over three visits during the study period. Most visits (84.0%) were made by teenagers between 13 and 17 years old. The uniform distribution and the HR modeling approach capture the main trends over age of the estimates when compared to the known birthdates. Boys had higher ED visit frequencies than girls at the younger ages, whereas girls had higher ED visit frequencies than boys at the older ages. Including additional age data based on age at end of fiscal year did not sufficiently narrow the widths of potential birthdate intervals to influence estimates. The empirical distribution of the known birthdates was close to a uniform distribution, and therefore use of the empirical distribution did not change the estimates provided by assuming a uniform distribution for the missing birthdates. The HR approach performed well for a smaller sample size, although estimates were less smooth when there were very few ED visits at some younger ages. When an additional year of data is added, the estimates improve at these younger ages.

Conclusions: Overall, the Hu and Rosychuk approach for coarsened ages performed well and captured the key features of the relationships between ED visit frequency and covariates.
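
The age coarsening described above can be made concrete with a small sketch. It is not the Hu and Rosychuk estimator; it only illustrates, under stated assumptions, how an age in completed years recorded at a visit date bounds the unknown birthdate, how several visits narrow that interval, and how a birthdate could be drawn uniformly from it. The function names (`shift_years`, `birthdate_interval`, `draw_birthdate`) and the example visits are hypothetical, and the February 29 handling is a simplification.

```python
import random
from datetime import date, timedelta


def shift_years(d, years):
    """Return d moved back by whole years (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def birthdate_interval(visits):
    """Intersect birthdate bounds implied by (visit_date, age_in_years) pairs.

    A person aged `age` completed years on `visit_date` was born after
    visit_date minus (age + 1) years and on or before visit_date minus age years.
    """
    lo, hi = date.min, date.max
    for visit_date, age in visits:
        earliest = shift_years(visit_date, age + 1) + timedelta(days=1)
        latest = shift_years(visit_date, age)
        lo, hi = max(lo, earliest), min(hi, latest)
    if lo > hi:
        raise ValueError("recorded ages are mutually inconsistent")
    return lo, hi


def draw_birthdate(visits, rng):
    """Draw one birthdate uniformly over the days in the feasible interval."""
    lo, hi = birthdate_interval(visits)
    return lo + timedelta(days=rng.randrange((hi - lo).days + 1))


# Hypothetical subject: aged 14 at one visit and 15 at a later visit.
visits = [(date(2012, 6, 15), 14), (date(2012, 9, 1), 15)]
lo, hi = birthdate_interval(visits)
imputed = draw_birthdate(visits, random.Random(0))
```

For these two hypothetical visits the feasible interval shrinks from a full year to 1997-06-16 through 1997-09-01, which shows why supplementary age information only helps when it cuts the interval appreciably.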


2021 ◽  
Vol 13 (14) ◽  
pp. 2668
Author(s):  
Tamás Telbisz

Conical hills, or residual hills, are frequently mentioned landforms in the context of humid tropical karsts, as they are dominant surface elements there. Residual hills are also present in temperate karsts, but generally in a less remarkable way. These landforms have not been thoroughly addressed in the literature to date; the present article is therefore the first attempt to morphometrically characterize temperate zone residual karst hills. We use the methods already developed for doline morphometry, and we apply them to the “inverse” topography using LiDAR-based digital terrain models (DTMs) of three Slovenian sample areas. The characteristics of hills and depressions are analysed in parallel, taking into account the rank of the forms. A common feature of hills and dolines is that, for both types, the empirical distribution of planform areas has a strongly positive skew. After logarithmic transformation, these distributions can be approximated by Inverse Gaussian, Normal, and Weibull distributions. Along with the rank, the planform area and vertical extent of the hills and dolines increase similarly. High circularity is characteristic only of the first-rank forms for both dolines and hills. For the sample areas, the hill area ratios and the doline area ratios have similar values, but the total extent of the hills is slightly larger in each case. A difference between dolines and hills is that the shapes of hills are more similar to one another than those of dolines. The reason for this is that the larger, closed depressions are created by lateral coalescence, while the hills are residual forms carved from large blocks. Another significant difference is that the density of dolines is much higher than that of hills. This article is intended as a methodological starting point for a new topic, aiming at the comprehensive study of residual karst hills across different climatic areas.
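
Two of the morphometric quantities mentioned above, circularity and the skew of planform areas before and after a logarithmic transformation, can be sketched as follows. The abstract does not define circularity; the common isoperimetric form 4πA/P² is assumed here, and the function names and the area values are hypothetical illustrations, not data from the study.

```python
import math


def circularity(area, perimeter):
    """Isoperimetric circularity 4*pi*A / P**2 (1.0 for a perfect circle).

    Assumed definition; the source abstract does not state which one it uses.
    """
    return 4.0 * math.pi * area / perimeter ** 2


def sample_skewness(values):
    """Moment-based sample skewness m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical planform areas with one large outlier (arbitrary units).
areas = [1.0, 2.0, 3.0, 20.0]
raw_skew = sample_skewness(areas)
log_skew = sample_skewness([math.log(a) for a in areas])
```

With these made-up values the raw areas are strongly right-skewed and the log-transformed areas are less so, which is the usual motivation for fitting distributions after taking logarithms.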


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 675
Author(s):  
Xuze Zhang ◽  
Saumyadipta Pyne ◽  
Benjamin Kedem

In disease modeling, a key statistical problem is the estimation of lower and upper tail probabilities of health events from given data sets of small size and limited range. Assuming such constraints, we describe a computational framework for the systematic fusion of observations from multiple sources to compute tail probabilities that could not be obtained otherwise due to a lack of lower or upper tail data. The estimation of multivariate lower and upper tail probabilities from a given small reference data set that lacks complete information about such tail data is addressed in terms of pertussis case count data. Fusion of data from multiple sources in conjunction with the density ratio model is used to give probability estimates that are unobtainable from the empirical distribution. Based on a density ratio model with variable tilts, we first present a univariate fit and, subsequently, improve it with a multivariate extension. In the multivariate analysis, we selected the best model in terms of the Akaike Information Criterion (AIC). Regional prediction, in Washington state, of the number of pertussis cases is approached by providing joint probabilities using fused data from several relatively small samples following the selected density ratio model. The model is validated by a graphical goodness-of-fit plot comparing the estimated reference distribution obtained from the fused data with the empirical distribution obtained from the reference sample only.
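
The limitation of the empirical distribution that motivates the fusion approach can be shown with a minimal sketch. It does not implement the density ratio model; it only demonstrates that an empirical tail estimate is exactly zero beyond the largest observation in a small sample. The function name `empirical_tail_prob` and the counts are hypothetical, not pertussis data.

```python
def empirical_tail_prob(sample, threshold):
    """Fraction of observations strictly greater than threshold."""
    return sum(1 for x in sample if x > threshold) / len(sample)


# Hypothetical small reference sample of weekly case counts.
counts = [3, 5, 8, 2, 6]

within_range = empirical_tail_prob(counts, 5)   # two of five values exceed 5
beyond_range = empirical_tail_prob(counts, 8)   # nothing exceeds the sample maximum
```

Any threshold at or above the sample maximum yields an estimate of zero regardless of the true tail, so information from additional sources is needed to say anything about that region.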

