Concentration inequalities for the empirical distribution of discrete distributions: beyond the method of types

2019 · Vol 9 (4) · pp. 813–850
Author(s): Jay Mardia, Jiantao Jiao, Ervin Tánczos, Robert D Nowak, Tsachy Weissman

Abstract We study concentration inequalities for the Kullback–Leibler (KL) divergence between the empirical distribution and the true distribution. Applying a recursion technique, we improve over the method-of-types bound uniformly in all regimes of sample size $n$ and alphabet size $k$, and the improvement becomes more significant when $k$ is large. We discuss applications of our results to tighter concentration inequalities for the $L_1$ deviation of the empirical distribution from the true distribution, and to the difference between concentration around the expectation and concentration around zero. We also obtain asymptotically tight bounds on the variance of the KL divergence between the empirical and true distributions, and demonstrate that its behaviour differs quantitatively depending on whether the sample size is small or large relative to the alphabet size.
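The quantity the authors bound can be explored numerically. Below is a minimal Monte Carlo sketch of the distribution of $D(\hat{P}_n \| P)$ for multinomial samples; the uniform true distribution and all parameter values are illustrative assumptions, not the paper's setting, and the simulation does not reproduce the paper's bounds.

```python
import numpy as np

def empirical_kl(p, n, trials=10_000, seed=0):
    """Monte Carlo draws of D(p_hat || p) for multinomial samples of size n."""
    rng = np.random.default_rng(seed)
    p_hat = rng.multinomial(n, p, size=trials) / n
    # 0 * log(0) = 0 by convention; mask zero cells to avoid log(0).
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(p_hat > 0, p_hat * np.log(p_hat / p), 0.0)
    return terms.sum(axis=1)

k, n = 100, 500              # illustrative alphabet and sample sizes
p = np.full(k, 1.0 / k)      # uniform true distribution (an assumption)
d = empirical_kl(p, n)
print(f"mean={d.mean():.4f}  var={d.var():.2e}  P(D > 2*mean)={(d > 2 * d.mean()).mean():.4f}")
```

Rerunning with larger $k$ at fixed $n$ shifts the whole distribution up (the mean grows roughly like $(k-1)/(2n)$), consistent with the abstract's point that the large-alphabet regime is where improved bounds matter most.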

2012 · Vol 9 (5) · pp. 561–569
Author(s): KK Gordan Lan, Janet T Wittes

Background Traditional calculations of sample size do not formally incorporate uncertainty about the likely effect size. Use of a normal prior to express that uncertainty, as recently recommended, can lead to power that does not approach 1 as the sample size approaches infinity. Purpose To provide approaches for calculating sample size and power that formally incorporate uncertainty about effect size. The relevant formulas should ensure that power approaches one as sample size increases indefinitely and should be easy to calculate. Methods We examine normal, truncated normal, and gamma priors for effect size computationally and demonstrate analytically an approach to approximating the power for a truncated normal prior. We also propose a simple compromise method that requires a moderately larger sample size than the one derived from the fixed effect method. Results Use of a realistic prior distribution instead of a fixed treatment effect is likely to increase the sample size required for a Phase 3 trial. The standard fixed effect method for moving from estimates of effect size obtained in a Phase 2 trial to the sample size of a Phase 3 trial ignores the variability inherent in the estimate from Phase 2. Truncated normal priors appear to require unrealistically large sample sizes while gamma priors appear to place too much probability on large effect sizes and therefore produce unrealistically high power. Limitations The article deals with a few examples and a limited range of parameters. It does not deal explicitly with binary or time-to-failure data. Conclusions Use of the standard fixed approach to sample size calculation often yields a sample size leading to lower power than desired. Other natural parametric priors lead either to unacceptably large sample sizes or to unrealistically high power. We recommend an approach that is a compromise between assuming a fixed effect size and assigning a normal prior to the effect size.
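The phenomenon motivating the paper, that expected power under a normal prior plateaus below 1, can be reproduced directly. A minimal sketch, assuming a one-sided one-sample z-test and hypothetical prior parameters (both assumptions, not the paper's examples): power averaged over the prior converges to P(δ > 0) = Φ(μ/σ) rather than to 1 as the sample size grows.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

alpha = 0.025            # one-sided type I error (assumed)
mu, sigma = 0.3, 0.15    # hypothetical normal prior on the standardized effect delta

def expected_power(n):
    """Power of a one-sided z-test, averaged over the normal prior on delta."""
    z = norm.ppf(1 - alpha)
    integrand = lambda d: norm.cdf(np.sqrt(n) * d - z) * norm.pdf(d, mu, sigma)
    return quad(integrand, mu - 8 * sigma, mu + 8 * sigma)[0]

for n in (50, 200, 1_000, 10_000):
    print(f"n={n:>6}: expected power = {expected_power(n):.4f}")
print(f"n -> inf limit = P(delta > 0) = {norm.cdf(mu / sigma):.4f}")  # strictly below 1
```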


2018 · Vol 15 (5) · pp. 499–508
Author(s): Isabelle R Weir, Ludovic Trinquart

Background/aims Non-inferiority trials with time-to-event outcomes are becoming increasingly common. Designing non-inferiority trials is challenging; in particular, they require very large sample sizes. We hypothesized that the difference in restricted mean survival time, an alternative to the hazard ratio, could lead to smaller required sample sizes. Methods We show how to convert a margin for the hazard ratio into a margin for the difference in restricted mean survival time and how to calculate the required sample size under a Weibull survival distribution. We systematically selected non-inferiority trials published between 2013 and 2016 in seven major journals. Based on the protocol and article of each trial, we determined the clinically relevant time horizon of interest. We reconstructed individual patient data for the primary outcome and fit a Weibull distribution to the comparator arm. We converted the margin for the hazard ratio into the margin for the difference in restricted mean survival time. We tested for non-inferiority using the difference in restricted mean survival time and hazard ratio. We determined the required sample size based on both measures, using the type I error risk and power from the original trial design. Results We included 35 trials. We found evidence of non-proportional hazards in five (14%) trials. The hazard ratio and the difference in restricted mean survival time were consistent regarding non-inferiority testing, except in one trial where the difference in restricted mean survival time led to evidence of non-inferiority while the hazard ratio did not. The median hazard ratio margin was 1.43 (Q1–Q3, 1.29–1.75). The median of the corresponding margins for the difference in restricted mean survival time was −21 days (Q1–Q3, −36 to −8) for a median time horizon of 2.0 years (Q1–Q3, 1–3 years). The required sample size according to the difference in restricted mean survival time was smaller in 71% of trials, with a median relative decrease of 8.5% (Q1–Q3, 0.4%–38.0%). Across all 35 trials, about 25,000 participants would have been spared from enrollment using the difference in restricted mean survival time compared to the hazard ratio for trial design. Conclusion The margins for the hazard ratio may seem large but translate to relatively small differences in restricted mean survival time. The difference in restricted mean survival time offers meaningful interpretation and can result in considerable reductions in sample size. Restricted mean survival time-based measures should be considered more widely in the design and analysis of non-inferiority trials with time-to-event outcomes.
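The margin conversion can be sketched under the stated assumptions: a Weibull fit to the comparator arm and proportional hazards at the margin, so that the active-arm survival is $S_1(t) = S_0(t)^m$ for a hazard ratio margin $m$. The Weibull parameters below are hypothetical placeholders, not fitted values from any included trial.

```python
import numpy as np
from scipy.integrate import quad

def rmst_weibull(scale, shape, tau, hr=1.0):
    """RMST up to tau when survival is exp(-(t/scale)**shape) raised to the power hr."""
    return quad(lambda t: np.exp(-hr * (t / scale) ** shape), 0.0, tau)[0]

scale, shape = 5.0, 1.2   # hypothetical Weibull fit to the comparator arm (years)
tau = 2.0                 # clinically relevant time horizon (years)
hr_margin = 1.43          # the abstract's median hazard ratio margin

margin = rmst_weibull(scale, shape, tau, hr_margin) - rmst_weibull(scale, shape, tau)
print(f"HR margin {hr_margin} -> RMST-difference margin of {margin * 365.25:.0f} days over {tau} years")
```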


2021 · Vol 2 (4)
Author(s): R Mukherjee, N Muehlemann, A Bhingare, G W Stone, C Mehta

Abstract Background Cardiovascular trials increasingly require large sample sizes and long follow-up periods. Several approaches have been developed to optimize sample size, such as adaptive group sequential trials, sample size re-estimation based on the promising zone, and the win ratio. Traditionally, the log-rank test or the Cox proportional hazards model is used to test for treatment effects, based on a constant hazard rate and proportional hazards alternatives, which, however, may not always hold. Large sample sizes and/or long follow-up periods are especially challenging for trials evaluating the efficacy of acute care interventions. Purpose We propose an adaptive design wherein, using interim data, Bayesian computation of predictive power guides the increase in sample size and/or the minimum follow-up duration. These computations do not depend on the constant hazard rate and proportional hazards assumptions, thus yielding more robust interim decision making for the future course of the trial. Methods PROTECT IV is designed to evaluate mechanical circulatory support with the Impella CP device vs. standard of care during high-risk PCI. The primary endpoint is a composite of all-cause death, stroke, MI, or hospitalization for cardiovascular causes, with an initial minimum follow-up of 12 months and an initial enrolment of 1252 patients, with expected recruitment over 24 months. The study will employ an adaptive increase in sample size and/or minimum follow-up at the interim analysis, when approximately 80% of patients have been enrolled. The adaptations utilize extensive simulations to choose a new sample size of up to 2500 and a new minimum follow-up time of up to 36 months that provide a Bayesian predictive power of 85%. Bayesian calculations are based on patient-level information rather than summary statistics, thereby enabling more reliable interim decisions. Constant or proportional hazard assumptions are not required for this approach because two separate piecewise constant hazard models with Gamma priors are fitted to the interim data. Bayesian predictive power is then calculated using Monte Carlo methodology. Via extensive simulations, we have examined the utility of the proposed design for situations with time-varying hazards and non-proportional hazard ratios, such as delayed treatment effect (Figure 1) and crossing of survival curves. The heat map of Bayesian predictive power obtained when the interim Kaplan–Meier curves reflected delayed response shows that, for this scenario, an optimal combination of increased sample size and increased follow-up time would be needed to attain 85% predictive power. Conclusion The proposed adaptive design, with sample size and minimum follow-up adaptation based on Bayesian predictive power at interim looks, allows de-risking the trial against uncertainties in control-arm outcome rate, hazard ratio, and recruitment rate. Funding Acknowledgement Type of funding source: Private company. Main funding source(s): Abiomed, Inc.
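The piecewise-constant-hazard-with-Gamma-priors component is conjugate, so posterior sampling is straightforward. The sketch below is an assumed minimal formalization of that one ingredient, not the trial's actual algorithm; the interval cut points, prior hyperparameters, and toy data are all illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def posterior_hazards(times, event, cuts, a=0.01, b=0.01, draws=1_000):
    """Gamma posterior draws of piecewise-constant hazards.

    Conjugacy: with a Gamma(a, b) prior (rate parameterization) on interval j's
    hazard, observing d_j events over total exposure E_j gives a
    Gamma(a + d_j, b + E_j) posterior.
    """
    edges = np.concatenate(([0.0], np.asarray(cuts, float), [np.inf]))
    out = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        exposure = np.clip(np.minimum(times, hi) - lo, 0.0, None).sum()
        d = int(np.sum((times > lo) & (times <= hi) & event))
        out.append(rng.gamma(a + d, 1.0 / (b + exposure), size=draws))
    return np.vstack(out)   # shape: (n_intervals, draws)

# Toy data: exponential(1) event times with independent exponential censoring.
t_event = rng.exponential(1.0, size=200)
t_cens = rng.exponential(2.0, size=200)
t = np.minimum(t_event, t_cens)
e = t_event <= t_cens
post = posterior_hazards(t, e, cuts=[0.5, 1.5])
print(post.mean(axis=1))   # posterior mean hazard per interval; true hazard is 1.0
```

In a predictive-power calculation of the kind described, draws like these would be used to simulate the remainder of the trial by Monte Carlo and tally the fraction of simulated completions that reach significance.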


2018 · Vol 13 (4) · pp. 403–408
Author(s): Jeff Bodington, Manuel Malfeito-Ferreira

Abstract Much research shows that women and men have different taste acuities and preferences. If female and male judges tend to assign different ratings to the same wines, then the gender balance of judging panels will bias awards. Existing research supports the null hypothesis; however, that finding is based on small sample sizes. This article presents results for a large sample: 260 wines and 1,736 wine-score observations. Subject to the strong qualification that non-gender-related variation is material, the results affirm that female and male judges do assign about the same ratings to the same wines. The expected value of the difference in their mean ratings is zero. (JEL Classifications: A10, C00, C10, C12, D12)
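The headline comparison is a test of equality of mean ratings between female and male judges. A minimal sketch with simulated scores (the study's raw data are not reproduced here; the group sizes and score distribution are assumptions):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(7)
# Hypothetical ratings; counts loosely echo the 1,736 observations in the abstract.
female = rng.normal(88.0, 4.0, size=900)
male = rng.normal(88.0, 4.0, size=836)

t_stat, p_value = ttest_ind(female, male, equal_var=False)   # Welch's t-test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")   # large p: consistent with a zero mean difference
```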


Author(s): J.E. Radcliffe

Production of pastures measured by the trim and difference cutting techniques is presented for sites on North Island hill pastures and South Island improved tussock grasslands. On North Island sites the trim technique consistently gave higher yields. On South Island sites, interim results have shown no consistent overall effect, although large differences in yields have been measured at some cuts. The number of samples required by the trim technique to give a standard error of ±10% of mean yield was 30 on North Island sites (sample size 0.3 m × 0.2 m) and 15 on South Island sites (sample size 0.5 m × 0.5 m). The difference technique was much more variable and required 5 and 20 times more samples on North and South Island sites, respectively, to give a standard error of ±10% of the mean yield. On another North Island site, large, laxly trimmed sampling sites gave higher yields than smaller, severely trimmed sampling sites, and one large sample (3.4 m × 1.0 m) generally gave similar precision in yields to 6 or 7 smaller samples (0.8 m × 0.3 m).
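The sample-size figures follow from the standard relation SE = s/√n: requiring the standard error to be 10% of the mean gives n = (CV/0.10)², where CV = s/mean. A small sketch under that reading of the abstract:

```python
import math

def samples_needed(cv, target=0.10):
    """Smallest n with SE/mean <= target, since SE/mean = cv / sqrt(n)."""
    return math.ceil((cv / target) ** 2)

def implied_cv(n, target=0.10):
    """Coefficient of variation implied by needing n samples to hit the target."""
    return target * math.sqrt(n)

print(f"CV implied by n=30 (trim, North Island): {implied_cv(30):.2f}")  # ~0.55
print(f"n needed at CV=0.55: {samples_needed(0.55)}")                    # ~30
```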


2018 · Vol 156 (4) · pp. 725–744
Author(s): Judy A. Massare, Dean R. Lomax

Abstract The abundance of specimens of Ichthyosaurus provides an opportunity to assess morphological variation without the limits of a small sample size. This research evaluates the variation and taxonomic utility of hindfin morphology. Two seemingly distinct morphotypes of the mesopodium occur in the genus. Morphotype 1 has three elements in the third row: metatarsal two, distal tarsal three and distal tarsal four. This is the common morphology in Ichthyosaurus breviceps, I. conybeari and I. somersetensis. Morphotype 2 has four elements in the third row, owing to a bifurcation. This morphotype occurs in at least some specimens of each species, but it has several variations distinguished by the extent of contact of elements in the third row with the astragalus. Two specimens display a different morphotype in each fin, suggesting that the difference reflects individual variation. In Ichthyosaurus, the hindfin is taxonomically useful at the genus level, but species cannot be identified unequivocally from a well-preserved hindfin, although certain morphologies are more common in certain species than others. The large sample size filled in morphological gaps between what initially appeared to be taxonomically distinct characters. The full picture of variation would have been obscured with a small sample size. Furthermore, we have found several unusual morphologies which, in isolation, could have been mistaken for new taxa. Thus, one must be cautious when describing new species or genera on the basis of limited material, such as isolated fins and fragmentary specimens.


2021 · Vol 11 (1)
Author(s): Estibaliz Gómez-de-Mariscal, Vanesa Guerrero, Alexandra Sneider, Hasini Jayatilaka, Jude M. Phillip, et al.

Abstract Biomedical research has come to rely on p-values as a deterministic measure for data-driven decision-making. In the widely used null hypothesis significance testing framework for identifying statistically significant differences among groups of observations, a single p-value is computed from sample data. It is then routinely compared with a threshold, commonly set to 0.05, to assess the evidence against the null hypothesis of no difference among groups. Because the estimated p-value tends to decrease as the sample size increases, applying this methodology to datasets with large sample sizes results in rejection of the null hypothesis, making the test uninformative in this setting. We propose a new approach to detect differences based on the dependence of the p-value on the sample size. We introduce new descriptive parameters that overcome the effect of sample size on p-value interpretation in the framework of datasets with large sample sizes, reducing the uncertainty in the decision about the existence of biological differences between the compared experiments. The methodology enables the graphical and quantitative characterization of the differences between the compared experiments, guiding researchers in the decision process. An in-depth study of the methodology is carried out on simulated and experimental data. Code is available at https://github.com/BIIG-UC3M/pMoSS.
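The core observation, that p-values shrink toward zero as n grows even when the underlying difference is negligible, is easy to reproduce. The sketch below only illustrates that phenomenon with a toy t-test; it does not implement the authors' p(n) modelling (see their pMoSS repository for that).

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
effect = 0.05   # a tiny, arguably irrelevant true difference (assumed)

# With enough observations, even this negligible effect becomes "significant".
for n in (100, 1_000, 10_000, 100_000):
    a = rng.normal(0.0, 1.0, size=n)
    b = rng.normal(effect, 1.0, size=n)
    print(f"n={n:>7}: p = {ttest_ind(a, b).pvalue:.3g}")
```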


2018
Author(s): Sigit Haryadi

We cannot be sure exactly what will happen; we can only estimate it using a particular method, where each method must have a formula to create a regression equation and a formula to calculate the confidence level of the estimated value. This paper conveys a method for estimating future values in which the regression equation is built on the assumption that the future value depends on the differences of past values, each divided by a weight factor corresponding to its time span to the present, and the level of confidence is calculated using the "Haryadi Index". The advantage of this method is that it remains accurate regardless of the sample size and may ignore past values considered irrelevant.
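The description is informal, so any implementation is necessarily a guess. The sketch below is one assumed formalization: each past first-difference is divided by a weight equal to its time span to the present, and the weighted-average trend extrapolates the last value. The paper's exact formula, and the "Haryadi Index" confidence measure, are not reproduced here.

```python
def forecast_next(values):
    """Hypothetical reading of the described rule: extrapolate the last value by
    past first-differences, each weighted by 1 / (time span to the present),
    so the most recent difference counts most."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    spans = range(len(diffs), 0, -1)          # oldest difference has the largest span
    weights = [1.0 / s for s in spans]
    trend = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    return values[-1] + trend

print(forecast_next([10, 12, 13, 17]))   # ~19.8 under this weighting
```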


2020 · Vol 20 (1)
Author(s): Rhonda J. Rosychuk, Jeff W.N. Bachman, Anqi Chen, X. Joan Hu

Abstract Background Administrative databases offer vast amounts of data that provide opportunities for cost-effective insights. They simultaneously pose significant challenges to statistical analysis, such as the redaction of data because of privacy policies and the provision of data that may not be at the level of detail required. For example, having only ages in years rather than birthdates at event dates can pose challenges to the analysis of recurrent event data. Methods Hu and Rosychuk provided a strategy for estimating age-varying effects in a marginal regression analysis of recurrent event times when birthdates are all missing. They analyzed emergency department (ED) visits made by children and youth, for whom privacy rules prevented birthdates from being released, and justified their approach via a simulation and asymptotic study. With recent changes in data access rules, we requested a new extract of data for April 2010 to March 2017 that includes patient birthdates. This allows us to compare the estimates from the Hu and Rosychuk (HR) approach for coarsened ages with estimates under the true, known ages, to further examine their approach numerically. The performance of the HR approach under five scenarios is considered: a uniform distribution for missing birthdates, a uniform distribution for missing birthdates with supplementary data on age, an empirical distribution for missing birthdates, a smaller sample size, and an additional year of data. Results Data from 33,299 subjects provided 58,166 ED visits. About 67% of subjects had one ED visit and fewer than 9% of subjects made more than three visits during the study period. Most visits (84.0%) were made by teenagers between 13 and 17 years old. The uniform distribution and the HR modeling approach capture the main trends over age of the estimates when compared to the known birthdates. Boys had higher ED visit frequencies than girls at younger ages, whereas girls had higher ED visit frequencies than boys at older ages. Including additional age data based on age at the end of the fiscal year did not sufficiently narrow the widths of potential birthdate intervals to influence estimates. The empirical distribution of the known birthdates was close to a uniform distribution; therefore, use of the empirical distribution did not change the estimates obtained by assuming a uniform distribution for the missing birthdates. The HR approach performed well for a smaller sample size, although estimates were less smooth when there were very few ED visits at some younger ages. When an additional year of data was added, the estimates became better at these younger ages. Conclusions Overall, the Hu and Rosychuk approach for coarsened ages performed well and captured the key features of the relationships between ED visit frequency and covariates.
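The uniform-birthdate scenario can be sketched directly: an integer age at a visit confines the birthdate to a one-year interval, over which a date is drawn uniformly. A minimal sketch of that imputation step only (leap-day edge cases ignored; this is not the HR estimation procedure itself):

```python
from datetime import date, timedelta
import numpy as np

rng = np.random.default_rng(3)

def birthdate_interval(visit: date, age_years: int):
    """Possible birthdates given an integer age at a visit date."""
    latest = visit.replace(year=visit.year - age_years)
    earliest = visit.replace(year=visit.year - age_years - 1) + timedelta(days=1)
    return earliest, latest

def impute_birthdate(visit: date, age_years: int) -> date:
    """Draw a birthdate uniformly over the interval implied by the recorded age."""
    lo, hi = birthdate_interval(visit, age_years)
    return lo + timedelta(days=int(rng.integers((hi - lo).days + 1)))

print(impute_birthdate(date(2015, 6, 1), 14))   # some date in (2000-06-01, 2001-06-01]
```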


2021 · Vol 15 (1)
Author(s): Weitong Cui, Huaru Xue, Lei Wei, Jinghua Jin, Xuewen Tian, et al.

Abstract Background RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) were not reproducible. Results Our findings demonstrate that poor reproducibility of DE results exists not only for small sample sizes, but also for relatively large sample sizes. Quite a few of the DEGs detected are specific to the samples in use, rather than genuinely differentially expressed under different conditions. Poor reproducibility of DE results is mainly caused by high variation of gene expression levels for the same gene in different samples. Even though biological variation may account for much of the high variation of gene expression levels, the effect of outlier count data also needs to be treated seriously, as outlier data severely interfere with DE analysis. Conclusions High heterogeneity exists not only in tumor tissue samples of each cancer type studied, but also in normal samples. High heterogeneity leads to poor reproducibility of DEGs, undermining generalization of differential expression results. Therefore, it is necessary to use large sample sizes (at least 10 if possible) in RNA-Seq experimental designs to reduce the impact of biological variability and DE results should be interpreted cautiously unless soundly validated.
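The reproducibility question can be probed with a toy simulation: run the same DE test on disjoint replicate subsets and measure the overlap of the resulting DEG sets. Everything below (the lognormal noise model, a plain t-test standing in for a real DE tool, the planted fold changes) is an illustrative assumption, not the authors' pipeline.

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(5)
genes, reps = 2_000, 24

base = rng.lognormal(2.0, 1.0, size=genes)
tumor = base[:, None] * rng.lognormal(0.0, 0.8, size=(genes, reps))
normal = base[:, None] * rng.lognormal(0.0, 0.8, size=(genes, reps))
tumor[:300] *= 2.0   # plant a genuine two-fold change in 300 genes

def degs(x, y):
    """DEG indices from a per-gene t-test at p < 0.05 (toy stand-in for a DE tool)."""
    p = ttest_ind(x, y, axis=1).pvalue
    return set(np.where(p < 0.05)[0])

for n in (3, 12):
    a = degs(tumor[:, :n], normal[:, :n])
    b = degs(tumor[:, n:2 * n], normal[:, n:2 * n])
    print(f"n={n:>2}: Jaccard overlap of DEG sets from disjoint subsets = "
          f"{len(a & b) / max(len(a | b), 1):.2f}")
```

The overlap rises with the number of replicates per subset, echoing the finding that small-n DEG lists are dominated by sample-specific calls.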

