Non-stationarity in annual and seasonal series of peak flow and precipitation in the UK

2014 ◽  
Vol 14 (5) ◽  
pp. 1125-1144 ◽  
Author(s):  
I. Prosdocimi ◽  
T. R. Kjeldsen ◽  
C. Svensson

Abstract. When designing or maintaining a hydraulic structure, an estimate of the frequency and magnitude of extreme events is required. The most common methods to obtain such estimates rely on the assumption of stationarity, i.e. the assumption that the stochastic process under study is not changing. Public perception of, and concern about, a changing climate have led to a wide debate on the validity of this assumption. In this work, trends in annual and seasonal maxima of peak river flow and catchment-average daily rainfall are explored. Assuming a two-parameter log-normal distribution, a linear regression model is applied, allowing the mean of the distribution to vary with time. For the river flow data, the linear model is extended to include an additional variable, the 99th percentile of the daily rainfall for each year. From the fitted models, dimensionless magnification factors are estimated and plotted on a map, shedding light on whether geographical coherence can be found in the significant changes. The implications of the identified trends from a decision-making perspective are then discussed, in particular with regard to the Type I and Type II error probabilities. One striking feature of the estimated trends is that the high variability found in the data leads to very inconclusive test results. Indeed, for most stations it is impossible to state whether the current design standards for the 2085 horizon can be considered precautionary. The power of tests on trends is further discussed in the light of statistical power analysis and sample size calculations. Given the observed variability in the data, sample sizes of several hundred years would be needed to confirm or negate the current safety margins when using at-site analysis.
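
To make the modelling approach in this abstract concrete, the sketch below fits a log-normal distribution with a time-varying mean to synthetic annual maxima and derives a dimensionless magnification factor for a 2085-type horizon. It is a minimal Python illustration; the data, parameter values and variable names are assumptions, not the authors' code or results.

```python
# Minimal sketch (not the authors' code): fit a log-normal model to annual
# maximum (AMAX) flows with a mean that varies linearly in time, then derive
# a dimensionless magnification factor for a future horizon. The synthetic
# record and all numbers below are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
years = np.arange(1960, 2013)                      # hypothetical record length
log_amax = 4.0 + 0.002 * (years - years[0]) + rng.normal(0, 0.35, years.size)

# Two-parameter log-normal with time-varying mean == linear regression on the log scale
slope, intercept, r, p_value, stderr = stats.linregress(years, log_amax)

# Dimensionless magnification factor: ratio of any quantile at the target
# horizon (e.g. 2085) to the same quantile at a reference year; with constant
# sigma this reduces to exp(slope * horizon_length).
horizon = 2085 - 2012
magnification = np.exp(slope * horizon)

print(f"trend slope = {slope:.4f} per year (p = {p_value:.2f})")
print(f"magnification factor to 2085 ≈ {magnification:.2f}")
```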


2004 ◽  
Vol 3 (1) ◽  
pp. 1-32 ◽  
Author(s):  
Derek Gordon ◽  
Yaning Yang ◽  
Chad Haynes ◽  
Stephen J Finch ◽  
Nancy R Mendell ◽  
...  

Phenotype and/or genotype misclassification can significantly increase type II error probabilities in genetic case/control association studies, thereby decreasing statistical power, and can produce inaccurate estimates of population frequency parameters. We present a method, the likelihood ratio test allowing for errors (LRTae), that incorporates double-sample information for phenotypes and/or genotypes on a sub-sample of cases/controls. Population frequency parameters and misclassification probabilities are estimated using a double-sample procedure implemented via the Expectation-Maximization (EM) algorithm. We perform null simulations assuming a SNP marker or a 4-allele (multi-allele) marker locus. To compare our method with the standard method that makes no adjustment for errors (LRTstd), we perform power simulations using a 2^k factorial design with high and low settings of: case/control sample sizes, phenotype/genotype costs, double-sampled phenotype/genotype costs, phenotype/genotype error rates, and proportions of double-sampled individuals. All power simulations are performed fixing equal costs for the LRTstd and LRTae methods. We also consider case/control ApoE genotype data from an actual Alzheimer's study. The LRTae method maintains correct type I error proportions for all null simulations and all significance level thresholds (10%, 5%, 1%). LRTae average estimates of population frequencies and misclassification probabilities are equal to the true values, with variances of the order of 10^-7 to 10^-8. For power simulations, the median power difference LRTae − LRTstd at the 5% significance level is 0.06 for multi-allele data and 0.01 for SNP data. For the ApoE data example, the LRTae and LRTstd p-values are 5.8 × 10^-5 and 1.6 × 10^-3, respectively. The increase in significance is due to the LRTae's adjustment for misclassification of the most commonly reported risk allele. We have developed freely available software that computes our LRTae statistic.
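
As context for the comparison above, the following Python sketch shows a standard likelihood ratio test (the LRTstd-style baseline) for a case/control genotype-frequency comparison; the LRTae itself additionally models misclassification via EM on a double-sampled subset, which is not reproduced here. All counts are hypothetical.

```python
# Illustrative sketch only: a standard likelihood ratio test comparing
# genotype frequencies between cases and controls. Counts are made up.
import numpy as np
from scipy.stats import chi2

cases    = np.array([120,  85, 45])   # genotype counts AA, Aa, aa (hypothetical)
controls = np.array([150, 100, 20])

def multinomial_loglik(counts, probs):
    """Log-likelihood of observed counts under given category probabilities."""
    return np.sum(counts * np.log(probs))

# Alternative hypothesis: separate genotype frequencies in cases and controls
p_cases = cases / cases.sum()
p_ctrl  = controls / controls.sum()
ll_alt = multinomial_loglik(cases, p_cases) + multinomial_loglik(controls, p_ctrl)

# Null hypothesis: one set of pooled genotype frequencies for both groups
pooled = (cases + controls) / (cases + controls).sum()
ll_null = multinomial_loglik(cases, pooled) + multinomial_loglik(controls, pooled)

lrt = 2 * (ll_alt - ll_null)
df = len(cases) - 1                    # extra free parameters under the alternative
print(f"LRT = {lrt:.2f}, p = {chi2.sf(lrt, df):.3g}")
```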


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
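
The sketch below illustrates, under arbitrary assumptions, the kind of simulation this abstract describes: a small literature is generated with publication bias and then screened with one common detection method (Egger's regression test). It is not the authors' simulation code.

```python
# Hedged illustration: simulate a biased literature, then apply Egger's
# regression test for funnel-plot asymmetry. All parameters are assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
true_d, k_target = 0.2, 40                 # true standardized effect, studies wanted
effects, ses = [], []
while len(effects) < k_target:
    n = rng.integers(20, 200)              # per-group sample size of a primary study
    se = np.sqrt(2 / n)                    # rough SE of Cohen's d (ignoring the d^2 term)
    d = rng.normal(true_d, se)             # observed effect of this study
    significant = abs(d) / se > 1.96
    # publication bias: non-significant studies are published with only 20% probability
    if significant or rng.random() < 0.2:
        effects.append(d)
        ses.append(se)

effects, ses = np.array(effects), np.array(ses)

# Egger's test: regress the standardized effects (z) on precision (1/SE);
# an intercept different from zero suggests small-study effects / bias.
X = sm.add_constant(1 / ses)
fit = sm.OLS(effects / ses, X).fit()
print(f"Egger intercept = {fit.params[0]:.2f}, p = {fit.pvalues[0]:.4f}")
```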


Author(s):  
Andrey Ziyatdinov ◽  
Jihye Kim ◽  
Dmitry Prokopenko ◽  
Florian Privé ◽  
Fabien Laporte ◽  
...  

Abstract The effective sample size (ESS) is a metric used to summarize in a single term the amount of correlation in a sample. It is of particular interest when predicting the statistical power of genome-wide association studies (GWAS) based on linear mixed models. Here, we introduce an analytical form of the ESS for mixed-model GWAS of quantitative traits and relate it to empirical estimators recently proposed. Using our framework, we derived approximations of the ESS for analyses of related and unrelated samples and for both marginal genetic and gene-environment interaction tests. We conducted simulations to validate our approximations and to provide a quantitative perspective on the statistical power of various scenarios, including power loss due to family relatedness and power gains due to conditioning on the polygenic signal. Our analyses also demonstrate that the power of gene-environment interaction GWAS in related individuals strongly depends on the family structure and exposure distribution. Finally, we performed a series of mixed-model GWAS on data from the UK Biobank and confirmed the simulation results. We notably found that the expected power drop due to family relatedness in the UK Biobank is negligible.
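
A generic illustration of the effective sample size idea is sketched below: for correlated observations with correlation matrix R, the ESS for estimating a mean can be written as 1' R^(-1) 1. This is a simplified stand-in, not the analytical form for mixed-model GWAS derived in the paper; the family structure and correlation value are assumptions.

```python
# Generic ESS illustration for correlated samples (not the paper's derivation).
import numpy as np

def effective_sample_size(R):
    """ESS = 1' R^{-1} 1 for a correlation matrix R."""
    ones = np.ones(R.shape[0])
    return ones @ np.linalg.solve(R, ones)

# Block-diagonal correlation mimicking families of related individuals
n_families, family_size, rho = 100, 4, 0.5     # hypothetical values
block = np.full((family_size, family_size), rho)
np.fill_diagonal(block, 1.0)
R = np.kron(np.eye(n_families), block)

n = R.shape[0]
print(f"nominal n = {n}, effective n ≈ {effective_sample_size(R):.0f}")
# For equicorrelated blocks this equals n_families * family_size / (1 + (family_size - 1) * rho)
```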


Trials ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Dena R. Howard ◽  
Anna Hockaday ◽  
Julia M. Brown ◽  
Walter M. Gregory ◽  
Susan Todd ◽  
...  

Abstract Background The FLAIR trial in chronic lymphocytic leukaemia has a randomised, controlled, open-label, confirmatory, platform design. FLAIR was successfully amended to include an emerging promising experimental therapy to expedite its assessment, greatly reducing the time to reach the primary outcome compared to running a separate trial and without compromising the validity of the research or the ability to recruit to the trial and report the outcomes. The methodological and practical issues are presented, describing how they were addressed to ensure the amendment was a success. Methods FLAIR was designed as a two-arm trial requiring 754 patients. In stage 2, two new arms were added: a new experimental arm and a second control arm to protect the trial in case of a change in practice. In stage 3, the original experimental arm was closed as its planned recruitment target was reached. In total, 1516 participants will be randomised to the trial. Results The changes to the protocol and randomisation to add and stop arms were made seamlessly without pausing recruitment. The statistical considerations to ensure the results for the original and new hypotheses are unbiased were approved following peer review by oversight committees, Cancer Research UK, ethical and regulatory committees and pharmaceutical partners. These included the use of concurrent comparators in case of any stage effect, appropriate control of the type I error rate and consideration of analysis methods across trial stages. The operational aspects of successfully implementing the amendments are described, including gaining approvals and additional funding, data management requirements and implementation at centres. Conclusions FLAIR is an exemplar of how an emerging experimental therapy can be assessed within an existing trial structure without compromising the conduct, reporting or validity of the trial. This strategy offered considerable resource savings and allowed the new experimental therapy to be assessed within a confirmatory trial in the UK years earlier than would have otherwise been possible. Despite the clear efficiencies, treatment arms are rarely added to ongoing trials in practice. This paper demonstrates how this strategy is acceptable, feasible and beneficial to patients and the wider research community. Trial registration ISRCTN Registry ISRCTN01844152. Registered on August 08, 2014


Biostatistics ◽  
2017 ◽  
Vol 18 (3) ◽  
pp. 477-494 ◽  
Author(s):  
Jakub Pecanka ◽  
Marianne A. Jonker ◽  
Zoltan Bochdanovits ◽  
Aad W. Van Der Vaart ◽  

Summary For over a decade, functional gene-to-gene interaction (epistasis) has been suspected to be a determinant in the "missing heritability" of complex traits. However, searching for epistasis on the genome-wide scale has been challenging due to the prohibitively large number of tests, which results in a serious loss of statistical power as well as computational challenges. In this article, we propose a two-stage method applicable to existing case-control data sets, which aims to lessen both of these problems by pre-assessing whether a candidate pair of genetic loci is involved in epistasis before it is actually tested for interaction with respect to a complex phenotype. The pre-assessment is based on a two-locus genotype independence test performed in the sample of cases. Only the pairs of loci that exhibit non-equilibrium frequencies are analyzed via a logistic regression score test, thereby reducing the multiple testing burden. Since only the computationally simple independence tests are performed for all pairs of loci, while the more demanding score tests are restricted to the most promising pairs, a genome-wide association study (GWAS) for epistasis becomes feasible. By design, our method provides strong control of the type I error. Its favourable power properties, especially under the practically relevant misspecification of the interaction model, are illustrated. Ready-to-use software is available. Using the method, we analyzed Parkinson's disease in four cohorts and identified possible interactions within several SNP pairs in multiple cohorts.
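
The two-stage idea can be sketched as follows (a simplified Python illustration, not the authors' implementation): a cheap genotype-independence test among cases screens a SNP pair, and only pairs passing the screen are tested for interaction in a logistic regression on the full sample. Here a Wald test of the interaction term stands in for the score test used in the paper, and the simulated data carry no real signal.

```python
# Simplified two-stage screening sketch for one SNP pair; data are simulated.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 2000
snp1 = rng.integers(0, 3, n)            # genotypes coded 0/1/2
snp2 = rng.integers(0, 3, n)
status = rng.integers(0, 2, n)          # 1 = case, 0 = control (no real signal here)

# Stage 1: independence of the two loci among cases only (cheap screening step)
cases = status == 1
table = pd.crosstab(snp1[cases], snp2[cases])
chi2_stat, p_screen, dof, _ = chi2_contingency(table)

screen_threshold = 0.05                  # assumed screening threshold
if p_screen < screen_threshold:
    # Stage 2: logistic regression with an interaction term on the full sample
    # (Wald test of the interaction; the paper uses a score test)
    X = sm.add_constant(np.column_stack([snp1, snp2, snp1 * snp2]))
    fit = sm.Logit(status, X).fit(disp=0)
    print(f"interaction p = {fit.pvalues[-1]:.3f}")
else:
    print(f"pair screened out (independence p = {p_screen:.2f})")
```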


1989 ◽  
Vol 46 (12) ◽  
pp. 2157-2165 ◽  
Author(s):  
Steven P. Ferraro ◽  
Faith A. Cole ◽  
Waldemar A. DeBen ◽  
Richard C. Swartz

Power-cost efficiency (PCE_i = (n × c)_min / (n_i × c_i), where i indexes the sampling scheme, n_i = minimum number of replicate samples needed to detect a difference between locations with acceptable probabilities of Type I (α) and Type II (β) error (e.g. α = β = 0.05), c_i = mean "cost", in time or money, per replicate sample, and (n × c)_min = minimum value of (n_i × c_i) among the sampling schemes) is the appropriate expression for comparing the cost efficiency of alternative sampling schemes of equivalent statistical rigor when the statistical model is a comparison of two means. PCEs were determined for eight macrobenthic sampling schemes (four sample unit sizes and two sieve mesh sizes) in a comparison of a reference site versus a putative polluted site in Puget Sound, Washington. Laboratory processing times were, on average, about 2.5 times greater for the finer than for the coarser sieve mesh samples. The 0.06-m², 0- to 8-cm-deep sample unit size with the 1.0-mm sieve mesh was the overall optimum sampling scheme in this study; it ranked first in PCE on 8 of the 11 measures of community structure and second on the remaining 3. Rank order by statistical power of the 11 measures for this scheme was Infaunal Index > log10(mollusc biomass + 1) > number of species > log10(numerical abundance) > log10(polychaete biomass + 1) > log10(total biomass + 1) > log10(crustacean biomass + 1) > McIntosh's Index > 1 − Simpson's Index > Shannon's Index > Dominance Index.
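
A worked example of the PCE index defined above, with made-up replicate requirements and per-sample costs for three hypothetical schemes:

```python
# Worked illustration of power-cost efficiency (PCE); all numbers are assumptions.
n_replicates = {"scheme_A": 12, "scheme_B": 8, "scheme_C": 20}      # n_i at alpha = beta = 0.05
cost_per_rep = {"scheme_A": 1.5, "scheme_B": 3.0, "scheme_C": 0.8}  # mean hours per replicate

total_cost = {s: n_replicates[s] * cost_per_rep[s] for s in n_replicates}
min_cost = min(total_cost.values())

# PCE_i = (n x c)_min / (n_i x c_i): the cheapest adequate scheme gets PCE = 1
pce = {s: min_cost / total_cost[s] for s in total_cost}
for s, v in sorted(pce.items(), key=lambda kv: -kv[1]):
    print(f"{s}: total cost = {total_cost[s]:.1f} h, PCE = {v:.2f}")
```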


Parasitology ◽  
2013 ◽  
Vol 140 (14) ◽  
pp. 1768-1776 ◽  
Author(s):  
A. BURRELLS ◽  
P. M. BARTLEY ◽  
I. A. ZIMMER ◽  
S. ROY ◽  
A. C. KITCHENER ◽  
...  

SUMMARY: Toxoplasma gondii is a zoonotic pathogen defined by three main clonal lineages (types I, II, III), of which type II is most common in Europe. Very few data exist on the prevalence and genotypes of T. gondii in the UK. Wildlife can act as sentinel species for T. gondii genotypes present in the environment, which may subsequently be transmitted to livestock and humans. DNA was extracted from tissue samples of wild British carnivores, including 99 ferrets, 83 red foxes, 70 polecats, 65 mink, 64 badgers and 9 stoats. Parasite DNA was detected using a nested ITS1 PCR specific for T. gondii; PCR-positive samples were subsequently genotyped using five PCR–RFLP markers. Toxoplasma gondii DNA was detected within all these mammal species and prevalence varied from 6.0 to 44.4% depending on the host. PCR–RFLP genotyping identified type II as the predominant lineage, but type III and type I alleles were also identified. No atypical or mixed genotypes were identified within these animals. This study demonstrates the presence of alleles for all three clonal lineages with the potential for transmission to cats and livestock. This is the first DNA-based study of T. gondii prevalence and genotypes across a broad range of wild British carnivores.


2018 ◽  
Vol 80 (6) ◽  
Author(s):  
Siti Mariam Saad ◽  
Abdul Aziz Jemain ◽  
Noriszura Ismail

This study evaluates the utility and suitability of a simple discrete multiplicative random cascade model for temporal rainfall disaggregation. Two simple random cascade models, namely the log-Poisson and the log-Normal models, are applied to simulate hourly rainfall from daily rainfall at seven rain gauge stations in Peninsular Malaysia. The cascade models are evaluated based on their capability to simulate data that preserve three important properties of observed rainfall: rainfall variability, intermittency and extreme events. The results show that both cascade models are able to simulate reasonably well the commonly used statistical measures of rainfall variability (e.g. mean and standard deviation) of hourly rainfall. With respect to rainfall intermittency, even though both models underestimate the observed dry proportion, the log-Normal model is likely to simulate the number of dry spells better than the log-Poisson model. In terms of rainfall extremes, it is demonstrated that the log-Poisson and log-Normal models gave a satisfactory performance for most of the stations studied herein, except for the Dungun and Kuala Krai stations, which are both located in the eastern part of the Peninsula.
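
To illustrate the general principle of a discrete multiplicative random cascade (not the authors' calibrated log-Poisson or log-Normal models), the sketch below splits a daily total recursively into halves using random log-normal weights that are renormalised at each branching so the daily sum is conserved exactly; the branching depth and parameter values are assumptions.

```python
# Hedged sketch of a discrete multiplicative random cascade for rainfall
# disaggregation: a daily total is split recursively into 2**levels intervals.
import numpy as np

rng = np.random.default_rng(0)

def lognormal_cascade(total, levels, sigma=0.6):
    """Disaggregate `total` into 2**levels intervals with a multiplicative cascade."""
    values = np.array([total], dtype=float)
    for _ in range(levels):
        # one log-normal weight per child, renormalised within each parent so
        # that mass is conserved exactly at every branching step
        w = rng.lognormal(mean=0.0, sigma=sigma, size=(values.size, 2))
        w /= w.sum(axis=1, keepdims=True)
        values = (values[:, None] * w).ravel()
    return values

daily_total = 38.0                           # mm, hypothetical wet day
fine = lognormal_cascade(daily_total, levels=5)
print(f"sum of disaggregated values = {fine.sum():.1f} mm")
print(f"max sub-interval amount     = {fine.max():.1f} mm")
```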


2019 ◽  
Author(s):  
Rob Cribbie ◽  
Nataly Beribisky ◽  
Udi Alter

Many bodies recommend that a sample planning procedure, such as traditional NHST a priori power analysis, be conducted during the planning stages of a study. Power analysis allows the researcher to estimate how many participants are required in order to detect a minimally meaningful effect size at a specific level of power and Type I error rate. However, there are several drawbacks to the procedure that render it "a mess." Specifically, the identification of the minimally meaningful effect size is often difficult but unavoidable for conducting the procedure properly, the procedure is not precision oriented, and it does not guide the researcher to collect as many participants as feasibly possible. In this study, we explore how these three theoretical issues are reflected in applied psychological research in order to better understand whether these issues are concerns in practice. To investigate how power analysis is currently used, this study reviewed the reporting of 443 power analyses in high-impact psychology journals in 2016 and 2017. It was found that researchers rarely use the minimally meaningful effect size as a rationale for the chosen effect in a power analysis. Further, precision-based approaches and collecting the maximum sample size feasible are almost never used in tandem with power analyses. In light of these findings, we suggest that researchers should focus on tools beyond traditional power analysis when sample planning, such as collecting the maximum sample size feasible.
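
For reference, a minimal worked example of the a priori power analysis discussed above, using the statsmodels library; the effect size, alpha and power values are arbitrary illustrations, not recommendations.

```python
# A priori power analysis example: sample size per group for a two-sample
# t-test at an assumed minimally meaningful effect size.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.30,  # assumed minimally meaningful Cohen's d
                                   alpha=0.05,        # Type I error rate
                                   power=0.80)        # 1 - Type II error rate
print(f"required sample size per group ≈ {n_per_group:.0f}")  # roughly 175 with these inputs
```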

