Application of Statistical Power Analysis to the Oregon Coho Salmon (Oncorhynchus kisutch) Problem

1989 ◽  
Vol 46 (7) ◽  
pp. 1183-1187 ◽  
Author(s):  
Randall M. Peterman

Nickelson (1986; Can. J. Fish. Aquat. Sci. 43: 527–535) was unable to reject the null hypothesis (Ho) of density-independent marine survival rate for Oregon coho salmon (Oncorhynchus kisutch) when wild, private hatchery, and public hatchery stocks were analyzed separately. Thus, even though there appears to have been no consistent increase in adult abundance in recent years in spite of large increases in smolt abundance, Nickelson's analysis does not support the alternative hypothesis (HA) of density-dependent marine survival. Some fishery managers are using Nickelson's results to support proposals to increase smolt production further. I calculated statistical power for these cases, i.e. the probability that the null hypothesis of density-independence could have been rejected, even if marine survival were truly density-dependent. Power was below 0.19 for all cases, which meant that Nickelson (1986) had at least an 81% chance of making a Type II error (incorrectly accepting Ho), if Ho was actually false. Therefore, Oregon fishery managers should be cautious about making decisions on increased smolt production based on current data; they run a high risk of mistakenly assuming density-independent marine survival. More generally, managers should not take action based on a failure to reject a null hypothesis unless power is high.
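The step from "power below 0.19" to "at least an 81% chance of a Type II error" is the complement rule, beta = 1 - power. A minimal sketch of that arithmetic follows; the function name is illustrative and does not come from the paper.

```python
def type_ii_error_rate(power: float) -> float:
    """Return beta, the probability of failing to reject a false null hypothesis."""
    if not 0.0 <= power <= 1.0:
        raise ValueError("power must lie between 0 and 1")
    return 1.0 - power


# The abstract reports power below 0.19 in every case,
# so beta is at least 1 - 0.19.
beta_lower_bound = type_ii_error_rate(0.19)
print(f"{beta_lower_bound:.2f}")  # 0.81
```

Because power is an upper bound here, the value returned is a lower bound on the Type II error rate.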


2005 ◽  
Vol 62 (12) ◽  
pp. 2716-2726 ◽  
Author(s):  
Michael J Bradford ◽  
Josh Korman ◽  
Paul S Higgins

There is considerable uncertainty about the effectiveness of fish habitat restoration programs, and reliable monitoring programs are needed to evaluate them. Statistical power analysis based on traditional hypothesis tests is usually used for monitoring program design, but here we argue that effect size estimates and their associated confidence intervals are more informative because results can be compared with both the null hypothesis of no effect and effect sizes of interest, such as restoration goals. We used a stochastic simulation model to compare alternative monitoring strategies for a habitat alteration that would change the productivity and capacity of a coho salmon (Oncorhynchus kisutch) producing stream. Estimates of the effect size using a freshwater stock–recruit model were more precise than those from monitoring the abundance of either spawners or smolts. Less-than-ideal monitoring programs can produce ambiguous results, that is, cases in which the confidence interval includes both the null hypothesis and the effect size of interest. Our model is a useful planning tool because it allows the evaluation of the utility of different types of monitoring data, which should stimulate discussion on how the results will ultimately inform decision-making.
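The definition of an "ambiguous" result given in this abstract can be written as a small decision rule. The sketch below is an illustration only: the function name, the labels, and the example numbers are hypothetical and are not taken from the authors' simulation model.

```python
def classify_interval(lower: float, upper: float,
                      effect_of_interest: float,
                      null_value: float = 0.0) -> str:
    """Compare a confidence interval with the null value and a target effect size."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    includes_null = lower <= null_value <= upper
    includes_target = lower <= effect_of_interest <= upper
    if includes_null and includes_target:
        return "ambiguous"  # cannot separate 'no effect' from 'goal met'
    if includes_null:
        return "compatible with no effect; target excluded"
    if includes_target:
        return "compatible with target; no effect excluded"
    return "excludes both no effect and target"


# Hypothetical example: the restoration goal is a 0.5 increase in some index.
print(classify_interval(-0.1, 0.7, effect_of_interest=0.5))  # ambiguous
print(classify_interval(0.2, 0.7, effect_of_interest=0.5))   # compatible with target; no effect excluded
```

A wide interval from an imprecise monitoring design lands in the first branch, which is the situation the abstract warns about.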



Author(s):  
Valentin Amrhein ◽  
Fränzi Korner-Nievergelt ◽  
Tobias Roth

The widespread use of 'statistical significance' as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process (American Statistical Association, Wasserstein & Lazar 2016). We review why degrading p-values into 'significant' and 'nonsignificant' contributes to making studies irreproducible, or to making them seem irreproducible. A major problem is that we tend to take small p-values at face value, but mistrust results with larger p-values. In either case, p-values can tell little about reliability of research, because they are hardly replicable even if an alternative hypothesis is true. Also significance (p≤0.05) is hardly replicable: at a realistic statistical power of 40%, given that there is a true effect, only one in six studies will significantly replicate the significant result of another study. Even at a good power of 80%, results from two studies will be conflicting, in terms of significance, in one third of the cases if there is a true effect. This means that a replication cannot be interpreted as having failed only because it is nonsignificant. Many apparent replication failures may thus reflect faulty judgement based on significance thresholds rather than a crisis of unreplicable research. Reliable conclusions on replicability and practical importance of a finding can only be drawn using cumulative evidence from multiple independent studies. However, applying significance thresholds makes cumulative knowledge unreliable. One reason is that with anything but ideal statistical power, significant effect sizes will be biased upwards. Interpreting inflated significant results while ignoring nonsignificant results will thus lead to wrong conclusions. But current incentives to hunt for significance lead to publication bias against nonsignificant findings. Data dredging, p-hacking and publication bias should be addressed by removing fixed significance thresholds. 
Consistent with the recommendations of the late Ronald Fisher, p-values should be interpreted as graded measures of the strength of evidence against the null hypothesis. Also larger p-values offer some evidence against the null hypothesis, and they cannot be interpreted as supporting the null hypothesis, falsely concluding that 'there is no effect'. Information on possible true effect sizes that are compatible with the data must be obtained from the observed effect size, e.g., from a sample average, and from a measure of uncertainty, such as a confidence interval. We review how confusion about interpretation of larger p-values can be traced back to historical disputes among the founders of modern statistics. We further discuss potential arguments against removing significance thresholds, such as 'we need more stringent decision rules', 'sample sizes will decrease' or 'we need to get rid of p-values'.
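The replication figures quoted in this abstract (one in six at 40% power, one third conflicting at 80% power) can be reproduced with elementary probability. The sketch below assumes two independent studies of a true effect, each with the same power; the function names are illustrative.

```python
def prob_both_significant(power: float) -> float:
    """Probability that two independent studies are both significant."""
    return power * power


def prob_conflicting(power: float) -> float:
    """Probability that exactly one of two independent studies is significant."""
    return 2.0 * power * (1.0 - power)


for power in (0.4, 0.8):
    print(f"power={power:.1f}  both={prob_both_significant(power):.2f}  "
          f"conflicting={prob_conflicting(power):.2f}")
# power=0.4  both=0.16  conflicting=0.48
# power=0.8  both=0.64  conflicting=0.32
```

At 40% power the chance that both studies are significant is 0.16, roughly one in six, and at 80% power the chance of a conflict is 0.32, roughly one third.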



1983 ◽  
Vol 40 (8) ◽  
pp. 1212-1223 ◽  
Author(s):  
Randall M. Peterman ◽  
Richard D. Routledge

Large-scale experimental manipulation of juvenile salmon (Oncorhynchus spp.) abundance can provide a test of the hypothesis of linearity in the smolt-to-adult abundance relation. However, not all manipulations will be equally informative owing to large variability in marine survival. We use Monte Carlo simulation and an analytical approximation to calculate for Oregon coho salmon (O. kisutch) the statistical power of the test involving different controlled smolt abundances and durations of experiments. One recently proposed experimental release of 48 million smolts for each of 3 yr has a relatively low power and, as a consequence, is unlikely to show clearly whether the smolt-to-adult relationship is linear. The number of smolts required for a powerful test of the hypothesis of linearity is closer to the 88 million suggested in another proposal. To prevent confounding of interpretation of results, all other human sources of variability in fish should be minimized by establishing standardized rearing and release procedures during the experiment. In addition, appropriate preexperiment data on coho food, predators, and competitors will increase effectiveness of experiments by providing information on mechanisms of change in marine survival.
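The general idea of checking a Monte Carlo power estimate against an analytical approximation can be sketched in a few lines. This is not the authors' model: it assumes a deliberately simplified one-sided z-test on the mean drop in log survival over a few experimental years, with a known standard deviation, and every numerical value is hypothetical.

```python
import random
from statistics import NormalDist


def analytic_power(effect: float, sd: float, n_years: int, alpha: float = 0.05) -> float:
    """Power of a one-sided z-test on the mean of n_years observations (known sd)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(effect * n_years ** 0.5 / sd - z_crit)


def monte_carlo_power(effect: float, sd: float, n_years: int, alpha: float = 0.05,
                      n_sims: int = 20000, seed: int = 1) -> float:
    """Estimate the same power by simulating many hypothetical experiments."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(n_sims):
        # Simulated yearly drops in log survival around the true effect.
        sample_mean = sum(rng.gauss(effect, sd) for _ in range(n_years)) / n_years
        z = sample_mean * n_years ** 0.5 / sd
        if z > z_crit:
            rejections += 1
    return rejections / n_sims


# Hypothetical values: a true drop of 0.3 in log survival, sd 0.6, 3 years.
print(round(analytic_power(0.3, 0.6, 3), 3))
print(round(monte_carlo_power(0.3, 0.6, 3), 3))
```

With high year-to-year variability and only three years, both estimates come out low, which mirrors the abstract's point that short experiments with modest manipulations are unlikely to be informative.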



1990 ◽  
Vol 47 (11) ◽  
pp. 2181-2194 ◽  
Author(s):  
L. Blair Holtby ◽  
Bruce C. Andersen ◽  
Ronald K. Kadowaki

The importance of smolt size and early ocean growth to the marine survival of coho salmon was examined over a 17-yr period at Carnation Creek, British Columbia. Comparisons of overall marine survival were made both between years, using two smolt age-groups of different mean sizes, and within years, using observed smolt size distributions and smolt size distributions back-calculated from the scales of returning adults. Large size did not confer a consistent survival advantage but large smolts did survive better in years when marine survival was relatively poor. Marine survival was strongly correlated with early ocean growth as estimated by the spacing of the first five ocean circuli on the scales of returning adults. Marine survival and early ocean growth were positively correlated with ocean conditions indicative of strong upwelling along the northwest coast of Vancouver Island. Neither smolt survival nor early ocean growth was correlated with regional coho smolt production. Our observations suggest that interannual variability in smolt survival was being driven by ocean conditions that determined smolt growth rates, which subsequently affected the susceptibility of smolts to a size-selective predator.



2004 ◽  
Vol 61 (3) ◽  
pp. 360-373 ◽  
Author(s):  
P W Lawson ◽  
E A Logerwell ◽  
N J Mantua ◽  
R C Francis ◽  
V N Agostini

Climate variability is well known to affect the marine survival of coho salmon (Oncorhynchus kisutch) in Oregon and Washington. Marine factors have been used to explain up to 83% of the variability in Oregon coastal natural coho salmon recruitment, yet about half the variability in coho salmon recruitment comes from the freshwater phase of the life cycle. This seeming paradox could be resolved if freshwater variability were linked to climate and climate factors influencing marine survival were correlated with those affecting freshwater survival. Effects of climate on broad-scale fluctuations in freshwater survival or production are not well known. We examined the influence of seasonal stream flows and air temperature on freshwater survival and production of two stock units: Oregon coastal natural coho salmon and Queets River coho salmon from the Washington Coast. Annual air temperatures and second winter flows correlated strongly with smolt production from both stock units. Additional correlates for the Oregon Coast stocks were the date of first fall freshets and flow during smolt outmigration. Air temperature is correlated with sea surface temperature and timing of the spring transition so that good freshwater conditions are typically associated with good marine conditions.





1990 ◽  
Vol 47 (9) ◽  
pp. 1765-1772 ◽  
Author(s):  
J. M. Emlen ◽  
R. R. Reisenbichler ◽  
A. M. McGie ◽  
T. E. Nickelson

The success of expanded salmon hatchery programs will depend strongly on the degree of density-induced diminishing returns per smolt released. Several authors have addressed the question of density-dependent mortality at sea in coho salmon (Oncorhynchus kisutch), but have come to conflicting conclusions. We believe there are compelling reasons to reinvestigate the data, and have done so for public hatchery fish, using a variety of approaches. The results provide evidence that survival of these public hatchery fish is negatively affected, directly by the number of public hatchery smolts and indirectly by the number of private hatchery smolts. These results are weak, statistically, and should be considered primarily as a caution to those who, on the basis of other published work, believe that density-dependence does not exist. The results reported here also re-emphasize the often overlooked point that inferences drawn from data are strongly biased by investigators' views of how the systems of interest work and by the statistical assumptions they make preparatory to the analysis of those data.



Author(s):  
Daniel Berner ◽  
Valentin Amrhein

A paradigm shift away from null hypothesis significance testing seems in progress. Based on simulations, we illustrate some of the underlying motivations. First, P-values vary strongly from study to study, hence dichotomous inference using significance thresholds is usually unjustified. Second, statistically significant results have overestimated effect sizes, a bias declining with increasing statistical power. Third, statistically non-significant results have underestimated effect sizes, and this bias gets stronger with higher statistical power. Fourth, the tested statistical hypotheses generally lack biological justification and are often uninformative. Despite these problems, a screen of 48 papers from the 2020 volume of the Journal of Evolutionary Biology exemplifies that significance testing is still used almost universally in evolutionary biology. All screened studies tested the default null hypothesis of zero effect with the default significance threshold of p = 0.05, none presented a pre-planned alternative hypothesis, and none calculated statistical power and the probability of ‘false negatives’ (beta error). The papers reported 49 significance tests on average. Of 41 papers that contained verbal descriptions of a ‘statistically non-significant’ result, 26 (63%) falsely claimed the absence of an effect. We conclude that our studies in ecology and evolutionary biology are mostly exploratory and descriptive. We should thus shift from claiming to “test” specific hypotheses statistically to describing and discussing many hypotheses (effect sizes) that are most compatible with our data, given our statistical model. We already have the means for doing so, because we routinely present compatibility (“confidence”) intervals covering these hypotheses.
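The second and third points of this abstract (inflated significant estimates, deflated non-significant ones) can be illustrated with a short simulation. The sketch below is not the authors' code: it assumes normally distributed effect estimates with a known standard error and a two-sided z-test, and all numbers are hypothetical.

```python
import random
from statistics import NormalDist, mean


def simulate_estimates(true_effect: float, std_error: float,
                       n_studies: int = 20000, alpha: float = 0.05, seed: int = 1):
    """Return (mean significant estimate, mean non-significant estimate, observed power)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    significant, nonsignificant = [], []
    for _ in range(n_studies):
        estimate = rng.gauss(true_effect, std_error)  # one simulated study
        if abs(estimate / std_error) > z_crit:
            significant.append(estimate)
        else:
            nonsignificant.append(estimate)
    return mean(significant), mean(nonsignificant), len(significant) / n_studies


# Hypothetical true effects of 0.2 (about 50% power) and 0.3 (about 85% power).
for true_effect in (0.2, 0.3):
    sig, nonsig, power = simulate_estimates(true_effect, std_error=0.1)
    print(f"true={true_effect}  power={power:.2f}  "
          f"mean significant={sig:.3f}  mean non-significant={nonsig:.3f}")
```

Under these assumptions the upward bias of significant estimates shrinks as power rises, while the downward bias of non-significant estimates grows, matching the pattern the abstract describes.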



2018 ◽  
Vol 108 (1) ◽  
pp. 15-22 ◽  
Author(s):  
David H. Gent ◽  
Paul D. Esker ◽  
Alissa B. Kriss

In null hypothesis testing, failure to reject a null hypothesis may have two potential interpretations. One interpretation is that the treatments being evaluated do not have a significant effect, and a correct conclusion was reached in the analysis. Alternatively, a treatment effect may have existed but the conclusion of the study was that there was none. This is termed a Type II error, which is most likely to occur when studies lack sufficient statistical power to detect a treatment effect. In basic terms, the power of a study is the ability to identify a true effect through a statistical test. The power of a statistical test is 1 – (the probability of Type II errors), and depends on the size of treatment effect (termed the effect size), variance, sample size, and significance criterion (the probability of a Type I error, α). Low statistical power is prevalent in scientific literature in general, including plant pathology. However, power is rarely reported, creating uncertainty in the interpretation of nonsignificant results and potentially underestimating small, yet biologically significant relationships. The appropriate level of power for a study depends on the impact of Type I versus Type II errors and no single level of power is acceptable for all purposes. Nonetheless, by convention 0.8 is often considered an acceptable threshold and studies with power less than 0.5 generally should not be conducted if the results are to be conclusive. The emphasis on power analysis should be in the planning stages of an experiment. Commonly employed strategies to increase power include increasing sample sizes, selecting a less stringent threshold probability for Type I errors, increasing the hypothesized or detectable effect size, including as few treatment groups as possible, reducing measurement variability, and including relevant covariates in analyses. 
Power analysis will lead to more efficient use of resources and more precisely structured hypotheses, and may even indicate some studies should not be undertaken. However, adequately powered studies are less prone to erroneous conclusions and inflated estimates of treatment effectiveness, especially when effect sizes are small.
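The dependence of power on effect size, variance, sample size, and α can be made concrete with a normal-approximation formula. The sketch below assumes a two-sided, two-sample z-test with equal group sizes and a known common standard deviation; it is an illustration of the general relationship, not a method taken from the paper, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size: float, sd: float, n_per_group: int,
                     alpha: float = 0.05) -> float:
    """Approximate power of a two-sided two-sample z-test with known common sd."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Standardized shift of the test statistic under the alternative.
    shift = effect_size / (sd * sqrt(2 / n_per_group))
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


base = two_sample_power(effect_size=0.5, sd=1.0, n_per_group=20)
print(f"baseline:        {base:.2f}")
print(f"larger sample:   {two_sample_power(0.5, 1.0, 64):.2f}")
print(f"larger effect:   {two_sample_power(0.8, 1.0, 20):.2f}")
print(f"less variance:   {two_sample_power(0.5, 0.7, 20):.2f}")
print(f"less strict α:   {two_sample_power(0.5, 1.0, 20, alpha=0.10):.2f}")
```

Each of the last four calls changes one input in the direction the abstract lists as a strategy for increasing power, and each returns a value above the baseline.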



2000 ◽  
Vol 57 (4) ◽  
pp. 677-686 ◽  
Author(s):  
Michael J Bradford ◽  
Ransom A Myers ◽  
James R Irvine

We describe a simple scheme for the management of coho salmon (Oncorhynchus kisutch) population aggregates that uses reference points derived from an empirical analysis of freshwater production data. We fit a rectilinear "hockey stick" model to 14 historical data sets of female spawner abundance and resulting smolt production and found that at low spawner abundance, the average productivity was about 85 smolts per female spawner. Variation in productivity among streams may be related to the quality of the stream habitat. We show how freshwater productivity can be combined with forecasts of marine survival to provide a limit reference point harvest rate. Our method will permit harvest rates to track changes in ocean productivity. We also used the historical data to estimate that, on average, a density of 19 female spawners·km⁻¹ is required to fully seed freshwater habitats with juveniles. However, there was considerable variation among the streams that might limit the utility of this measure as a reference point. Uncertainty in the forecasts of marine survival and other parameters needs to be incorporated into our scheme before it can be considered a precautionary approach.
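A rectilinear "hockey stick" relation rises linearly with spawner abundance and then stays flat once the habitat is fully seeded. The sketch below uses the two averages quoted in the abstract (85 smolts per female, 19 females per km) as default parameters. Treating both averages as applying to the same stream is an assumption made here for illustration, and the function and parameter names are hypothetical.

```python
def hockey_stick_smolts(female_spawners: float, stream_km: float,
                        smolts_per_female: float = 85.0,
                        seeding_density_per_km: float = 19.0) -> float:
    """Smolt production: linear in spawners up to full seeding, then flat."""
    if female_spawners < 0 or stream_km <= 0:
        raise ValueError("spawners must be non-negative and stream length positive")
    # Number of females at which the habitat is assumed to be fully seeded.
    breakpoint_spawners = seeding_density_per_km * stream_km
    return smolts_per_female * min(female_spawners, breakpoint_spawners)


# Hypothetical 1 km stream.
for females in (10, 19, 40):
    print(females, hockey_stick_smolts(females, stream_km=1.0))
# 10 850.0
# 19 1615.0
# 40 1615.0
```

Below the breakpoint every additional female adds the same number of smolts; above it, extra spawners add nothing, which is why the seeding density can serve as a reference point.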


