Temperature triggers provide quantitative predictions of multi-species fish spawning peaks

2020 ◽  
Author(s):  
Emma S. Choi ◽  
Erik T. Saberski ◽  
Tom Lorimer ◽  
Cameron Smith ◽  
Unduwap Kandage-don ◽  
...  

Abstract We find a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla, California. Even more surprising is that this event-based result persists despite the large and variable number of fish species involved (up to 46) and the large and variable time interval between trigger and response (up to ~3 months). To mitigate potential over-fitting, we make a true out-of-sample prediction for the peak summer egg abundance that will be observed at Scripps Pier this year.

PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0236541
Author(s):  
Emma S. Choi ◽  
Erik Saberski ◽  
Tom Lorimer ◽  
Cameron Smith ◽  
Unduwap Kandage-don ◽  
...  

We found a startling correlation (Pearson ρ > 0.97) between a single event in daily sea surface temperatures each spring and peak fish egg abundance measurements the following summer, in 7 years of approximately weekly fish egg abundance data collected at Scripps Pier in La Jolla, California. Even more surprising was that this event-based result persisted despite the large and variable number of fish species involved (up to 46) and the large and variable time interval between trigger and response (up to ~3 months). To mitigate potential over-fitting, we made an out-of-sample prediction beyond the publication process for the peak summer egg abundance observed at Scripps Pier in 2020 (available on bioRxiv). During peer review, the prediction failed. While it would be tempting to explain this away as a result of the record-breaking toxic algal bloom that occurred during the spring (a 9× higher concentration of dinoflagellates than ever previously recorded), a re-examination of our methodology revealed a potential source of over-fitting that had not been evaluated for robustness. This cautionary tale highlights the importance of testable, true out-of-sample predictions of future values that cannot (even accidentally) be used in model fitting, and that can therefore catch model assumptions that may otherwise escape notice. We believe that this example can benefit the current push towards ecology as a predictive science and support the notion that predictions should live and die in the public domain, along with the models that made them.


Author(s):  
Renzhe Xu ◽  
Yudong Chen ◽  
Tenglong Xiao ◽  
Jingli Wang ◽  
Xiong Wang

As an important tool for measuring the current state of the whole stock market, the stock index has always been a focus of researchers, especially its prediction. This paper uses trend types, obtained by clustering price series at multiple time scales, combined with the day-of-the-week effect to construct a categorical feature combination. Based on the historical data of six Chinese stock indexes, the CatBoost model is used for training and prediction. Experimental results show that the out-of-sample prediction accuracy is 0.55, and the long–short trading strategy can obtain an average annualized return of 34.43%, which is a great improvement compared with other classical classification algorithms. Under rolling back-testing, the model obtains stable returns in each period from 2012 to 2020. Among the indexes, the SSESC's long–short strategy has the best performance, with an annualized return of 40.85% and a Sharpe ratio of 1.53. Therefore, the trend information in multiple time-scale features constructed through feature engineering can be learned well by the CatBoost model, which has a guiding effect on predicting stock index trends.
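
For readers unfamiliar with the two performance figures quoted in the abstract above, the following is a minimal sketch of how an annualized return and a Sharpe ratio are commonly computed from a series of daily strategy returns. The 252-trading-day convention, the default zero risk-free rate, and the function names are assumptions made for illustration; the abstract does not state which conventions the authors used.

```python
import math
from statistics import mean, stdev

TRADING_DAYS = 252  # assumed annualization convention, not taken from the paper


def annualized_return(daily_returns):
    """Geometric annualized return from a list of simple daily returns."""
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    return growth ** (TRADING_DAYS / len(daily_returns)) - 1.0


def sharpe_ratio(daily_returns, daily_risk_free=0.0):
    """Annualized Sharpe ratio: mean excess return over its sample std dev."""
    excess = [r - daily_risk_free for r in daily_returns]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)
```

A strategy's daily long–short returns would be passed in as a plain list; the Sharpe ratio is undefined (division by zero) when every daily return is identical.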


2018 ◽  
Vol 35 (2) ◽  
pp. 208-217 ◽  
Author(s):  
Maurits Kaptein

Purpose: This paper aims to examine whether estimates of psychological traits obtained using meta-judgmental measures (as commonly present in customer relationship management database systems) or operative measures are more useful in predicting customer behavior. Design/methodology/approach: Using an online experiment (N = 283), the study collects meta-judgmental and operative measures of customers. Subsequently, it compares the out-of-sample prediction error of responses to persuasive messages. Findings: The study shows that operative measures, derived directly from measures of customer behavior, are more informative than meta-judgmental measures. Practical implications: Using interactive media, it is possible to actively elicit operative measures. This study shows that practitioners seeking to customize their marketing communication should focus on obtaining such psychographic observations. Originality/value: While both meta-judgmental measures and operative measures are currently used for customization in interactive marketing, this study directly compares their utility for the prediction of future responses to persuasive messages.


Author(s):  
David Easley ◽  
Marcos López de Prado ◽  
Maureen O’Hara ◽  
Zhibai Zhang

Abstract Understanding modern market microstructure phenomena requires large amounts of data and advanced mathematical tools. We demonstrate how machine learning can be applied to microstructural research. We find that microstructure measures continue to provide insights into the price process in current complex markets. Some microstructure features with high explanatory power exhibit low predictive power, while others with less explanatory power have more predictive power. We find that some microstructure-based measures are useful for out-of-sample prediction of various market statistics, leading to questions about market efficiency. We also show how microstructure measures can have important cross-asset effects. Our results are derived using 87 liquid futures contracts across all asset classes.


2017 ◽  
Vol 11 (2) ◽  
pp. 390-411 ◽  
Author(s):  
Feng Liu ◽  
David Pitt

Abstract In this paper we analyse insurance claim frequency data using the bivariate negative binomial regression (BNBR) model. We use general insurance data on claims from simple third-party liability insurance and comprehensive insurance. We find that bivariate regression, with its capacity for modelling correlation between the two observed claim counts, provides both a superior fit and out-of-sample prediction compared with the more common practice of fitting univariate negative binomial regression models separately to each claim type. Noting the complexity of BNBR models and their potential for a large number of parameters, we explore the use of model shrinkage methodology, namely the least absolute shrinkage and selection operator (Lasso) and ridge regression. We find that models estimated using shrinkage methods outperform the ordinary likelihood-based models when being used to make predictions out-of-sample. We find that the Lasso performs better than ridge regression as a method of shrinkage.
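
The two shrinkage methods named in the abstract above differ only in the penalty added to the fitting criterion. The generic textbook forms are sketched below; the symbols are assumptions introduced here for illustration ($\ell(\beta)$ for the model log-likelihood, $\beta_1, \dots, \beta_p$ for the penalized regression coefficients, $\lambda \ge 0$ for the tuning parameter), and the abstract does not say which parameters the authors penalize or how $\lambda$ is chosen.

```latex
\hat{\beta}_{\text{lasso}}
  = \arg\min_{\beta} \left\{ -\ell(\beta) + \lambda \sum_{j=1}^{p} |\beta_j| \right\},
\qquad
\hat{\beta}_{\text{ridge}}
  = \arg\min_{\beta} \left\{ -\ell(\beta) + \lambda \sum_{j=1}^{p} \beta_j^{2} \right\}
```

The absolute-value penalty can set coefficients exactly to zero and so also performs variable selection, whereas the squared penalty only shrinks coefficients towards zero.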


1992 ◽  
Vol 24 (1) ◽  
pp. 163-169 ◽  
Author(s):  
Alicia N. Rambaldi ◽  
Hector O. Zapata ◽  
Ralph D. Christy

Abstract A credit scoring function incorporating statistical selection criteria was proposed to evaluate the creditworthiness of agricultural cooperative loans in the Fifth Farm Credit District. In-sample (1981-1986) and out-of-sample (1988) prediction performance of the selected models was evaluated using rank transformation discriminant analysis, logit, and probit. Results indicate superior out-of-sample performance for the management-oriented approach relative to classification of unacceptable loans, and poor performance of the rank transformation in out-of-sample prediction.


2015 ◽  
Vol 105 (5) ◽  
pp. 481-485 ◽  
Author(s):  
Patrick Bajari ◽  
Denis Nekipelov ◽  
Stephen P. Ryan ◽  
Miaoyu Yang

We survey and apply several techniques from the statistical and computer science literature to the problem of demand estimation. To improve out-of-sample prediction accuracy, we propose a method of combining the underlying models via linear regression. Our method is robust to a large number of regressors; scales easily to very large data sets; combines model selection and estimation; and can flexibly approximate arbitrary non-linear functions. We illustrate our method using a standard scanner panel data set and find that our estimates are considerably more accurate in out-of-sample predictions of demand than some commonly used alternatives.
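
One common reading of "combining the underlying models via linear regression" in the abstract above is stacking: regress the observed outcomes on each model's held-out predictions and use the fitted coefficients as combination weights. The sketch below is a minimal version under loud assumptions: exactly two base models, no intercept, unconstrained weights, and a hypothetical function name. It is not the authors' implementation.

```python
def stacking_weights(pred_a, pred_b, observed):
    """Least-squares weights (w_a, w_b) minimizing sum((y - w_a*a - w_b*b)**2).

    pred_a, pred_b: held-out predictions from two base models (assumed inputs).
    observed: the corresponding observed outcomes.
    """
    # Entries of the 2x2 normal equations.
    saa = sum(a * a for a in pred_a)
    sbb = sum(b * b for b in pred_b)
    sab = sum(a * b for a, b in zip(pred_a, pred_b))
    say = sum(a * y for a, y in zip(pred_a, observed))
    sby = sum(b * y for b, y in zip(pred_b, observed))

    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("model predictions are collinear; weights not unique")

    # Cramer's rule for the 2x2 system.
    w_a = (say * sbb - sby * sab) / det
    w_b = (sby * saa - say * sab) / det
    return w_a, w_b
```

The combined prediction for a new observation is then `w_a * a_new + w_b * b_new`. Fitting the weights on held-out rather than in-sample predictions is what keeps the combination from simply rewarding the most over-fitted base model.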


2015 ◽  
Vol 25 (02) ◽  
pp. 1550001 ◽  
Author(s):  
Steffen E. Eikenberry ◽  
Vasilis Z. Marmarelis

We develop an autoregressive model framework based on the concept of Principal Dynamic Modes (PDMs) for the process of action potential (AP) generation in the excitable neuronal membrane described by the Hodgkin–Huxley (H–H) equations. The model's exogenous input is injected current, and whenever the membrane potential output exceeds a specified threshold, it is fed back as a second input. The PDMs are estimated from the previously developed Nonlinear Autoregressive Volterra (NARV) model, and represent an efficient functional basis for Volterra kernel expansion. The PDM-based model admits a modular representation, consisting of the forward and feedback PDM bases as linear filterbanks for the exogenous and autoregressive inputs, respectively, whose outputs are then fed to a static nonlinearity composed of polynomials operating on the PDM outputs and cross-terms of pair-products of PDM outputs. A two-step procedure for model reduction is performed: first, influential subsets of the forward and feedback PDM bases are identified and selected as the reduced PDM bases. Second, the terms of the static nonlinearity are pruned. The first step reduces model complexity from a total of 65 coefficients to 27, while the second further reduces the model coefficients to only eight. It is demonstrated that the performance cost of model reduction in terms of out-of-sample prediction accuracy is minimal. Unlike the full model, the eight coefficient pruned model can be easily visualized to reveal the essential system components, and thus the data-derived PDM model can yield insight into the underlying system structure and function.


2005 ◽  
Vol 18 (23) ◽  
pp. 5141-5162 ◽  
Author(s):  
Michael K. Tippett ◽  
Anthony G. Barnston ◽  
David G. DeWitt ◽  
Rong-Hua Zhang

Abstract This paper is about the statistical correction of systematic errors in dynamical sea surface temperature (SST) prediction systems using linear regression approaches. The typically short histories of model forecasts create difficulties in developing regression-based corrections. The roles of sample size, predictive skill, and systematic error are examined in evaluating the benefit of a linear correction. It is found that with the typical 20 yr of available model SST forecast data, corrections are worth performing when there are substantial deviations in forecast amplitude from that determined by correlation with observations. The closer the amplitude of the uncorrected forecasts is to the optimum squared error-minimizing amplitude, the less likely is a correction to improve skill. In addition to there being less “room for improvement,” this rule is related to the expected degradation in out-of-sample skill caused by sampling error in the estimate of the regression coefficient underlying the correction. Application of multivariate [canonical correlation analysis (CCA)] correction to three dynamical SST prediction models having 20 yr of data demonstrates improvement in the cross-validated skills of tropical Pacific SST forecasts through reduction of systematic errors in pattern structure. Additional beneficial correction of errors orthogonal to the CCA modes is achieved on a per-gridpoint basis for features having smaller spatial scale. Until such time that dynamical models become freer of systematic errors, statistical corrections such as those shown here can make dynamical SST predictions more skillful, retaining their nonlinear physics while also calibrating their outputs to more closely match observations.
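
The simplest case of the regression-based correction discussed in the abstract above can be sketched as a one-predictor linear fit evaluated by leave-one-out cross-validation, so that each corrected forecast uses coefficients estimated without its own year. This is an illustrative sketch only: the function names are hypothetical, the inputs are plain lists for a single location, and the multivariate CCA correction described in the abstract is not reproduced.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def loo_corrected(forecasts, observations):
    """Leave-one-out corrected forecasts.

    For each case i, the correction is fitted on all other cases and then
    applied to forecast i, so the result is an out-of-sample estimate.
    """
    corrected = []
    for i in range(len(forecasts)):
        f_train = forecasts[:i] + forecasts[i + 1:]
        o_train = observations[:i] + observations[i + 1:]
        intercept, slope = fit_line(f_train, o_train)
        corrected.append(intercept + slope * forecasts[i])
    return corrected
```

Comparing the error of `loo_corrected(...)` with the error of the raw forecasts gives a rough check of whether the correction helps once sampling error in the fitted slope is accounted for, which is the trade-off the abstract describes for short forecast histories.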

