Review of Model dependence in multi-model climate ensembles: weighting, sub-selection and out-of-sample testing by G. Abramowitz et al.

2018 ◽  
Author(s):  
Anonymous


2018 ◽  
Author(s):  
Gab Abramowitz ◽  
Nadja Herger ◽  
Ethan Gutmann ◽  
Dorit Hammerling ◽  
Reto Knutti ◽  
...  

Abstract. The rationale for using multi-model ensembles in climate change projections and impacts research is often based on the expectation that different models constitute independent estimates, so that a range of models allows a better characterisation of the uncertainties in the representation of the climate system than a single model. However, it is known that research groups share literature, ideas for representations of processes, parameterisations, evaluation data sets and even sections of model code. Thus, nominally different models might have similar biases because of similarities in the way they represent a subset of processes, or even be near-duplicates of others, weakening the assumption that they constitute independent estimates. If there are near-replicates of some models, then treating all models equally is likely to bias the inferences made using these ensembles. The challenge is to establish the degree to which this might be true for any given application. While this issue is recognised by many in the community, quantifying and accounting for model dependence in anything other than an ad-hoc way is challenging. Here we present a synthesis of the range of disparate attempts to define, quantify and address model dependence in multi-model climate ensembles in a common conceptual framework, and provide guidance on how users can test the efficacy of approaches that move beyond the equally weighted ensemble. In the upcoming Coupled Model Intercomparison Project phase 6 (CMIP6), several new models that are closely related to existing models are anticipated, as well as large ensembles from some models. We argue that quantitatively accounting for dependence in addition to model performance, and thoroughly testing the effectiveness of the approach used, will be key to a sound interpretation of the CMIP ensembles in future scientific studies.


2019 ◽  
Vol 10 (1) ◽  
pp. 91-105 ◽  
Author(s):  
Gab Abramowitz ◽  
Nadja Herger ◽  
Ethan Gutmann ◽  
Dorit Hammerling ◽  
Reto Knutti ◽  
...  

Abstract. The rationale for using multi-model ensembles in climate change projections and impacts research is often based on the expectation that different models constitute independent estimates; therefore, a range of models allows a better characterisation of the uncertainties in the representation of the climate system than a single model. However, it is known that research groups share literature, ideas for representations of processes, parameterisations, evaluation data sets and even sections of model code. Thus, nominally different models might have similar biases because of similarities in the way they represent a subset of processes, or even be near-duplicates of others, weakening the assumption that they constitute independent estimates. If there are near-replicates of some models, then treating all models equally is likely to bias the inferences made using these ensembles. The challenge is to establish the degree to which this might be true for any given application. While this issue is recognised by many in the community, quantifying and accounting for model dependence in anything other than an ad-hoc way is challenging. Here we present a synthesis of the range of disparate attempts to define, quantify and address model dependence in multi-model climate ensembles in a common conceptual framework, and provide guidance on how users can test the efficacy of approaches that move beyond the equally weighted ensemble. In the upcoming Coupled Model Intercomparison Project phase 6 (CMIP6), several new models that are closely related to existing models are anticipated, as well as large ensembles from some models. We argue that quantitatively accounting for dependence in addition to model performance, and thoroughly testing the effectiveness of the approach used will be key to a sound interpretation of the CMIP ensembles in future scientific studies.
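One concrete instance of the performance-plus-independence weighting surveyed here is the Sanderson/Knutti-style scheme, in which each model's weight combines a performance term (distance to observations) with an independence term that down-weights models surrounded by near-duplicates. Below is a minimal sketch of that scheme; the function name, the toy distances and the shape parameters sigma_d and sigma_s are illustrative assumptions, not the paper's prescription.

```python
import numpy as np

def climwip_style_weights(perf_dist, pair_dist, sigma_d, sigma_s):
    """Weight = performance term x independence term, normalised to 1.

    perf_dist : (n,) distance of each model from observations
    pair_dist : (n, n) symmetric inter-model distance matrix
    sigma_d, sigma_s : shape parameters setting how sharply poor
        performance and mutual similarity are penalised
    """
    performance = np.exp(-(np.asarray(perf_dist) / sigma_d) ** 2)
    similarity = np.exp(-(np.asarray(pair_dist) / sigma_s) ** 2)
    np.fill_diagonal(similarity, 0.0)
    # A model with many near-duplicates has a large similarity sum,
    # i.e. a large "effective number of copies", and is down-weighted.
    independence = 1.0 / (1.0 + similarity.sum(axis=1))
    w = performance * independence
    return w / w.sum()

# Toy example: models 0 and 1 are near-replicates; model 3 performs worst.
perf = np.array([0.50, 0.55, 0.60, 1.20])
pair = np.array([[0.0, 0.1, 1.0, 1.5],
                 [0.1, 0.0, 1.1, 1.4],
                 [1.0, 1.1, 0.0, 1.3],
                 [1.5, 1.4, 1.3, 0.0]])
print(climwip_style_weights(perf, pair, sigma_d=0.7, sigma_s=0.5))
```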


2021 ◽  
Vol 14 (3) ◽  
pp. 119
Author(s):  
Fabian Waldow ◽  
Matthias Schnaubelt ◽  
Christopher Krauss ◽  
Thomas Günter Fischer

In this paper, we demonstrate how a well-established machine learning-based statistical arbitrage strategy can be successfully transferred from equity to futures markets. First, we preprocess futures time series composed of front months to render them suitable for our returns-based trading framework and compile a data set of 60 futures covering nearly 10 trading years. Next, we train several machine learning models to predict whether the h-day-ahead return of each future out- or underperforms the corresponding cross-sectional median return. We then enter long/short positions in the top/flop-k futures for a duration of h days and assess the financial performance of the resulting portfolio in an out-of-sample testing period. We find that the machine learning models yield statistically significant out-of-sample break-even transaction costs of 6.3 bp, a clear challenge to the semi-strong form of market efficiency. Finally, we discuss sources of profitability and the robustness of our findings.
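As a rough illustration of the framework described above, here is a minimal sketch: label each future by whether its h-day-ahead return beats the cross-sectional median, fit a classifier on trailing returns, and evaluate strictly out of sample. The horizon, portfolio size, feature set, classifier and synthetic data are all assumptions for illustration, not the paper's exact specification.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

h, k = 5, 3  # holding period in days, long/short portfolio size
rng = np.random.default_rng(1)
rets = pd.DataFrame(rng.normal(0, 0.01, (500, 10)),
                    columns=[f"fut{i}" for i in range(10)])

# Illustrative features: trailing cumulative returns over three lookbacks.
panel = pd.DataFrame({f"lag{m}": rets.rolling(m).sum().stack()
                      for m in (1, 5, 20)})
# h-day forward return, aligned to the decision date t (covers t+1..t+h).
panel["fwd"] = rets.rolling(h).sum().shift(-h).stack()
panel = panel.dropna()

# Label: does the future beat the day's cross-sectional median forward return?
med = panel.groupby(level=0)["fwd"].transform("median")
panel["y"] = (panel["fwd"] > med).astype(int)

# Chronological split: fit in sample, predict strictly out of sample.
day = panel.index.get_level_values(0)
cols = ["lag1", "lag5", "lag20"]
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(panel.loc[day < 400, cols], panel.loc[day < 400, "y"])
proba = clf.predict_proba(panel.loc[day >= 400, cols])[:, 1]

# Hit rate as a stand-in for the paper's backtest; the full strategy goes
# long the k highest and short the k lowest predicted probabilities for h days.
hits = ((proba > 0.5).astype(int) == panel.loc[day >= 400, "y"]).mean()
print(f"out-of-sample hit rate: {hits:.3f}")
```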


2018 ◽  
Author(s):  
Παντελής Σταυρούλιας

Accurate prediction of financial crises has always underpinned the stability of the financial system as a whole and of the banking sector in particular. This thesis achieves the prediction of systemic banking crises for the EU-14 countries several quarters before they become apparent, using the most widely available variables (macroeconomic, banking and market) through two approaches: a binary and a multi-level one. Following the binary approach, classification models are derived by applying Discriminant Analysis, Linear Regression, Logistic Regression and Probit Regression for the early prediction of crises 12 to 7 quarters before their onset. In addition, the performance of this analysis is compared against the newer and most promising methods of Classification Trees, Random Forests and C5.0. A new measure for threshold selection and goodness of fit (GoF) of the prediction models is also proposed, along with a new combined classification method. To assess the performance of the above analysis, out-of-sample testing is used via country-blocked cross-validation: the analysis is carried out and the prediction models are derived using thirteen of the fourteen countries in the sample (in-sample), the resulting models are then applied to the fourteenth country that was excluded from the initial sample (out-of-sample), and the predictions are checked against that country's actual data. This procedure is repeated fourteen times, leaving a different country out of the sample each time, and the average over the repetitions is reported. Using this out-of-sample testing, the thesis achieves 82.4% correct classification (Accuracy), a 78.4% True Positive Rate (TPR) and an 80.6% Positive Predictive Value (PPV). Under the multi-level approach, two prediction levels (periods) for systemic banking crises are distinguished: the first, called early warning, covers 12 to 7 quarters before the onset of the crisis, while the second, called late warning, covers 6 to 1 quarters before the onset. For this multi-level classification, Neural Networks, Multinomial Logistic Regression and Multinomial Discriminant Analysis are used. Applying the same out-of-sample testing as in the first approach, 85.7% correct classification is achieved, with Multinomial Discriminant Analysis proving to be the best method. Using the above analysis, policy makers can detect an impending crisis up to three years ahead with the proposed models, using only freely available public data, and can thereby exercise the appropriate macroprudential policy in each case.
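The country-blocked cross-validation described above is a leave-one-group-out scheme. A minimal sketch, assuming synthetic quarterly data and a single logistic classifier rather than the thesis's full model suite:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import LeaveOneGroupOut, cross_val_predict

rng = np.random.default_rng(0)
n_countries, quarters = 14, 80
# Synthetic stand-ins for the macroeconomic, banking and market variables
# and the quarterly crisis flag; the real inputs are publicly available series.
X = rng.normal(size=(n_countries * quarters, 6))
y = rng.integers(0, 2, size=n_countries * quarters)
groups = np.repeat(np.arange(n_countries), quarters)  # country id per row

# 14 folds: fit on 13 countries, predict the held-out 14th, rotate, average.
logo = LeaveOneGroupOut()
pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y,
                         groups=groups, cv=logo)

print(f"Accuracy: {accuracy_score(y, pred):.3f}")
print(f"TPR:      {recall_score(y, pred):.3f}")
print(f"PPV:      {precision_score(y, pred):.3f}")
```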


Author(s):  
Linden Parkes ◽  
Tyler M. Moore ◽  
Monica E. Calkins ◽  
Matthew Cieslak ◽  
David R. Roalf ◽  
...  

Abstract. Background: The psychosis spectrum is associated with structural dysconnectivity concentrated in transmodal association cortex. However, understanding of this pathophysiology has been limited by an exclusive focus on the direct connections to a region. Using Network Control Theory, we measured variation in both direct and indirect structural connections to a region to gain new insights into the pathophysiology of the psychosis spectrum. Methods: We used psychosis symptom data and structural connectivity in 1,068 youths aged 8 to 22 years from the Philadelphia Neurodevelopmental Cohort. Applying a Network Control Theory metric called average controllability, we estimated each brain region’s capacity to leverage its direct and indirect structural connections to control linear brain dynamics. Next, using non-linear regression, we determined the accuracy with which average controllability could predict negative and positive psychosis spectrum symptoms in out-of-sample testing. We also compared prediction performance for average controllability versus strength, which indexes only direct connections to a region. Finally, we assessed how the prediction performance for psychosis spectrum symptoms varied over the functional hierarchy spanning unimodal to transmodal cortex. Results: Average controllability outperformed strength at predicting positive psychosis spectrum symptoms, demonstrating that indexing indirect structural connections to a region improved prediction performance. Critically, improved prediction was concentrated in association cortex for average controllability, whereas prediction performance for strength was uniform across the cortex, suggesting that indexing indirect connections is crucial in association cortex. Conclusions: Examining inter-individual variation in direct and indirect structural connections to association cortex is crucial for accurate prediction of positive psychosis spectrum symptoms.
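Average controllability for linear dynamics x(t+1) = Ax(t) + Bu(t) is commonly computed as the trace of the controllability Gramian with each region, in turn, as the single control point (in the spirit of Gu et al., 2015). A minimal sketch, with a toy connectome and a simple eigenvalue-based stability normalisation as assumptions:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def average_controllability(A):
    """Trace of the controllability Gramian with each node, in turn,
    as the single control point (B = e_i)."""
    # Normalise so the dynamics x(t+1) = A x(t) + B u(t) are stable.
    A = A / (1 + np.abs(np.linalg.eigvals(A)).max())
    n = A.shape[0]
    ac = np.zeros(n)
    for i in range(n):
        B = np.zeros((n, 1))
        B[i] = 1.0
        # The Gramian W solves W = A W A' + B B' (discrete Lyapunov equation).
        W = solve_discrete_lyapunov(A, B @ B.T)
        ac[i] = np.trace(W)
    return ac

# Toy symmetric "structural connectome" in place of diffusion-MRI data.
rng = np.random.default_rng(0)
A = rng.random((20, 20))
A = (A + A.T) / 2
np.fill_diagonal(A, 0.0)
print(average_controllability(A)[:5])
```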


Author(s):  
Chanaka Edirisinghe ◽  
Wenjun Zhou

A critical challenge in managing quantitative funds is the computation of volatilities and correlations of the underlying financial assets. We present a study of Kendall's τ coefficient, one of the best-known rank-based correlation measures, for computing portfolio risk. Incorporating it within risk-averse portfolio optimization, we show empirically that this correlation measure outperforms Pearson's in out-of-sample testing with real-world financial data. This phenomenon is mainly due to the fat-tailed nature of stock return distributions. We also discuss the computational properties of Kendall's τ and describe efficient procedures for both incremental and one-time computation of Kendall's rank correlation.
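For illustration, one common way to use Kendall's τ in portfolio risk is to estimate a rank correlation matrix and map τ to a linear correlation via sin(πτ/2), which is exact for elliptical distributions, before forming the covariance. This construction is an assumption for the sketch below, not necessarily the authors' exact procedure:

```python
import numpy as np
from scipy.stats import kendalltau

def kendall_corr_matrix(returns):
    """Pairwise Kendall's tau, mapped to a linear correlation via
    sin(pi * tau / 2), exact under elliptical distributions."""
    n = returns.shape[1]
    R = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            tau, _ = kendalltau(returns[:, i], returns[:, j])
            R[i, j] = R[j, i] = np.sin(np.pi * tau / 2)
    return R

# Toy fat-tailed returns (Student t, df=3), where rank-based estimators
# are more robust than the sample Pearson correlation.
rng = np.random.default_rng(0)
rets = rng.standard_t(df=3, size=(500, 4)) * 0.01

R = kendall_corr_matrix(rets)
vols = rets.std(axis=0)
cov = np.outer(vols, vols) * R      # rank-based covariance estimate
w = np.full(4, 0.25)                # equal-weight portfolio
print("portfolio volatility:", np.sqrt(w @ cov @ w))
```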


2019 ◽  
Vol 116 (43) ◽  
pp. 21463-21468 ◽  
Author(s):  
Yang Yang ◽  
Adam R. Pah ◽  
Brian Uzzi

As terror groups proliferate and grow in sophistication, a major international concern is the development of scientific methods that explain and predict insurgent violence. Approaches to estimating a group’s future lethality often require data on the group’s capabilities and resources, but by the nature of the phenomenon, these data are intentionally concealed by the organizations themselves via encryption, the dark web, back-channel financing, and misinformation. Here, we present a statistical model for estimating a terror group’s future lethality using latent-variable modeling techniques to infer a group’s intrinsic capabilities and resources for inflicting harm. The analysis introduces 2 explanatory variables that are strong predictors of lethality and raise the overall explained variance when added to existing models. The explanatory variables generate a unique early-warning signal of an individual group’s future lethality based on just a few of its first attacks. Relying on the first 10 to 20 attacks, or the first 10 to 20% of a group’s lifetime behavior, our model explains about 60% of the variance in a group’s future lethality that would be explained by the group’s complete lifetime data. The model’s robustness is evaluated with out-of-sample testing and simulations. The findings’ theoretical and pragmatic implications for the science of human conflict are discussed.


2013 ◽  
Vol 27 (3) ◽  
pp. 469-489 ◽  
Author(s):  
Tracy J. Noga ◽  
Anne L. Schnader

SYNOPSIS: We contend that tax-related information, which has not yet been considered by extant research, can significantly improve bankruptcy prediction. We investigate the association between abnormal changes in book-tax differences (BTDs) and bankruptcy using a hazard model and out-of-sample testing as in Shumway (2001). We find that information regarding abnormal changes in BTDs significantly increases our ability to ex ante identify firms that have an increased likelihood of going bankrupt in the coming five-year period. The information provided by BTDs significantly adds information to traditional models for predicting bankruptcy, such as that proposed by Ohlson (1980), and also expands the prediction window beyond the traditional two-year time frame.
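A Shumway-style discrete-time hazard model amounts to a logit fit on firm-year observations. The sketch below, with synthetic data and an illustrative "abnormal BTD" covariate (btd), shows the out-of-sample evaluation pattern; the feature set and coefficients are assumptions, not the paper's estimates:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "ni_ta": rng.normal(0.05, 0.10, n),   # net income / total assets
    "tl_ta": rng.normal(0.50, 0.20, n),   # total liabilities / total assets
    "btd":   rng.normal(0.00, 1.00, n),   # abnormal change in book-tax diffs
    "year":  rng.integers(1990, 2010, n),
})
# Synthetic bankruptcy indicator generated from an assumed logit.
logit_p = -4 - 8 * df["ni_ta"] + 2 * df["tl_ta"] + 0.5 * df["btd"]
df["bankrupt"] = (rng.random(n) < 1 / (1 + np.exp(-logit_p))).astype(int)

# Out-of-sample testing: estimate on early firm-years, predict later ones.
cols = ["ni_ta", "tl_ta", "btd"]
train, test = df[df["year"] < 2003], df[df["year"] >= 2003]
hazard = LogisticRegression(max_iter=1000).fit(train[cols], train["bankrupt"])
auc = roc_auc_score(test["bankrupt"], hazard.predict_proba(test[cols])[:, 1])
print(f"out-of-sample AUC: {auc:.3f}")
```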


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
R. Paul Sabin

Abstract Calculating the value of a football player’s on-field performance has been limited to scouting methods, while data-driven methods are mostly limited to quarterbacks. Popular methods for calculating player value in other sports are Adjusted Plus–Minus (APM) and Regularized Adjusted Plus–Minus (RAPM) models. These models have been used most notably in basketball (Rosenbaum, D. T. 2004. Measuring How NBA Players Help Their Teams Win. http://www.82games.com/comm30.htm#_ftn1; Kubatko, J., D. Oliver, K. Pelton, and D. T. Rosenbaum. 2007. “A Starting Point for Analyzing Basketball Statistics.” Journal of Quantitative Analysis in Sports 3 (3); Winston, W. 2009. Player and Lineup Analysis in the NBA. Cambridge, Massachusetts; Sill, J. 2010. “Improved NBA Adjusted +/− Using Regularization and Out-Of-Sample Testing.” In Proceedings of the 2010 MIT Sloan Sports Analytics Conference) to estimate each player’s value by accounting for those in the game at the same time. Football is less amenable to APM models due to its few scoring events, few lineup changes, restrictive positioning, and small number of games relative to the number of teams. More recent methods have found ways to apply plus–minus models in other sports such as hockey (Macdonald, B. 2011. “A Regression-Based Adjusted Plus-Minus Statistic for NHL players.” Journal of Quantitative Analysis in Sports 7 (3)) and soccer (Schultze, S. R., and C.-M. Wellbrock. 2018. “A Weighted Plus/Minus Metric for Individual Soccer Player Performance.” Journal of Sports Analytics 4 (2): 121–31 and Matano, F., L. F. Richardson, T. Pospisil, C. Eubanks, and J. Qin (2018). Augmenting Adjusted Plus-Minus in Soccer with Fifa Ratings. arXiv preprint arXiv:1810.08032). These models are useful for producing results-oriented estimates of each player’s value. In American football, many positions, such as offensive linemen, have no recorded statistics, which hinders the ability to estimate a player’s value. I provide a fully hierarchical Bayesian plus–minus (HBPM) model framework that extends RAPM to include position-specific penalization, solving many of the shortcomings of APM and RAPM models in American football. Cross-validated results show the HBPM to be more predictive out of sample than RAPM or APM models. Results for the HBPM models are provided for both collegiate and NFL football players, as well as deeper insights into positional value and position-specific age curves.
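For context, a basic RAPM is a ridge regression of play outcomes on sparse player indicators; the paper's HBPM replaces the single ridge penalty with position-specific hierarchical shrinkage. A minimal RAPM sketch on synthetic plays, with all parameters and data as illustrative assumptions:

```python
import numpy as np
from scipy import sparse
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_players, n_plays = 300, 20000
true_skill = rng.normal(0, 1, n_players)

# Design matrix: one row per play, +1 for offensive players on the field,
# -1 for defensive players; outcome = skill differential plus noise.
rows, cols, vals = [], [], []
y = np.zeros(n_plays)
for p in range(n_plays):
    lineup = rng.choice(n_players, 22, replace=False)
    off, deff = lineup[:11], lineup[11:]
    rows += [p] * 22
    cols += list(off) + list(deff)
    vals += [1.0] * 11 + [-1.0] * 11
    y[p] = true_skill[off].sum() - true_skill[deff].sum() + rng.normal(0, 5)

X = sparse.csr_matrix((vals, (rows, cols)), shape=(n_plays, n_players))

# The ridge penalty shrinks low-information players toward zero; the
# hierarchical version would instead shrink toward a positional mean.
rapm = Ridge(alpha=100.0).fit(X, y)
print("corr(estimated, true):", np.corrcoef(rapm.coef_, true_skill)[0, 1])
```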

