Microstructure in the Machine Age

Author(s):  
David Easley ◽  
Marcos López de Prado ◽  
Maureen O’Hara ◽  
Zhibai Zhang

Understanding modern market microstructure phenomena requires large amounts of data and advanced mathematical tools. We demonstrate how machine learning can be applied to microstructural research. We find that microstructure measures continue to provide insights into the price process in current complex markets. Some microstructure features with high explanatory power exhibit low predictive power, while others with less explanatory power have more predictive power. We find that some microstructure-based measures are useful for out-of-sample prediction of various market statistics, leading to questions about market efficiency. We also show how microstructure measures can have important cross-asset effects. Our results are derived using 87 liquid futures contracts across all asset classes.

2020 ◽  
Vol 53 (4) ◽  
pp. 513-554
Author(s):  
Daniel V. Fauser ◽  
Andreas Gruener

This paper examines the prediction accuracy of various machine learning (ML) algorithms for firm credit risk. It marks the first attempt to leverage data on corporate social irresponsibility (CSI) to better predict credit risk in an ML context. Even though the literature on default and credit risk is vast, the potential explanatory power of CSI for firm credit risk prediction remains unexplored. Previous research has shown that CSI may jeopardize firm survival and thus potentially comes into play in predicting credit risk. We find that prediction accuracy varies considerably between algorithms, with advanced machine learning algorithms (e.g., random forests) outperforming traditional ones (e.g., linear regression). Random forest regression achieves an out-of-sample prediction accuracy of 89.75% for adjusted R², owing to its ability to capture non-linearity and complex interaction effects in the data. We further show that including information on CSI in firm credit risk prediction does not consistently increase prediction accuracy. One possible interpretation of this result is that CSI does not (yet) seem to be systematically reflected in credit ratings, despite prior literature indicating that CSI increases credit risk. Our study contributes to improving firm credit risk predictions using a machine learning design and to exploring how CSI is reflected in credit risk ratings.


2019 ◽  
Vol 50 (4) ◽  
pp. 1405-1417 ◽  
Author(s):  
Drew Bowlsby ◽  
Erica Chenoweth ◽  
Cullen Hendrix ◽  
Jonathan D. Moyer

Previous research by Goldstone et al. (2010) generated a highly accurate predictive model of state-level political instability. Notably, this model identifies political institutions – and partial democracy with factionalism, specifically – as the most compelling factors explaining when and where instability events are likely to occur. This article reassesses the model’s explanatory power and makes three related points: (1) the model’s predictive power varies substantially over time; (2) its predictive power peaked in the period used for out-of-sample validation (1995–2004) in the original study; and (3) the model performs relatively poorly in the more recent period. The authors find that this decline is not simply due to the Arab Uprisings, instability events that occurred in autocracies. Similar issues are found with attempts to predict nonviolent uprisings (Chenoweth and Ulfelder 2017) and armed conflict onset and continuation (Hegre et al. 2013). These results inform two conclusions: (1) the drivers of instability are not constant over time and (2) care must be exercised in interpreting prediction exercises as evidence in favor or dispositive of theoretical mechanisms.


2020 ◽  
Vol 20 (44) ◽  
Author(s):  
Marijn Bolhuis ◽  
Brett Rayner

We leverage insights from machine learning to optimize the tradeoff between bias and variance when estimating economic models using pooled datasets. Specifically, we develop a simple algorithm that estimates the similarity of economic structures across countries and selects the optimal pool of countries to maximize out-of-sample prediction accuracy of a model. We apply the new algorithm by nowcasting output growth with a panel of 102 countries and are able to significantly improve forecast accuracy relative to alternative pools. The algorithm improves nowcast performance for advanced economies, as well as emerging market and developing economies, suggesting that machine learning techniques using pooled data could be an important macro tool for many countries.


Author(s):  
Francesco Bloise ◽  
Paolo Brunori ◽  
Patrizio Piraino

Much of the global evidence on intergenerational income mobility is based on sub-optimal data. In particular, two-stage techniques are widely used to impute parental incomes for analyses of lower-income countries and for estimating long-run trends across multiple generations and historical periods. We propose applying machine learning methods to improve the reliability and comparability of such estimates. Supervised learning algorithms minimize the out-of-sample prediction error in the parental income imputation and provide an objective criterion for choosing across different specifications of the first-stage equation. We use our approach on data from the United States and South Africa to show that under common conditions it can limit the bias generally associated with mobility estimates based on imputed parental income.


2021 ◽  
Vol 15 (1) ◽  
pp. 2
Author(s):  
Jonathan Felix Pfahler

Historically, exchange rate forecasting models have exhibited poor out-of-sample performance and were inferior to the random walk model. Monthly panel data from 1973 to 2014 for ten currency pairs of OECD countries are used to make out-of-sample forecasts with artificial neural networks and XGBoost models. Most approaches show significant and substantial predictive power in directional forecasts. Moreover, the evidence suggests that information regarding prediction timing is a key component in the forecasting performance.


2020 ◽  
Vol 28 ◽  
pp. 102439
Author(s):  
Xiaofen Ma ◽  
Dongyan Wu ◽  
Yuanqi Mai ◽  
Guang Xu ◽  
Junzhang Tian ◽  
...  

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Meisam Ghasedi ◽  
Maryam Sarfjoo ◽  
Iraj Bargegol

The purpose of this study is to identify the factors affecting vehicle and pedestrian accidents on the busiest suburban highway of Guilan Province, in the north of Iran, and to provide the most accurate prediction model. To this end, the principal variables and the probability of occurrence of each crash category are analyzed and computed using factor analysis, logit, and machine learning approaches simultaneously. This method not only helps achieve a comprehensive and efficient model for specifying the major contributing factors, but also provides officials with suggestions for taking more precise measures to lessen accident impacts and improve road safety. Both the factor analysis and the logit model show the significant roles of exceeding the lawful speed, rainy weather, and driver age (30–50) in the severity of vehicle accidents. In pedestrian accidents, rainy weather and lighting conditions are the most important contributing factors to severity, which underlines the dominant role of environmental factors in the severity of all vehicle-pedestrian accidents. Moreover, relative to the two other methods used, the machine learning model has higher predictive power in all cases, especially for pedestrian accidents, with a 41.6% increase in predictive power for fatal accidents and 12.4% for all accidents. Thus, the Artificial Neural Network model is chosen as the superior approach for predicting the number and severity of crashes. In addition, the good performance and validity of the machine learning model are demonstrated through performance and sensitivity analysis.


2021 ◽  
Vol 14 (3) ◽  
pp. 119
Author(s):  
Fabian Waldow ◽  
Matthias Schnaubelt ◽  
Christopher Krauss ◽  
Thomas Günter Fischer

In this paper, we demonstrate how a well-established machine learning-based statistical arbitrage strategy can be successfully transferred from equity to futures markets. First, we preprocess futures time series comprised of front months to render them suitable for our returns-based trading framework and compile a data set comprised of 60 futures covering nearly 10 trading years. Next, we train several machine learning models to predict whether the h-day-ahead return of each future out- or underperforms the corresponding cross-sectional median return. Finally, we enter long/short positions for the top/flop-k futures for a duration of h days and assess the financial performance of the resulting portfolio in an out-of-sample testing period. Thereby, we find the machine learning models to yield statistically significant out-of-sample break-even transaction costs of 6.3 bp—a clear challenge to the semi-strong form of market efficiency. Finally, we discuss sources of profitability and the robustness of our findings.
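The labeling and portfolio-selection steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary inputs, and the use of a generic model score for ranking are all assumptions made for illustration.

```python
from statistics import median


def median_labels(returns):
    """Label each future 1 if its h-day-ahead return beats the
    cross-sectional median return, else 0.

    `returns` is a hypothetical dict: future name -> h-day-ahead return.
    """
    cross_sectional_median = median(returns.values())
    return {name: int(r > cross_sectional_median) for name, r in returns.items()}


def top_flop_k(scores, k):
    """Pick the k highest-scored futures (long) and k lowest-scored (short).

    `scores` is a hypothetical dict: future name -> model score, e.g. the
    predicted probability of outperforming the median.
    """
    if k < 1 or 2 * k > len(scores):
        raise ValueError("k must satisfy 1 <= k <= len(scores) / 2")
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], ranked[-k:]


# Illustrative numbers only, not data from the paper.
example_returns = {"A": 0.02, "B": -0.01, "C": 0.00, "D": 0.03}
example_scores = {"A": 0.7, "B": 0.2, "C": 0.4, "D": 0.9}
labels = median_labels(example_returns)
longs, shorts = top_flop_k(example_scores, k=1)
```

In this sketch the labels serve as training targets, while the long and short lists define the positions that would be held for h days.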


Author(s):  
Chen-Chih Chung ◽  
Oluwaseun Adebayo Bamodu ◽  
Chien-Tai Hong ◽  
Lung Chan ◽  
Hung-Wen Chiu

2021 ◽  
pp. 875697282199994
Author(s):  
Joseph F. Hair ◽  
Marko Sarstedt

Most project management research focuses almost exclusively on explanatory analyses. Evaluation of the explanatory power of statistical models is generally based on F-type statistics and the R² metric, followed by an assessment of the model parameters (e.g., beta coefficients) in terms of their significance, size, and direction. However, these measures are not indicative of a model’s predictive power, which is central for deriving managerial recommendations. We recommend that project management researchers routinely use additional metrics, such as the mean absolute error or the root mean square error, to accurately quantify their statistical models’ predictive power.
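As a brief illustration of the two metrics named in this abstract, the sketch below computes the mean absolute error and the root mean square error from their standard definitions. The function names and sample values are assumptions made for illustration and do not come from the paper.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared prediction error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(squared_errors) / len(actual))


# Illustrative hold-out values only; the errors are 1, 0, and 2.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]
mae = mean_absolute_error(actual, predicted)
rmse = root_mean_square_error(actual, predicted)
```

Both metrics are meaningful as measures of predictive power only when computed on observations that were held out from model estimation. Because RMSE squares each error before averaging, it penalizes large errors more heavily than MAE does.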

