Exploring Statistical Arbitrage Opportunities Using Machine Learning Strategy

Author(s):  
Baoqiang Zhan ◽  
Shu Zhang ◽  
Helen S. Du ◽  
Xiaoguang Yang
2021 ◽  
Vol 14 (3) ◽  
pp. 119
Author(s):  
Fabian Waldow ◽  
Matthias Schnaubelt ◽  
Christopher Krauss ◽  
Thomas Günter Fischer

In this paper, we demonstrate how a well-established machine learning-based statistical arbitrage strategy can be successfully transferred from equity to futures markets. First, we preprocess futures time series composed of front months to render them suitable for our returns-based trading framework and compile a data set of 60 futures covering nearly 10 trading years. Next, we train several machine learning models to predict whether the h-day-ahead return of each future outperforms or underperforms the corresponding cross-sectional median return. We then enter long/short positions for the top/flop-k futures for a duration of h days and assess the financial performance of the resulting portfolio in an out-of-sample testing period. We find that the machine learning models yield statistically significant out-of-sample break-even transaction costs of 6.3 bp, a clear challenge to the semi-strong form of market efficiency. Finally, we discuss sources of profitability and the robustness of our findings.
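The labelling and portfolio steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy returns and scores, the equal weighting within each leg, and the convention of reporting the long leg minus the short leg are all assumptions made for illustration.

```python
from statistics import median


def median_labels(future_returns):
    """Label each future 1 if its h-day-ahead return is strictly above
    the cross-sectional median of all futures on that day, else 0."""
    med = median(future_returns.values())
    return {name: int(r > med) for name, r in future_returns.items()}


def long_short_return(scores, realized_returns, k):
    """Go long the k highest-scored futures and short the k lowest-scored.

    Assumption: equal weights within each leg, and the portfolio return
    is the long-leg mean minus the short-leg mean.
    """
    if k < 1 or 2 * k > len(scores):
        raise ValueError("k must satisfy 1 <= k <= len(scores) // 2")
    ranked = sorted(scores, key=scores.get, reverse=True)
    longs, shorts = ranked[:k], ranked[-k:]
    long_leg = sum(realized_returns[n] for n in longs) / k
    short_leg = sum(realized_returns[n] for n in shorts) / k
    return long_leg - short_leg, longs, shorts


# Hypothetical h-day-ahead returns and model scores for five futures.
realized = {"A": 0.020, "B": -0.010, "C": 0.005, "D": -0.030, "E": 0.012}
scores = {"A": 0.81, "B": 0.35, "C": 0.55, "D": 0.20, "E": 0.62}

labels = median_labels(realized)
portfolio_return, longs, shorts = long_short_return(scores, realized, k=2)
print(labels, longs, shorts, round(portfolio_return, 4))
```

In this toy example the future whose return equals the median is labelled 0; how ties are handled in the paper is not stated in the abstract.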


2021 ◽  
Vol 209 ◽  
pp. 104493
Author(s):  
Haili Liao ◽  
Hanyu Mei ◽  
Gang Hu ◽  
Bo Wu ◽  
Qi Wang

2021 ◽  
Author(s):  
Tom Young ◽  
Tristan Johnston-Wood ◽  
Volker L. Deringer ◽  
Fernanda Duarte

Predictive molecular simulations require fast, accurate and reactive interatomic potentials. Machine learning offers a promising approach to construct such potentials by fitting energies and forces to high-level quantum-mechanical data, but...


2002 ◽  
Vol 17 (2) ◽  
pp. 28-35 ◽  
Author(s):  
P. Baldi ◽  
G. Pollastri

Author(s):  
Francesc López Seguí ◽  
Ricardo Ander Egg Aguilar ◽  
Gabriel de Maeztu ◽  
Anna García-Altés ◽  
Francesc García Cuyàs ◽  
...  

Background: the primary care service in Catalonia has operated an asynchronous teleconsulting service between GPs and patients since 2015 (eConsulta), which has generated some 500,000 messages. New developments in big data analysis tools, particularly those involving natural language, can be used to accurately and systematically evaluate the impact of the service. Objective: the study was intended to examine the predictive potential of eConsulta messages through different combinations of vector representation of text and machine learning algorithms and to evaluate their performance. Methodology: 20 machine learning algorithms (based on 5 types of algorithms and 4 text representation techniques) were trained using a sample of 3,559 messages (169,102 words) corresponding to 2,268 teleconsultations (1.57 messages per teleconsultation) in order to predict the three variables of interest (avoiding the need for a face-to-face visit, increased demand and type of use of the teleconsultation). The performance of the various combinations was measured in terms of precision, sensitivity, F-value and the ROC curve. Results: the best-trained algorithms are generally effective, proving themselves to be more robust when approximating the two binary variables "avoiding the need for a face-to-face visit" and "increased demand" (precision = 0.98 and 0.97, respectively) than the variable "type of query" (precision = 0.48). Conclusion: to the best of our knowledge, this study is the first to investigate a machine learning strategy for text classification using primary care teleconsultation datasets. The study illustrates the possible capacities of text analysis using artificial intelligence. The development of a robust text classification tool could be feasible by validating it with more data, making it potentially more useful for decision support for health professionals.
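As an illustration of the evaluation measures named in this abstract, the sketch below computes precision, sensitivity and an F-value for one binary target. The abstract does not define its F-value, so the balanced F1 (harmonic mean of precision and sensitivity) is assumed here; the function name and the label vectors are hypothetical and do not come from the study.

```python
def binary_metrics(y_true, y_pred):
    """Return (precision, sensitivity, f_value) for binary labels in {0, 1}.

    Assumption: the F-value is the balanced F1 score.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f_value = 2 * precision * sensitivity / denom if denom else 0.0
    return precision, sensitivity, f_value


# Hypothetical labels for a binary target such as "increased demand".
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]

precision, sensitivity, f_value = binary_metrics(y_true, y_pred)
print(round(precision, 2), round(sensitivity, 2), round(f_value, 2))
```

Reporting precision alongside sensitivity matters for a multi-class target such as "type of query", where a single precision figure can hide very uneven per-class behaviour.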


2020 ◽  
Vol 67 (4) ◽  
pp. 1575-1580 ◽  
Author(s):  
Kyul Ko ◽  
Jang Kyu Lee ◽  
Hyungcheol Shin
