Modelling of Stock Market Security Price Dynamics Using Market Microstructure Data

2018 ◽  
Vol 22 (5) ◽  
pp. 141-153
Author(s):  
N. A.  Bilev

Modern electronic stock exchanges make it possible to analyze event-driven market microstructure data. This data is highly informative and describes the physical process of price formation, which makes it possible to find complex patterns in price dynamics. Finding such patterns with handcrafted rules is difficult and time consuming; modern machine learning models, however, can discover them automatically by learning price behavior, which is always changing. The present study presents a profitable trading system based on a machine learning model and market microstructure data. The data for the research was collected from the Moscow Exchange (MICEX) and comprises the limit order book change log and all market trades of a liquid security over a certain period. A logistic regression model was used and compared to neural network models with different configurations. According to the study results, the logistic regression model has almost the same prediction quality as the neural network models but also has a high speed of response, which is very important for stock market trading. The developed trading system submits deals at a medium frequency, which lets it avoid the expensive infrastructure usually needed in high-frequency trading systems while still using the potential of high-quality market microstructure data to the full extent. This paper describes the entire process of trading system development, including feature engineering, model behavior comparison, and the creation of a trading strategy with testing on historical data.
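A minimal sketch (not the author's code) of the kind of comparison described above: a logistic regression classifier trained on handcrafted limit-order-book features to predict short-horizon price direction. The feature names and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
# Hypothetical microstructure features: order-book imbalance and signed trade flow.
imbalance = rng.uniform(-1, 1, n)     # e.g. (bid_vol - ask_vol) / (bid_vol + ask_vol)
trade_flow = rng.normal(0, 1, n)      # e.g. signed volume of recent market trades
X = np.column_stack([imbalance, trade_flow])
# Synthetic label: price ticks up when imbalance plus flow is positive (plus noise).
y = (imbalance + 0.5 * trade_flow + rng.normal(0, 0.5, n) > 0).astype(int)

model = LogisticRegression().fit(X, y)
print(round(model.score(X, y), 2))    # in-sample accuracy on the synthetic data
```

A fitted logistic regression evaluates as a single dot product plus a sigmoid, which is why its response latency can be far lower than a neural network's, the trade-off the abstract highlights.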

10.2196/16374 ◽  
2020 ◽  
Vol 22 (3) ◽  
pp. e16374
Author(s):  
Subendhu Rongali ◽  
Adam J Rose ◽  
David D McManus ◽  
Adarsha S Bajracharya ◽  
Alok Kapoor ◽  
...  

Background Scalable and accurate health outcome prediction using electronic health record (EHR) data has gained much attention in research recently. Previous machine learning models mostly ignore relations between different types of clinical data (ie, laboratory components, International Classification of Diseases codes, and medications). Objective This study aimed to model such relations and build predictive models using the EHR data from intensive care units. We developed innovative neural network models and compared them with the widely used logistic regression model and other state-of-the-art neural network models to predict the patient’s mortality using their longitudinal EHR data. Methods We built a set of neural network models that we collectively call long short-term memory (LSTM) outcome prediction using comprehensive feature relations, or CLOUT for short. Our CLOUT models use a correlational neural network model to identify a latent space representation between different types of discrete clinical features during a patient’s encounter and integrate the latent representation into an LSTM-based predictive model framework. In addition, we designed an ablation experiment to identify risk factors from our CLOUT models. Using physicians’ input as the gold standard, we compared the risk factors identified by both CLOUT and logistic regression models. Results Experiments on the Medical Information Mart for Intensive Care-III dataset (selected patient population: 7537) show that CLOUT (area under the receiver operating characteristic curve=0.89) surpassed logistic regression (0.82) and other baseline neural network models (<0.86). In addition, physicians’ agreement with the CLOUT-derived risk factor rankings was statistically significantly higher than their agreement with the logistic regression model. Conclusions Our results support the applicability of CLOUT for real-world clinical use in identifying patients at high risk of mortality.
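A hedged sketch of the evaluation metric behind the Results above: area under the receiver operating characteristic curve (AUC) compared between a stronger and a weaker classifier. The scores here are synthetic stand-ins, not MIMIC-III data or the CLOUT model's outputs.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 500)  # synthetic binary mortality labels
# Hypothetical predicted mortality probabilities from two models:
scores_strong = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, 500), 0, 1)
scores_weak = np.clip(y_true * 0.3 + rng.normal(0.35, 0.30, 500), 0, 1)

auc_strong = roc_auc_score(y_true, scores_strong)
auc_weak = roc_auc_score(y_true, scores_weak)
print(round(auc_strong, 2), round(auc_weak, 2))  # better separation -> higher AUC
```

AUC is threshold-free, which is why it is the usual headline number for mortality-risk models whose operating threshold is chosen later by clinicians.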



2021 ◽  
Vol 11 (14) ◽  
pp. 6594
Author(s):  
Yu-Chia Hsu

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single disciplinary approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach from financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, the CNN is used to classify the candlestick time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using the Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each matchup team can be derived from historically implied behavioral patterns. Third, to further consider the differences between strong and weak teams, the winning probability produced by the CNN classifier is adjusted with the logistic regression model, which then makes a final judgment regarding the match outcome. We empirically test this approach using data from 18,944 National Football League games spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of the betting market prediction.
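A minimal sketch of the Gramian angular field encoding step described above (the summation-field variant): a 1D series is rescaled to [-1, 1], mapped to polar-coordinate angles, and expanded into a 2D matrix that a CNN can consume as an image. The example series is illustrative, not the paper's candlestick data.

```python
import numpy as np

def gasf(series: np.ndarray) -> np.ndarray:
    """Gramian Angular Summation Field of a 1-D time series."""
    lo, hi = series.min(), series.max()
    x = 2 * (series - lo) / (hi - lo) - 1       # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))      # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])  # G[i, j] = cos(phi_i + phi_j)

series = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
img = gasf(series)
print(img.shape)  # (8, 8): one "pixel" per pair of time steps
```

The resulting matrix is symmetric and preserves temporal ordering along its diagonal, which is what lets a 2D image classifier pick up patterns that are hard to see in the raw 1D series.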


Author(s):  
Byunghyun Kang ◽  
Cheol Choi ◽  
Daeun Sung ◽  
Seongho Yoon ◽  
Byoung-Ho Choi

In this study, friction tests are performed, via a custom-built friction tester, on specimens of natural rubber used in automotive suspension bushings. By analyzing the problematic suspension bushings, eleven candidate factors that influence squeak noise are selected: surface lubrication, hardness, vulcanization condition, surface texture, additive content, sample thickness, thermal aging, temperature, surface moisture, friction speed, and normal force. Through friction tests, changes in frictional force and squeak noise occurrence are investigated across various levels of the influencing factors. The degree of correlation of frictional force and squeak noise occurrence with these factors is determined through statistical tests, and the relationship between frictional force and squeak noise occurrence is discussed based on the test results. Squeak noise prediction models are constructed by considering the interactions among the influencing factors through both multiple logistic regression and neural network analysis. The accuracies of the two prediction models are evaluated by comparing predicted and measured results. The accuracies of the multiple logistic regression and neural network models in predicting the occurrence of squeak noise are 88.2% and 87.2%, respectively.
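A hedged sketch of a multiple logistic regression with pairwise interaction terms, the modeling approach described above; the factor names and data are illustrative stand-ins, not the paper's measurements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
n = 400
# Three hypothetical influencing factors (e.g. temperature, normal force, friction speed).
X = rng.normal(size=(n, 3))
# Synthetic squeak label driven partly by an interaction between the first two factors.
y = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, n) > 0).astype(int)

# interaction_only=True adds pairwise products (x0*x1, x0*x2, x1*x2) as features,
# letting the logistic regression capture factor interactions.
model = make_pipeline(
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    LogisticRegression(max_iter=1000),
).fit(X, y)
print(round(model.score(X, y), 2))
```

Without the interaction features, a plain linear logistic model could not separate these classes, which mirrors why the study models interactions among the influencing factors explicitly.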


2021 ◽  
Author(s):  
Li Lu Wei ◽  
Yu jian

Abstract Background Hypertension is a common chronic disease worldwide and a common underlying disease of cardiovascular and cerebral complications. Overweight and obesity are high-risk factors for hypertension. In this study, three statistical methods, the classification tree model, the logistic regression model, and the BP neural network, were used to screen the risk factors of hypertension in the overweight and obese population and to analyze the interactions among these risk factors. Early detection, diagnosis, and treatment of hypertension reduce the risk of its complications, so this analysis has clinical significance. Methods The classification tree model, logistic regression model, and BP neural network model were used to screen the risk factors of hypertension in overweight and obese people. The specificity, sensitivity, and accuracy of the three models were evaluated by the receiver operating characteristic (ROC) curve. Finally, the classification tree CRT model was used to screen the risk factors related to hypertension in overweight and obese people, and the unconditional logistic regression multiplication model was used to quantitatively analyze the interaction. Results The Youden indices of the ROC curves of the classification tree model, logistic regression model, and BP neural network model were 39.20%, 37.02%, and 34.85%; the sensitivities were 61.63%, 76.59%, and 82.85%; the specificities were 77.58%, 60.44%, and 52.00%; and the areas under the curve (AUC) were 0.721, 0.734, and 0.733, respectively. There was no significant difference in AUC between the three models (P>0.05). The classification tree CRT model and the logistic regression multiplication model suggested that the interaction between NAFLD and FPG was closely related to the prevalence of hypertension in overweight and obese people. Conclusion NAFLD, FPG, age, TG, UA, and LDL-C were risk factors for hypertension in overweight and obese people. The interaction between NAFLD and FPG increased the risk of hypertension.
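The reported Youden indices can be cross-checked against the reported sensitivities and specificities, since the Youden index is J = sensitivity + specificity - 1. A quick arithmetic sketch using the figures from the Results:

```python
# Reported (sensitivity, specificity) pairs for the three models.
models = {
    "classification tree": (0.6163, 0.7758),  # reported Youden index 39.20%
    "logistic regression": (0.7659, 0.6044),  # reported Youden index 37.02%
    "BP neural network":   (0.8285, 0.5200),  # reported Youden index 34.85%
}
youden = {name: se + sp - 1 for name, (se, sp) in models.items()}
for name, j in youden.items():
    print(f"{name}: J = {j:.4f}")
```

The computed values (0.3921, 0.3703, 0.3485) agree with the reported percentages to within rounding, confirming the internal consistency of the reported figures.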

