Modelling of Stock Market Security Price Dynamics Using Market Microstructure Data

2018 ◽  
Vol 22 (5) ◽  
pp. 141-153
Author(s):  
N. A.  Bilev

Modern electronic stock exchanges make it possible to analyze event-driven market microstructure data. This data is highly informative and describes the physical process of price formation, which makes it possible to find complex patterns in price dynamics. Finding such patterns with handcrafted rules is difficult and time consuming; modern machine learning models, however, can discover them automatically by learning price behavior, which is always changing. The present study presents a profitable trading system based on a machine learning model and market microstructure data. The data for the research was collected from the Moscow Exchange (MICEX) and comprises the limit order book change log and all market trades of a liquid security over a certain period. A logistic regression model was used and compared to neural network models with different configurations. According to the study results, the logistic regression model has almost the same prediction quality as the neural network models but also has a high speed of response, which is very important for stock market trading. The developed trading system submits deals at a medium frequency, which lets it avoid the expensive infrastructure usually needed in high-frequency trading systems while still using the potential of high-quality market microstructure data to the full extent. This paper describes the entire process of trading system development, including feature engineering, model behavior comparison, and the creation of a trading strategy with testing on historical data.
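A minimal sketch (not the author's code) of the kind of comparison described above: a logistic regression classifier trained on handcrafted limit-order-book features to predict short-horizon price direction. The feature names and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000
# Hypothetical microstructure features: order-book imbalance and signed trade flow.
imbalance = rng.uniform(-1, 1, n)     # e.g. (bid_vol - ask_vol) / (bid_vol + ask_vol)
trade_flow = rng.normal(0, 1, n)      # e.g. signed volume of recent market trades
X = np.column_stack([imbalance, trade_flow])
# Synthetic label: price ticks up when imbalance plus flow is positive (plus noise).
y = (imbalance + 0.5 * trade_flow + rng.normal(0, 0.5, n) > 0).astype(int)

model = LogisticRegression().fit(X, y)
print(round(model.score(X, y), 2))    # in-sample accuracy on the synthetic data
```

A fitted logistic regression evaluates as a single dot product plus a sigmoid, which is why its response latency can be far lower than a neural network's, the trade-off the abstract highlights.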

10.2196/16374 ◽  
2020 ◽  
Vol 22 (3) ◽  
pp. e16374
Author(s):  
Subendhu Rongali ◽  
Adam J Rose ◽  
David D McManus ◽  
Adarsha S Bajracharya ◽  
Alok Kapoor ◽  
...  

Background Scalable and accurate health outcome prediction using electronic health record (EHR) data has gained much attention in research recently. Previous machine learning models mostly ignore relations between different types of clinical data (ie, laboratory components, International Classification of Diseases codes, and medications). Objective This study aimed to model such relations and build predictive models using the EHR data from intensive care units. We developed innovative neural network models and compared them with the widely used logistic regression model and other state-of-the-art neural network models to predict the patient’s mortality using their longitudinal EHR data. Methods We built a set of neural network models that we collectively call long short-term memory (LSTM) outcome prediction using comprehensive feature relations, or CLOUT for short. Our CLOUT models use a correlational neural network model to identify a latent space representation between different types of discrete clinical features during a patient’s encounter and integrate the latent representation into an LSTM-based predictive model framework. In addition, we designed an ablation experiment to identify risk factors from our CLOUT models. Using physicians’ input as the gold standard, we compared the risk factors identified by both CLOUT and logistic regression models. Results Experiments on the Medical Information Mart for Intensive Care-III dataset (selected patient population: 7537) show that CLOUT (area under the receiver operating characteristic curve=0.89) surpassed logistic regression (0.82) and other baseline neural network models (<0.86). In addition, physicians’ agreement with the CLOUT-derived risk factor rankings was statistically significantly higher than their agreement with the logistic regression model. Conclusions Our results support the applicability of CLOUT for real-world clinical use in identifying patients at high risk of mortality.
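A hedged sketch of the evaluation metric behind the Results above: area under the receiver operating characteristic curve (AUC) compared between a stronger and a weaker classifier. The scores here are synthetic stand-ins, not MIMIC-III data or the CLOUT model's outputs.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(1)
y_true = rng.integers(0, 2, 500)  # synthetic binary mortality labels
# Hypothetical predicted mortality probabilities from two models:
scores_strong = np.clip(y_true * 0.6 + rng.normal(0.2, 0.25, 500), 0, 1)
scores_weak = np.clip(y_true * 0.3 + rng.normal(0.35, 0.30, 500), 0, 1)

auc_strong = roc_auc_score(y_true, scores_strong)
auc_weak = roc_auc_score(y_true, scores_weak)
print(round(auc_strong, 2), round(auc_weak, 2))  # better separation -> higher AUC
```

AUC is threshold-free, which is why it is the usual headline number for mortality-risk models whose operating threshold is chosen later by clinicians.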



2021 ◽  
Vol 11 (14) ◽  
pp. 6594
Author(s):  
Yu-Chia Hsu

The interdisciplinary nature of sports and the presence of various systemic and non-systemic factors introduce challenges in predicting sports match outcomes using a single disciplinary approach. In contrast to previous studies that use sports performance metrics and statistical models, this study is the first to apply a deep learning approach from financial time series modeling to predict sports match outcomes. The proposed approach has two main components: a convolutional neural network (CNN) classifier for implicit pattern recognition and a logistic regression model for match outcome judgment. First, the raw data used in the prediction are derived from the betting market odds and actual scores of each game, which are transformed into sports candlesticks. Second, the CNN is used to classify the candlestick time series on a graphical basis. To this end, the original 1D time series are encoded into 2D matrix images using the Gramian angular field and are then fed into the CNN classifier. In this way, the winning probability of each matchup team can be derived from historically implied behavioral patterns. Third, to further consider the differences between strong and weak teams, the winning probability produced by the CNN classifier is adjusted with the logistic regression model, which then makes a final judgment regarding the match outcome. We empirically test this approach using data from 18,944 National Football League games spanning 32 years and find that using the individual historical data of each team in the CNN classifier for pattern recognition is better than using the data of all teams. The CNN in conjunction with the logistic regression judgment model outperforms the CNN in conjunction with SVM, Naïve Bayes, Adaboost, J48, and random forest, and its accuracy surpasses that of the betting market prediction.
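A minimal sketch of the Gramian angular field encoding step described above (the summation-field variant): a 1D series is rescaled to [-1, 1], mapped to polar-coordinate angles, and expanded into a 2D matrix that a CNN can consume as an image. The example series is illustrative, not the paper's candlestick data.

```python
import numpy as np

def gasf(series: np.ndarray) -> np.ndarray:
    """Gramian Angular Summation Field of a 1-D time series."""
    lo, hi = series.min(), series.max()
    x = 2 * (series - lo) / (hi - lo) - 1       # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))      # polar-coordinate angles
    return np.cos(phi[:, None] + phi[None, :])  # G[i, j] = cos(phi_i + phi_j)

series = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
img = gasf(series)
print(img.shape)  # (8, 8): one "pixel" per pair of time steps
```

The resulting matrix is symmetric and preserves temporal ordering along its diagonal, which is what lets a 2D image classifier pick up patterns that are hard to see in the raw 1D series.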


Author(s):  
Byunghyun Kang ◽  
Cheol Choi ◽  
Daeun Sung ◽  
Seongho Yoon ◽  
Byoung-Ho Choi

In this study, friction tests are performed, via a custom-built friction tester, on specimens of natural rubber used in automotive suspension bushings. By analyzing the problematic suspension bushings, eleven candidate factors that influence squeak noise are selected: surface lubrication, hardness, vulcanization condition, surface texture, additive content, sample thickness, thermal aging, temperature, surface moisture, friction speed, and normal force. Through friction tests, changes in frictional force and squeak noise occurrence are investigated across various levels of the influencing factors. The degree of correlation of frictional force and squeak noise occurrence with these factors is determined through statistical tests, and the relationship between frictional force and squeak noise occurrence is discussed based on the test results. Squeak noise prediction models are constructed by considering the interactions among the influencing factors through both multiple logistic regression and neural network analysis. The accuracies of the two prediction models are evaluated by comparing predicted and measured results. The accuracies of the multiple logistic regression and neural network models in predicting the occurrence of squeak noise are 88.2% and 87.2%, respectively.
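A hedged sketch of a multiple logistic regression with pairwise interaction terms, the modeling approach described above; the factor names and data are illustrative stand-ins, not the paper's measurements.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(2)
n = 400
# Three hypothetical influencing factors (e.g. temperature, normal force, friction speed).
X = rng.normal(size=(n, 3))
# Synthetic squeak label driven partly by an interaction between the first two factors.
y = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, n) > 0).astype(int)

# interaction_only=True adds pairwise products (x0*x1, x0*x2, x1*x2) as features,
# letting the logistic regression capture factor interactions.
model = make_pipeline(
    PolynomialFeatures(degree=2, interaction_only=True, include_bias=False),
    LogisticRegression(max_iter=1000),
).fit(X, y)
print(round(model.score(X, y), 2))
```

Without the interaction features, a plain linear logistic model could not separate these classes, which mirrors why the study models interactions among the influencing factors explicitly.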


2021 ◽  
Author(s):  
Li Lu Wei ◽  
Yu jian

Abstract Background Hypertension is a common chronic disease worldwide and a common underlying disease of cardiovascular and cerebral complications. Overweight and obesity are high-risk factors for hypertension. In this study, three statistical methods, the classification tree model, the logistic regression model, and the BP neural network, were used to screen the risk factors of hypertension in the overweight and obese population and to analyze the interactions among these risk factors. Early detection, diagnosis, and treatment of hypertension reduce the risk of its complications, so this analysis has clinical significance. Methods The classification tree model, logistic regression model, and BP neural network model were used to screen the risk factors of hypertension in overweight and obese people. The specificity, sensitivity, and accuracy of the three models were evaluated by the receiver operating characteristic (ROC) curve. Finally, the classification tree CRT model was used to screen the risk factors related to hypertension in overweight and obese people, and the unconditional logistic regression multiplication model was used to quantitatively analyze the interaction. Results The Youden indices of the ROC curves of the classification tree model, logistic regression model, and BP neural network model were 39.20%, 37.02%, and 34.85%; the sensitivities were 61.63%, 76.59%, and 82.85%; the specificities were 77.58%, 60.44%, and 52.00%; and the areas under the curve (AUC) were 0.721, 0.734, and 0.733, respectively. There was no significant difference in AUC between the three models (P>0.05). The classification tree CRT model and the logistic regression multiplication model suggested that the interaction between NAFLD and FPG was closely related to the prevalence of hypertension in overweight and obese people. Conclusion NAFLD, FPG, age, TG, UA, and LDL-C were risk factors for hypertension in overweight and obese people. The interaction between NAFLD and FPG increased the risk of hypertension.
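The reported Youden indices can be cross-checked against the reported sensitivities and specificities, since the Youden index is J = sensitivity + specificity - 1. A quick arithmetic sketch using the figures from the Results:

```python
# Reported (sensitivity, specificity) pairs for the three models.
models = {
    "classification tree": (0.6163, 0.7758),  # reported Youden index 39.20%
    "logistic regression": (0.7659, 0.6044),  # reported Youden index 37.02%
    "BP neural network":   (0.8285, 0.5200),  # reported Youden index 34.85%
}
youden = {name: se + sp - 1 for name, (se, sp) in models.items()}
for name, j in youden.items():
    print(f"{name}: J = {j:.4f}")
```

The computed values (0.3921, 0.3703, 0.3485) agree with the reported percentages to within rounding, confirming the internal consistency of the reported figures.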

