Forecasting the Market with Machine Learning Algorithms: An Application of NMC-BERT-LSTM-DQN-X Algorithm in Quantitative Trading

Chang Liu; Jie Yan; Feiyue Guo; Min Guo

doi:10.1145/3488378

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3488378 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-22

Author(s):

Chang Liu ◽

Jie Yan ◽

Feiyue Guo ◽

Min Guo

Keyword(s):

Machine Learning ◽

Stock Market ◽

Mean Square Error ◽

Short Term Memory ◽

The State ◽

Machine Learning Algorithms ◽

Future Market ◽

Mean Square ◽

Market Data ◽

Market Trends

Although machine learning (ML) algorithms have been widely used to forecast the trend of stock market indices, they have failed to consider three aspects that are crucial for market forecasting: (1) investors’ emotions and attitudes toward future market trends have a material impact on market trend forecasting; (2) the length of past market data should be dynamically adjusted according to the market status; and (3) the transition of market states should be considered when forecasting market trends. In this study, we propose an innovative ML method to forecast China's stock market trends that addresses these three issues. Specifically, sentimental factors (see Appendix [1] for full trans) were first collected to measure investors’ emotions and attitudes. Then, a non-stationary Markov chain (NMC) model was used to capture dynamic transitions of market states. We chose the state-of-the-art (SOTA) method, namely Bidirectional Encoder Representations from Transformers (BERT), to predict the state of the market at time t, and a long short-term memory (LSTM) model was used to estimate the varying length of past market data in market trend prediction, where the input of the LSTM (the state of the market at time t) was the output of BERT, and the probabilities for opening and closing the gates in the LSTM model were based on outputs of the NMC model. Finally, the optimum parameters of the proposed algorithm were calculated using a reinforcement learning-based deep Q-network (DQN). Compared to existing forecasting methods, the proposed algorithm achieves better results, with a forecasting accuracy of 61.77%, an annualized return of 29.25%, and a maximum loss of −8.29%. Furthermore, the proposed model achieved the lowest forecasting errors: mean square error (0.095), root mean square error (0.0739), mean absolute error (0.104), and mean absolute percent error (15.1%). As a result, the proposed market forecasting model can help investors obtain more accurate market forecast information.
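
The trading figures quoted above (forecasting accuracy, annualized return, maximum loss) can be illustrated with a minimal Python sketch. The abstract does not define these quantities, so this assumes simple daily returns, 252 trading days per year, geometric compounding, and that "maximum loss" means maximum drawdown. All function names are hypothetical and not taken from the paper.

```python
from typing import Sequence


def directional_accuracy(predicted_up: Sequence[bool], actual_up: Sequence[bool]) -> float:
    """Fraction of periods in which the predicted direction matched the actual one."""
    hits = sum(p == a for p, a in zip(predicted_up, actual_up))
    return hits / len(actual_up)


def annualized_return(daily_returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Geometric annualized return from simple per-period returns.

    Assumption: 252 trading periods per year; the paper's convention is unknown.
    """
    growth = 1.0
    for r in daily_returns:
        growth *= 1.0 + r
    years = len(daily_returns) / periods_per_year
    return growth ** (1.0 / years) - 1.0


def max_drawdown(daily_returns: Sequence[float]) -> float:
    """Largest peak-to-trough decline of the equity curve, as a negative fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in daily_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)          # highest equity seen so far
        worst = min(worst, equity / peak - 1.0)
    return worst
```

For example, the return sequence `[0.10, -0.50, 0.20]` takes the equity curve from 1.10 down to 0.55, so `max_drawdown` reports a decline of one half from the peak.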

Predicting the Mechanical Properties of RCA-Based Concrete Using Supervised Machine Learning Algorithms

Materials ◽

10.3390/ma15020647 ◽

2022 ◽

Vol 15 (2) ◽

pp. 647

Author(s):

Meijun Shang ◽

Hejun Li ◽

Ayaz Ahmad ◽

Waqas Ahmad ◽

Krzysztof Adam Ostrowski ◽

...

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Mean Square Error ◽

Coarse Aggregate ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Environmental Damage ◽

Fine Aggregate ◽

Mean Square ◽

The Impact

Environment-friendly concrete is gaining popularity because it consumes less energy and causes less damage to the environment. Rapid growth in population and in demand for construction throughout the world is depleting natural resources. Meanwhile, construction waste continues to grow at a high rate as older buildings are demolished. As a result, the use of recycled materials may contribute to improving the quality of life and preventing environmental damage, and the application of recycled coarse aggregate (RCA) in concrete is essential for minimizing environmental issues. In this article, the compressive strength (CS) and splitting tensile strength (STS) of concrete containing RCA are predicted using decision tree (DT) and AdaBoost machine learning (ML) techniques. A total of 344 data points with nine input variables (water, cement, fine aggregate, natural coarse aggregate, RCA, superplasticizers, water absorption of RCA, maximum size of RCA, and density of RCA) were used to run the models. The models were validated using k-fold cross-validation and the coefficient of determination (R2), mean square error (MSE), mean absolute error (MAE), and root mean square error (RMSE). The models’ performance was also assessed using statistical checks. Additionally, sensitivity analysis was used to determine the impact of each variable on the forecasting of mechanical properties.
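
The validation procedure named above (k-fold cross-validation scored with R2, MSE, MAE, and RMSE) can be sketched in plain Python as follows. The number of folds and any shuffling used by the authors are not stated, so the contiguous, unshuffled folds here are an assumption, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds; return (train, test) index lists.

    Assumption: no shuffling. In practice the rows are usually shuffled first.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """R2, MSE, RMSE, and MAE for one fold (y_true must not be constant)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot         # 1 - SS_res / SS_tot
    return {"R2": r2, "MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}
```

With 344 samples and, say, `k = 10`, the first four folds hold 35 test points and the remaining six hold 34; each model is trained on the `train` indices and scored on the `test` indices, and the per-fold metrics are then averaged.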

Model Evaluation for Forecasting Traffic Accident Severity in Rainy Seasons Using Machine Learning Algorithms: Seoul City Study

Applied Sciences ◽

10.3390/app10010129 ◽

2019 ◽

Vol 10 (1) ◽

pp. 129 ◽

Cited By ~ 3

Author(s):

Jonghak Lee ◽

Taekwan Yoon ◽

Sangil Kwon ◽

Jongtae Lee

Keyword(s):

Machine Learning ◽

Random Forest ◽

Linear Regression ◽

Mean Square Error ◽

Model Evaluation ◽

Traffic Accident ◽

Negative Binomial ◽

Machine Learning Algorithms ◽

Mean Square ◽

Road Geometry

There have been numerous studies on traffic accidents and their severity, particularly in relation to weather conditions and road geometry. In these studies, traditional statistical methods have been employed, such as linear regression, logistic regression, and negative binomial regression modeling, which are the most common linear and non-linear regression analysis methods. In this research, machine learning architecture was applied to this problem using the random forest, artificial neural network, and decision tree techniques to ascertain the strengths and weaknesses of these methods. Three data sets were used: road geometry data, precipitation data, and traffic accident data over nine years corresponding to the Naebu Expressway, which is located in Seoul, Korea. For the model evaluation, three measures were employed: the out-of-bag estimate of error rate (OOB), mean square error (MSE), and root mean square error (RMSE). The low mean OOB, MSE, and RMSE observed in the results obtained using the proposed random forest model demonstrate its accuracy.

Unsupervised Learning Based Stock Price Recommendation using Collaborative Filtering

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k1932.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 2051-2055

Keyword(s):

Machine Learning ◽

Stock Market ◽

Collaborative Filtering ◽

Portfolio Management ◽

Recommender System ◽

Stock Price ◽

Stock Exchange ◽

Performance Measure ◽

Machine Learning Algorithms ◽

Market Data

In this study, 17 sets of stock market data were adopted for long-term prediction of stock prices. Nowadays, stock market data play a significant role in investment decisions in portfolio management. Various non-linear algorithms and statistical models are used for forecasting financial data. In this article, we apply a recommender system for this purpose. We primarily focus on the use of machine learning algorithms for developing a stock market data recommender system. Machine learning has become a widely used tool in financial recommendation systems. Here we consider as data the daily equity trading of the Nifty 50 on the National Stock Exchange (NSE), covering 50 companies in 10 different sectors over around 5986 days of transactions. We adopt the k-Nearest Neighbors (k-NN) classification algorithm to classify users in a user-based recommender system. A collaborative filtering method is used to recommend stocks, and performance is measured through RMSE and R2. The results also reveal that the k-NN algorithm shows higher accuracy compared to other existing methods.
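
A user-based k-NN collaborative filter of the general kind described above can be sketched as follows. The abstract does not specify the similarity measure or the data layout, so cosine similarity over co-scored items and the `user -> {item: score}` dictionary are assumptions, and all names are hypothetical.

```python
import math
from typing import Dict

Ratings = Dict[str, Dict[str, float]]  # user -> {stock symbol: score}; hypothetical layout


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users have scored."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = math.sqrt(sum(b[i] ** 2 for i in shared))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_score(ratings: Ratings, user: str, item: str, k: int = 2) -> float:
    """Similarity-weighted average score from the k most similar users who scored `item`."""
    neighbours = [
        (cosine_similarity(ratings[user], other_scores), other_scores[item])
        for other, other_scores in ratings.items()
        if other != user and item in other_scores
    ]
    top_k = sorted(neighbours, reverse=True)[:k]   # highest similarity first
    weight = sum(sim for sim, _ in top_k)
    if weight == 0.0:
        return 0.0                                  # no usable neighbours
    return sum(sim * score for sim, score in top_k) / weight
```

Predicted scores for held-out items can then be compared with the true scores using RMSE and R2, the two measures the abstract mentions.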

Different Techniques used in Stock Market Prediction

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.e9275.069520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 60-62

Keyword(s):

Machine Learning ◽

Time Series ◽

Stock Market ◽

Time Series Data ◽

Short Term Memory ◽

Moving Average ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Series Data ◽

Autoregressive Integrated Moving Average

The stock market has been one of the primary revenue streams for many people for years. The stock market is often unpredictable and uncertain; predicting its ups and downs is therefore an uphill task even for financial experts, who have been trying to tackle it with little success. It is now becoming possible to predict stock markets, however, owing to rapid improvements in technology that have led to better processing speed and more accurate algorithms. It is necessary to abandon the misconception that stock market prediction is only meant for people with expertise in finance; an application can therefore be developed to guide the user about the tempo of the stock market and the risk associated with it. The prediction of prices in the stock market is a complicated task, and various techniques are used to solve the problem; this paper investigates some of these techniques and compares the accuracy of each method. Forecasting time series data is an important topic in economics, statistics, finance, and business. Of the many techniques for forecasting time series data, such as the Autoregressive, Moving Average, and Autoregressive Integrated Moving Average (ARIMA) models, ARIMA has higher accuracy and precision than the other methods. With recent advances in the computational power of processors and in machine learning and deep learning techniques, new algorithms can be developed to tackle the problem of predicting the stock market. This paper investigates one such machine learning algorithm for forecasting time series data, Long Short-Term Memory (LSTM). It is compared with traditional algorithms such as the ARIMA method to determine how much the LSTM improves on the traditional methods for predicting the stock market.
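
As an illustration of the autoregressive family mentioned above, the following Python sketch fits an AR(1) model, which is the special case ARIMA(1, 0, 0) with no differencing and no moving-average terms, by ordinary least squares. It is a simplified stand-in and not the ARIMA configuration used in the paper, which the abstract does not give; the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_ar1(series: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi).

    Requires at least three points and a non-constant series.
    """
    x_prev = series[:-1]
    x_next = series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    phi = cov / var
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_ar1(series: Sequence[float], steps: int) -> List[float]:
    """Iterate the fitted recurrence forward `steps` periods from the last observation."""
    c, phi = fit_ar1(series)
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        forecasts.append(last)
    return forecasts
```

A baseline of this kind gives the reference error against which an LSTM forecast on the same held-out period would be compared.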

Predicting stock market trends using machine learning algorithms via public sentiment and political situation analysis

Soft Computing ◽

10.1007/s00500-019-04347-y ◽

2019 ◽

Vol 24 (15) ◽

pp. 11019-11043 ◽

Cited By ~ 4

Author(s):

Wasiat Khan ◽

Usman Malik ◽

Mustansar Ali Ghazanfar ◽

Muhammad Awais Azam ◽

Khaled H. Alyoubi ◽

...

Keyword(s):

Machine Learning ◽

Stock Market ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Situation Analysis ◽

Political Situation ◽

Public Sentiment ◽

Market Trends

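Prediksi Data Time Series Saham Bank BRI Dengan Mesin Belajar LSTM (Long ShortTerm Memory) [Prediction of Bank BRI Stock Time Series Data with the LSTM (Long Short-Term Memory) Machine Learning Method]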

Journal of Informatic and Information Security ◽

10.31599/jiforty.v1i1.133 ◽

2020 ◽

Vol 1 (1) ◽

pp. 1-8

Author(s):

Adhitio Satyo Bayangkari Karno

Keyword(s):

Machine Learning ◽

Time Series ◽

Root Mean Square Error ◽

Mean Square Error ◽

Root Mean Square ◽

Short Term Memory ◽

Mean Square ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

This study aims to measure the accuracy of predicting time series data using the LSTM (Long Short-Term Memory) machine learning method and to determine the number of epochs needed to produce a small RMSE (Root Mean Square Error) value. The results show that the RMSE value varies strongly with the number of epochs used in data processing, which makes it difficult to choose the right number of epochs. By iterating the LSTM process over different numbers of epochs (visualized in a graph), the number of epochs with the minimum RMSE value becomes easier to find. For the prediction of BBRI's stock data, a good RMSE value was obtained (RMSE = 227.470333244533). Keywords: long short-term memory, machine learning, epoch, root mean square error, mean square error.

Reinforced XGBoost machine learning model for sustainable intelligent agrarian applications

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-200862 ◽

2020 ◽

Vol 39 (5) ◽

pp. 7605-7620 ◽

Cited By ~ 1

Author(s):

Dhivya Elavarasan ◽

Durai Raj Vincent

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Mean Square Error ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

The Other ◽

Gradient Boosting ◽

Model Assessment ◽

Mean Square ◽

Extreme Gradient Boosting

Developments in science and technical intelligence have produced an extensive amount of data from various fields of agriculture. This creates a need to examine the available data and integrate it with processes such as crop enhancement, yield prediction, and examination of plant infections. Machine learning has surged with powerful processing techniques to perceive new opportunities in multi-disciplinary agrarian advancements. In this paper, a novel hybrid regression algorithm, reinforced extreme gradient boosting, is proposed, which displays substantially improved performance over traditional machine learning algorithms such as artificial neural networks, deep Q-Network, gradient boosting, random forest, and decision tree. Extreme gradient boosting constructs new models, which are essentially decision trees that learn from the mistakes of their predecessors by optimizing the loss function through gradient descent. The proposed hybrid model performs reinforcement learning at every node during the node splitting process of decision tree construction. This leads to effective utilization of the samples by selecting the appropriate split attribute for enhanced performance. The model’s performance is evaluated by means of Mean Square Error, Root Mean Square Error, Mean Absolute Error, and Coefficient of Determination. To ensure a fair assessment of the results, the model assessment is performed on both the training and test datasets. The regression diagnostic plots from residuals and the results obtained show that the proposed hybrid approach performs better than the other machine learning algorithms, with a reduced error measure and an improved accuracy of 94.15%. The probability density function for the proposed model also shows that it preserves the actual distributional characteristics of the original crop yield data more closely than the other machine learning models tested.
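
The idea that each new tree learns from the mistakes of its predecessors can be illustrated with a minimal Python sketch of plain gradient boosting for squared error, using depth-1 trees (stumps) on a single feature. This is neither XGBoost itself, which adds regularization and second-order information, nor the reinforced variant proposed in the paper; the function names and default settings are hypothetical.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[float, float, float]        # (threshold, value_if_left, value_if_right)
Model = Tuple[float, float, List[Stump]]  # (base prediction, learning rate, stumps)


def fit_stump(x: Sequence[float], residuals: Sequence[float]) -> Stump:
    """Depth-1 regression tree minimising squared error; x needs two or more distinct values."""
    best = None
    best_sse = float("inf")
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if sse < best_sse:
            best_sse = sse
            best = (threshold, left_mean, right_mean)
    return best


def predict_stump(stump: Stump, xi: float) -> float:
    threshold, left_value, right_value = stump
    return left_value if xi <= threshold else right_value


def fit_boosting(x: Sequence[float], y: Sequence[float],
                 n_rounds: int = 50, learning_rate: float = 0.1) -> Model:
    """Start from the mean of y and repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, predictions)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        predictions = [pi + learning_rate * predict_stump(stump, xi)
                       for xi, pi in zip(x, predictions)]
    return base, learning_rate, stumps


def predict_boosting(model: Model, xi: float) -> float:
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, xi) for s in stumps)
```

Each round shrinks the remaining residual by a factor set by the learning rate, which is why many small corrections accumulate into an accurate fit.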

Application of LSTM and CONV1D LSTM Network in Stock Forecasting Model

Artificial Intelligence Advances ◽

10.30564/aia.v3i1.2790 ◽

2021 ◽

Vol 3 (1) ◽

Author(s):

Qiaoyu Wang ◽

Kai Kang ◽

Zhihan Zhang ◽

Demou Cao

Keyword(s):

Machine Learning ◽

Stock Market ◽

Financial Market ◽

Stock Price ◽

Short Term Memory ◽

Polynomial Regression ◽

Time Series Prediction ◽

Machine Learning Algorithms ◽

Trend Prediction ◽

Lstm Network

Predicting the direction of the stock market has always been a huge challenge. Forecasting the stock market also reduces risk in the financial market, helping to ensure that brokers can make normal returns. Despite the complexities of the stock market, the challenge has been increasingly addressed by experts in a variety of disciplines, including economics, statistics, and computer science. With the introduction of machine learning and a deeper understanding of the prospects of the financial market, many experiments have been carried out to predict future stock price trends, with varying degrees of success. In this paper, we propose a method to predict stocks from different industries and markets, as well as trend prediction using traditional machine learning algorithms such as linear regression and polynomial regression, and time series prediction techniques using two special types of recurrent neural network: long short-term memory (LSTM) and Conv1D-LSTM.

Integrating water quality and streamflow into prediction of chemical dosage in a drinking water treatment plant using machine learning algorithms

Water Science & Technology Water Supply ◽

10.2166/ws.2021.435 ◽

2021 ◽

Author(s):

Hui Wang ◽

Tirusew Asefa ◽

Jack Thornburgh

Keyword(s):

Machine Learning ◽

Water Quality ◽

Drinking Water ◽

Water Treatment ◽

Mean Square Error ◽

Learning Algorithms ◽

Drinking Water Treatment ◽

Machine Learning Algorithms ◽

Support Vector ◽

Mean Square

Understanding the relationship between raw water quality and chemical dosage is especially important for drinking water treatment plants (DWTP) that have multiple water sources, where the ratio of different supply sources can change with the seasons or within a matter of weeks in response to changing hydrologic conditions. In this study, the potential of machine learning algorithms, including principal component regression (PCR), support vector regression (SVR), and long short-term memory (LSTM) neural networks, is tested for building predictive models. These tools were used to estimate chemical dosage at a daily time scale. Influent water quality parameters such as pH, color, turbidity, and alkalinity, as well as dosages of chemicals including sulfuric acid, ferric sulfate, and liquid oxygen, were used to build and test these models. An 80/20 percent data split was used for training and testing, and model performance was assessed using correlation coefficients, relative mean square error, relative root mean square error, and Nash-Sutcliffe efficiency. Results indicate that, compared to PCR, both SVR and LSTM were able to capture the nonlinear relationship between chemical dose and source water quality changes and displayed higher predictive skill. These types of models have application in real-time operational support without requiring computationally expensive physics-based models.
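
The evaluation setup described above can be sketched in Python as follows. The abstract does not say whether the 80/20 split is chronological or how the "relative" errors are normalized, so the time-ordered split and the division of RMSE by the mean observed value are assumptions; the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def split_80_20(records: Sequence, train_fraction: float = 0.8) -> Tuple[List, List]:
    """Time-ordered split: the first 80% of records for training, the rest for testing.

    Assumption: chronological order is preserved (no shuffling).
    """
    cut = int(len(records) * train_fraction)
    return list(records[:cut]), list(records[cut:])


def nash_sutcliffe_efficiency(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means no better than predicting the observed mean.
    """
    mean_obs = sum(observed) / len(observed)
    numerator = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    denominator = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - numerator / denominator


def relative_rmse(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """RMSE divided by the mean observed value (assumed normalization)."""
    n = len(observed)
    mse = sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n
    return math.sqrt(mse) / (sum(observed) / n)
```

An NSE close to 1 on the held-out 20% would indicate that a model tracks daily dosage well, while a value near 0 would mean it adds nothing over the long-run average dose.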

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.
