Data-Driven Trend Forecasting in Stock Market Using Machine Learning Techniques

2020 ◽  
Vol 13 (1) ◽  
pp. 130-149
Author(s):  
Puneet Misra ◽  
Siddharth Chaurasia

Stock market movements are affected by numerous factors, making them one of the most challenging forecasting problems. This article attempts to predict the direction of movement of stocks and stock indices. The study uses three classifiers (Artificial Neural Network, Random Forest and Support Vector Machine) with four different representations of the inputs. The first representation uses raw data (open, high, low, close and volume). The second uses ten features in the form of technical indicators generated through technical analysis. The third and fourth representations present two different ways of converting the indicator data into discrete trend data. Experimental results suggest that for raw data the support vector machine provides the best results. For the other representations, there is no clear winner among the models applied, but the representation of data by the proposed approach gave the best overall results for all the models and financial series. The consistency of the results highlights the importance of feature generation and the right representation of the dataset for machine learning techniques.
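
The abstract does not say how indicator values are turned into discrete trend data, so the following is only a minimal sketch of one plausible conversion. It assumes a simple moving average (SMA) as the indicator and the rule "close above SMA means up (+1), otherwise down (-1)". The function names, the window of 3 and the sample prices are all hypothetical.

```python
from typing import List, Optional


def simple_moving_average(closes: List[float], window: int) -> List[Optional[float]]:
    """Return the SMA for each day; None until `window` closes are available."""
    sma: List[Optional[float]] = []
    for i in range(len(closes)):
        if i + 1 < window:
            sma.append(None)
        else:
            sma.append(sum(closes[i + 1 - window : i + 1]) / window)
    return sma


def to_trend(closes: List[float], sma: List[Optional[float]]) -> List[Optional[int]]:
    """Map each (close, SMA) pair to +1 (up), -1 (down) or None (undefined)."""
    trend: List[Optional[int]] = []
    for close, avg in zip(closes, sma):
        if avg is None:
            trend.append(None)
        else:
            # Assumed rule: price above its average signals an upward trend.
            trend.append(1 if close > avg else -1)
    return trend


# Hypothetical closing prices, used only to exercise the functions.
closes = [10.0, 11.0, 12.0, 11.0, 9.0, 10.0]
sma = simple_moving_average(closes, window=3)
trend = to_trend(closes, sma)
print(trend)
```

A classifier would then be trained on columns of such +1/-1 values, one column per indicator, instead of on the continuous indicator values.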

Algorithms ◽  
2018 ◽  
Vol 11 (11) ◽  
pp. 170 ◽  
Author(s):  
Zhixi Li ◽  
Vincent Tam

Momentum and reversal effects are important phenomena in stock markets. In academia, relevant studies have been conducted for years. Researchers have attempted to analyze these phenomena using statistical methods and to give some plausible explanations. However, those explanations are sometimes unconvincing. Furthermore, it is very difficult to transfer the findings of these studies to real-world investment trading strategies due to the lack of predictive ability. This paper represents the first attempt to adopt machine learning techniques for investigating the momentum and reversal effects occurring in any stock market. In the study, various machine learning techniques, including the Decision Tree (DT), Support Vector Machine (SVM), Multilayer Perceptron Neural Network (MLP), and Long Short-Term Memory Neural Network (LSTM) were explored and compared carefully. Several models built on these machine learning approaches were used to predict the momentum or reversal effect on the stock market of mainland China, thus allowing investors to build corresponding trading strategies. The experimental results demonstrated that these machine learning approaches, especially the SVM, are beneficial for capturing the relevant momentum and reversal effects, and possibly building profitable trading strategies. Moreover, we propose the corresponding trading strategies in terms of market states to acquire the best investment returns.
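
The abstract does not define how a momentum or reversal effect is labelled for training. As a rough sketch only, one assumed labelling compares the sign of the return over a past formation period with the sign of the return over the following holding period. The function names, period lengths and prices below are hypothetical.

```python
from typing import List


def period_return(prices: List[float], start: int, end: int) -> float:
    """Simple return from prices[start] to prices[end]."""
    return prices[end] / prices[start] - 1.0


def label_effect(prices: List[float], t: int, formation: int, holding: int) -> str:
    """Label day t by comparing past and future returns.

    'momentum' if the holding-period return has the same sign as the
    formation-period return, otherwise 'reversal'. A zero return on
    either side falls into 'reversal' in this simplified rule.
    """
    past = period_return(prices, t - formation, t)
    future = period_return(prices, t, t + holding)
    return "momentum" if past * future > 0 else "reversal"


# Hypothetical price series, used only for illustration.
prices = [100.0, 102.0, 105.0, 107.0, 104.0, 101.0]
labels = [label_effect(prices, t, formation=2, holding=1) for t in range(2, 5)]
print(labels)
```

Labels of this kind could serve as the target for any of the classifiers named in the abstract, with features built only from data available up to day t.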


RSC Advances ◽  
2014 ◽  
Vol 4 (106) ◽  
pp. 61624-61630 ◽  
Author(s):  
N. S. Hari Narayana Moorthy ◽  
Silvia A. Martins ◽  
Sergio F. Sousa ◽  
Maria J. Ramos ◽  
Pedro A. Fernandes

Classification models to predict the solvation free energies of organic molecules were developed using decision tree, random forest and support vector machine approaches and with MACCS fingerprints, MOE and PaDEL descriptors.


Advances in cyber-attack technologies have ushered in various new attacks that are difficult to detect using traditional intrusion detection systems (IDS). Existing IDS are trained to detect known patterns, so newer attacks bypass them and go undetected. In this paper, a two-level framework is proposed that can be used to detect unknown new attacks using machine learning techniques. In the first level, the known classes of attacks are determined using supervised machine learning algorithms such as Support Vector Machine (SVM) and Neural Networks (NN). The second level uses unsupervised machine learning algorithms such as K-means. The experimentation is carried out with four models on the NSL-KDD dataset in an OpenStack cloud environment. The model with Support Vector Machine for supervised machine learning, Gradual Feature Reduction (GFR) for feature selection and K-means for the unsupervised algorithm provided the optimum efficiency of 94.56%.
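
The abstract does not describe how the two levels are connected. The sketch below assumes that samples the first level cannot assign to a known class are passed to K-means for grouping. To stay dependency-free it uses a nearest-centroid classifier with a distance cut-off as a stand-in for the SVM or neural network, so it is not the authors' model. All names, the 2-D feature vectors and the threshold are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def dist(a: Point, b: Point) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean(points: Sequence[Point]) -> Point:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def level_one(sample: Point, centroids: Dict[str, Point], max_dist: float) -> Optional[str]:
    """Stand-in supervised level: nearest known class, or None if too far from all."""
    label, d = min(((lbl, dist(sample, c)) for lbl, c in centroids.items()),
                   key=lambda pair: pair[1])
    return label if d <= max_dist else None


def kmeans(points: List[Point], k: int, iterations: int = 10) -> List[int]:
    """Plain K-means; the first k points are the initial centres (deterministic)."""
    centres = list(points[:k])
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda j: dist(p, centres[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centres[j] = mean(members)
    return assignment


# Hypothetical labelled training data for two known classes.
training = {
    "normal": [(0.0, 0.0), (0.4, 0.2), (0.2, 0.4)],
    "dos": [(5.0, 5.0), (5.2, 4.8), (4.8, 5.2)],
}
centroids = {label: mean(pts) for label, pts in training.items()}

# Hypothetical incoming traffic: two known samples and four unfamiliar ones.
traffic = [(0.2, 0.1), (10.0, 0.0), (5.1, 4.9), (0.0, 10.0), (10.2, 0.3), (0.1, 9.8)]
first_pass = [level_one(s, centroids, max_dist=2.0) for s in traffic]
unknown = [s for s, lbl in zip(traffic, first_pass) if lbl is None]
clusters = kmeans(unknown, k=2)
print(first_pass, clusters)
```

Under these assumptions, each resulting cluster is a candidate group of previously unseen attack traffic that an analyst could inspect and label.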


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Hua Liu ◽  
Hua Yuan ◽  
Yongmei Wang ◽  
Weiwei Huang ◽  
Hui Xue ◽  
...  

Abstract Accumulating studies appear to suggest that the risk factors for venous thromboembolism (VTE) among young-middle-aged inpatients are different from those among elderly people. Therefore, the current prediction models for VTE are not applicable to young-middle-aged inpatients. The aim of this study was to develop and externally validate a new prediction model for young-middle-aged people using machine learning methods. The clinical data sets linked with 167 inpatients with deep venous thrombosis (DVT) and/or pulmonary embolism (PE) and 406 patients without DVT or PE were compared and analysed with machine learning techniques. Five algorithms (logistic regression, decision tree, feed-forward neural network, support vector machine, and random forest) were used for training and preparing the models. The support vector machine model had the best performance, with a 95% CI for the AUC of 0.806–0.944, 59% sensitivity, 99% specificity, and an accuracy of 87%. Although different top predictors of adverse outcomes appeared in the different models, life-threatening illness, fibrinogen, RBCs, and PT were featured more consistently across the models as top predictors of adverse outcomes. Clinical data sets of young and middle-aged inpatients can be used to accurately predict the risk of VTE with a support vector machine model.
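
Sensitivity, specificity and accuracy all derive from the four cells of a confusion matrix. The abstract does not report those cells, and the reported figures may come from a validation subset, so the counts in this sketch are hypothetical. They were chosen only so that they add up to the stated cohort sizes (167 with VTE, 406 without) and roughly reproduce the reported percentages.

```python
from typing import Dict


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Compute standard metrics from confusion-matrix counts."""
    return {
        # Share of true VTE cases that the model flags.
        "sensitivity": tp / (tp + fn),
        # Share of non-VTE cases that the model correctly clears.
        "specificity": tn / (tn + fp),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts: 99 + 68 = 167 positives, 402 + 4 = 406 negatives.
metrics = binary_metrics(tp=99, fn=68, tn=402, fp=4)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With a minority positive class, accuracy is dominated by specificity. Here a model that misses about four in ten VTE cases still reaches about 87% accuracy, which is why sensitivity is worth reading alongside it.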


2018 ◽  
Vol 7 (2.32) ◽  
pp. 201
Author(s):  
G Krishna Mohan ◽  
N Yoshitha ◽  
M L.N.Lavanya ◽  
A Krishna Priya

Software reliability models assess reliability by fault prediction. Reliability is a real-world phenomenon with many associated real-time problems, and a large number of soft computing techniques have been developed to solve these problems quickly, accurately and acceptably. We attempt to address software failure problems by modeling software failure data using machine learning techniques such as support vector machine (SVM) regression and generalized additive models. The study of software reliability can be categorized into three parts: modeling, measurement and improvement. Software reliability modeling has matured to the point that meaningful results can be obtained by applying suitable models to the problem; there is no single model universal to all situations. We propose different machine learning approaches for the assessment of software reliability, such as artificial neural networks and support vector machines. We then analyze the results from the machine learning modeling and compare them with those of some generalized linear modeling techniques that are equivalent to software reliability models.


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Sana Shokat ◽  
Rabia Riaz ◽  
Sanam Shahla Rizvi ◽  
Inayat Khan ◽  
Anand Paul

Advances in technology are changing the way visually impaired people read and write Braille. Learning Braille in one's native language can be more convenient for its users. This study proposes an improved backend processing algorithm for a previously developed touchscreen-based Braille text entry application. This application is used to collect Urdu Braille data, which is then converted to Urdu text. Braille-to-text conversion has been done for Hindi, Arabic, Bangla, Chinese, English, and other languages. For this study, Urdu Braille Grade 1 data were collected as a multiclass problem (39 characters of Urdu, represented by class 1, Alif (ﺍ), to class 39, Bri Yay (ے)). A total of N = 144 cases were collected for each class. The dataset was collected from visually impaired students from The National Special Education School. Visually impaired users entered the Urdu Braille alphabet using touchscreen devices. The final dataset contained N = 5638 cases. A Reconstruction Independent Component Analysis (RICA)-based feature extraction model is created for Braille to Urdu text classification. The classes were divided into three groups of 13 each, i.e., category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), category-2 (14–26), Ray-Fay (ﻒ - ﺮ), and category-3 (27–39), Kaaf-Bri Yay (ے - ﻕ), to make the results easier to view and understand. The performance was evaluated in terms of true positive rate, true negative rate, positive predictive value, negative predictive value, false positive rate, total accuracy, and area under the receiver operating characteristic curve. Among all the classifiers, support vector machine achieved the highest performance with 99.73% accuracy. For comparison, robust machine learning techniques such as support vector machine, decision tree, and K-nearest neighbors were used. Currently, this work covers only Grade 1 Urdu Braille. In the future, we plan to extend this work to Grade 2 Urdu Braille with text and speech feedback on touchscreen-based Android phones.
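
For a multiclass problem such as this one, per-class rates are usually obtained by treating one class as positive and all the others as negative (one-vs-rest). The sketch below shows that reduction, plus the mapping of the 39 classes onto the three categories of 13 described above. The 3x3 confusion matrix is hypothetical and far smaller than the real 39-class one, and the function names are illustrative.

```python
from typing import Dict, List


def category(cls: int) -> int:
    """Map class 1..39 to category 1..3 (13 classes each)."""
    return (cls - 1) // 13 + 1


def one_vs_rest(confusion: List[List[int]], k: int) -> Dict[str, float]:
    """Per-class rates, where confusion[i][j] counts true class i predicted as j."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                  # class k predicted as something else
    fp = sum(row[k] for row in confusion) - tp   # other classes predicted as k
    tn = total - tp - fn - fp
    return {
        "TPR": tp / (tp + fn),
        "TNR": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "FPR": fp / (fp + tn),
    }


# Hypothetical confusion matrix for three classes (rows: true, columns: predicted).
confusion = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]
rates = one_vs_rest(confusion, 0)
print(rates)
print(category(13), category(14), category(39))
```

Averaging these per-class rates over the classes in a category gives one summary row per category.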


2021 ◽  
Author(s):  
Roobaea Alroobaea ◽  
Seifeddine Mechti ◽  
Mariem Haoues ◽  
Saeed Rubaiee ◽  
Anas Ahmed ◽  
...  

Abstract Alzheimer's disease is the main cause of dementia, which frequently affects older adults. The disease is costly, especially in terms of treatment. In addition, Alzheimer's is one of the causes of death among elderly citizens. Early detection of Alzheimer's helps medical staff diagnose the disease, which will certainly decrease the risk of death. This has made early detection of Alzheimer's disease a crucial problem in the healthcare industry. The objective of this research study is to introduce a computer-aided diagnosis system for Alzheimer's disease detection using machine learning techniques. We employed data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Open Access Series of Imaging Studies (OASIS) brain datasets. Common supervised machine learning techniques have been applied for automatic Alzheimer's disease detection, such as logistic regression, support vector machine, random forest and linear discriminant analysis, among others. The best accuracy values provided by the machine learning classifiers are 99.43% and 99.10%, given respectively by logistic regression and support vector machine on the ADNI dataset, whereas for the OASIS dataset we obtained 84.33% and 83.92%, given respectively by logistic regression and random forest.


2021 ◽  
Vol 12 (3) ◽  
pp. 1738-1744
Author(s):  
Shahzad Qaiser et al.

The availability of data has increased tremendously due to the extensive use of social media platforms like Twitter and Facebook. Due to this abundance of data, scientists, businesses, educationalists and other people working in different roles have started using Sentiment Analysis (SA) to gain in-depth knowledge about people's sentiments regarding any topic of interest. There are many techniques to implement SA, and one of them is Machine Learning (ML). This study focuses on the comparison of traditional ML methods such as Naïve Bayes (NB), Decision Tree (DT) and Support Vector Machine (SVM) with a modern method, i.e., Deep Learning (DL). The ML techniques are applied to a single dataset to compare their performance in terms of accuracy. The study found that DL performed best with 96.41% accuracy, followed by NB and SVM with 87.18% and 82.05% respectively. DT performed the poorest with 68.21% accuracy.
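
To make one of the compared methods concrete, here is a minimal sketch of a multinomial Naïve Bayes sentiment classifier with Laplace (add-one) smoothing. It is written from the textbook definition and is not the study's implementation. The four training sentences, the labels and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Model = Tuple[Counter, Dict[str, Counter], Set[str]]


def train(docs: List[Tuple[str, str]]) -> Model:
    """Count documents per label and word occurrences per label."""
    label_counts = Counter(label for _, label in docs)
    word_counts: Dict[str, Counter] = {label: Counter() for label in label_counts}
    for text, label in docs:
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocab


def predict(text: str, model: Model) -> str:
    """Return the label with the highest log-posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = "", -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set.
docs = [
    ("great phone love it", "pos"),
    ("love the great camera", "pos"),
    ("terrible battery hate it", "neg"),
    ("hate the terrible screen", "neg"),
]
model = train(docs)
predictions = [
    predict("love this great screen", model),
    predict("terrible phone hate it", model),
]
print(predictions)
```

Accuracy, the comparison metric used in the study, is then the fraction of held-out texts for which `predict` returns the true label.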


2021 ◽  
Author(s):  
João Daniel S. Castro

Abstract SARS-CoV-2 (Covid-19) has spread rapidly throughout the world, especially in tropical countries already affected by outbreaks of arboviruses such as Dengue, Zika and Chikungunya, and may lead to the collapse of health systems in these locations. Thus, the present work aims to develop a methodology using a machine learning algorithm (Support Vector Machine) for the prediction and discrimination of patients affected by Covid-19 and arboviruses (DENV, ZIKV and CHIKV). Clinical data from 204 patients, covering both Covid-19 and arboviruses, obtained from 23 scientific articles and 1 dataset, were used. The developed model was able to predict 93.1% of Covid-19 cases and 82.1% of arbovirus cases, with an accuracy of 89.1% and an area under the ROC curve of 95.6%, proving to be effective in the prediction and possible screening of these patients, especially those affected by Covid-19, allowing early isolation.

