IntelliFin: Advanced Stock Prediction using Hybrid ML and LSTM Model with Financial Indicators powered by Sentiment Determination using NLP

Stock trading has been one of the most important parts of the financial world for decades. People investing in the share market analyze a corporation's financial history and the news related to it, and study huge amounts of data, in order to predict its stock price trend. The right investment, i.e., buying and selling a company's stock at the right time, leads to monetary benefits and can make one a millionaire overnight. The stock market is an extremely volatile platform in which data is produced in huge quantities and is influenced by numerous disparate factors such as socio-political issues, financial activities like splits and dividends, news, and rumors. This work proposes a novel system, “IntelliFin”, to predict the share market trend. The system uses various stock market technical indicators along with the company's historical market data trends to predict share prices. It also determines the sentiment of a company's financial and socio-political news for a more accurate prediction. The system is implemented using two models. The first is a hybrid LSTM model optimized with the Adam optimizer. The other is a hybrid ML model that integrates a Support Vector Regressor, a K-Nearest Neighbor classifier, an RF classifier, and a Linear Regressor using a Majority Voting algorithm. Both models employ an NLP-powered sentiment analyzer to account for the news impacting stock prices. The models are trained continuously using Reinforcement Learning, implemented with the Q-Learning algorithm, to increase consistency and accuracy. The project aims to support inexperienced investors in the stock market and help them maximize their profit and minimize or eliminate losses. The developed system will also serve as a tool to aid professional investors in their decision making.
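
As a rough illustration of the majority voting step mentioned in the abstract above, the sketch below combines trend votes from several models. The function name `majority_vote`, the "up"/"down" labels, and the tie-breaking rule are assumptions made for this example; the abstract does not say how the regressor outputs are turned into votes or how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of models.

    Ties are broken in favour of the label that appears first in
    `predictions` (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    # Scan in input order so the tie-break is deterministic.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical votes from four models, each already reduced
# to an "up" or "down" trend label.
votes = ["up", "down", "up", "up"]
print(majority_vote(votes))
```

With an even number of voters, as in the four-model hybrid described above, ties are possible, so any real system would need an explicit rule for them.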

Efficient detection of hacker community based on twitter data using complex networks and machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210458 ◽

2021 ◽

pp. 1-17

Author(s):

Ahmed Al-Tarawneh ◽

Ja’afer Al-Saraireh

Keyword(s):

Machine Learning ◽

Complex Networks ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

Efficient Detection ◽

Suggested Keywords

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine-learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, so they are limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection on the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users, together with their followers who share their posts and have similar interests, from a hackers’ community on Twitter. The list is built from a set of suggested keywords that are commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is then used in a machine learning process by applying different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split, respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a risk to institutions and individuals in the future, providing early warning of possible attacks.
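
To make the centrality measures named in the abstract above more concrete, here is a minimal sketch that computes degree and closeness centrality on a tiny graph using breadth-first search. The graph, the node names, and the choice to treat follower links as undirected are all assumptions for illustration; betweenness is omitted for brevity, and nothing here reproduces the paper's actual pipeline.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop counts from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def degree_centrality(graph, node):
    """Fraction of the other nodes that `node` is directly linked to."""
    return len(graph[node]) / (len(graph) - 1)


def closeness_centrality(graph, node):
    """(reachable nodes - 1) / (sum of distances); 0.0 if isolated."""
    dist = bfs_distances(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0


# Hypothetical follower graph, treated as undirected for simplicity.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a", "d"},
    "d": {"a", "c"},
}

# Rank users from most to least central by closeness.
ranking = sorted(graph, key=lambda n: closeness_centrality(graph, n), reverse=True)
print(ranking[0])
```

Ranking users by such scores is one plausible way to pick out the "most influential users" the abstract refers to.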

A feature weighted support vector machine and K-nearest neighbor algorithm for stock market indices prediction

Expert Systems with Applications ◽

10.1016/j.eswa.2017.02.044 ◽

2017 ◽

Vol 80 ◽

pp. 340-355 ◽

Cited By ~ 74

Author(s):

Yingjun Chen ◽

Yongtao Hao

Keyword(s):

Support Vector Machine ◽

Stock Market ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

Bacterial Immunogenicity Prediction by Machine Learning Methods

Vaccines ◽

10.3390/vaccines8040709 ◽

2020 ◽

Vol 8 (4) ◽

pp. 709

Author(s):

Ivan Dimitrov ◽

Nevena Zaharieva ◽

Irini Doytchinova

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Predictive Ability ◽

Initial Step ◽

Majority Voting ◽

Gradient Boosting ◽

Support Vector ◽

K Nearest Neighbor ◽

Test Set ◽

Extreme Gradient Boosting

The identification of protective immunogens is the most important and vigorous initial step in the long-lasting and expensive process of vaccine design and development. Machine learning (ML) methods are very effective in data mining and in the analysis of big data such as microbial proteomes. They are able to significantly reduce the experimental work for discovering novel vaccine candidates. Here, we applied six supervised ML methods (partial least squares-based discriminant analysis, k-nearest neighbor (kNN), random forest (RF), support vector machine (SVM), random subspace method (RSM), and extreme gradient boosting) to a set of 317 known bacterial immunogens and 317 bacterial non-immunogens and derived models for immunogenicity prediction. The models were validated by internal cross-validation in 10 groups from the training set and by the external test set. All of them showed good predictive ability, but the xgboost model displayed the most prominent ability to identify immunogens, recognizing 84% of the known immunogens in the test set. The combined RSM-kNN model was the best at recognizing non-immunogens, identifying 92% of them in the test set. The three best performing ML models (xgboost, RSM-kNN, and RF) were implemented in the new version of the server VaxiJen, and the prediction of bacterial immunogens is now based on majority voting.
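
The internal cross-validation "in 10 groups" mentioned above can be pictured with a small index-splitting sketch. The function name `k_fold_splits` and the sample count of 25 are assumptions for illustration; the folds here are contiguous and unshuffled, whereas a real study would typically shuffle or stratify the data first.

```python
def k_fold_splits(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Fold sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical training set of 25 samples split into 10 groups.
splits = list(k_fold_splits(25, 10))
for train, test in splits[:2]:
    print(len(train), len(test))
```

Each sample lands in exactly one test fold, so every model is evaluated once on data it was not trained on.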

A machine-learning approach to predict postprandial hypoglycemia

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-019-0943-4 ◽

2019 ◽

Vol 19 (1) ◽

Cited By ~ 7

Author(s):

Wonju Seo ◽

You-Bin Lee ◽

Seunghyun Lee ◽

Sang-Man Jin ◽

Sung-Min Park

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Characteristic Curve ◽

Artificial Pancreas ◽

Individual Performance ◽

Machine Learning Algorithms ◽

Prediction Algorithm ◽

Support Vector ◽

K Nearest Neighbor

Background: For an effective artificial pancreas (AP) system and improved therapeutic intervention with continuous glucose monitoring (CGM), accurately predicting the occurrence of hypoglycemia is very important. While many studies have reported successful algorithms for predicting nocturnal hypoglycemia, predicting postprandial hypoglycemia remains a challenge due to the extreme glucose fluctuations that occur around mealtimes. The goal of this study is to evaluate the feasibility of an easy-to-use, computationally efficient machine-learning algorithm to predict postprandial hypoglycemia with a unique feature set. Methods: We used retrospective CGM datasets of 104 people who had experienced at least one hypoglycemia alert value during a three-day CGM session. The algorithms were developed based on four machine learning models with a unique data-driven feature set: a random forest (RF), a support vector machine using a linear function or a radial basis function, a K-nearest neighbor, and a logistic regression. With 5-fold cross-subject validation, the average performance of each model was calculated to compare their individual performance. The area under the receiver operating characteristic curve (AUC) and the F1 score were used as the main criteria for evaluating performance. Results: In predicting a hypoglycemia alert value with a 30-min prediction horizon, the RF model showed the best performance, with an average AUC of 0.966, an average sensitivity of 89.6%, an average specificity of 91.3%, and an average F1 score of 0.543. In addition, the RF model showed better predictive performance for postprandial hypoglycemic events than the other models. Conclusion: We showed that machine-learning algorithms have potential in predicting postprandial hypoglycemia, and the RF model could be a better candidate for further development of postprandial hypoglycemia prediction algorithms to advance CGM and AP technology.
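
The evaluation measures named above (sensitivity, specificity, F1) can all be derived from confusion-matrix counts, as the sketch below shows. The counts are invented for illustration and are not the study's data; they are chosen only to show that, when positive events are rare, sensitivity and specificity can both be high while the F1 score stays low.

```python
def classification_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, precision and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)   # share of true events that were caught
    specificity = tn / (tn + fp)   # share of non-events correctly passed
    precision = tp / (tp + fp)     # share of alarms that were real events
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical, heavily imbalanced counts: 10 events among 1000 samples.
metrics = classification_metrics(tp=9, fp=90, tn=900, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because F1 depends on precision, a modest false-positive rate applied to a large negative class is enough to pull it down.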

Comparative Analysis of K-NN and Naïve Bayes Methods to Predict Stock Prices

International Journal of Computer and Information System (IJCIS) ◽

10.29040/ijcis.v2i2.32 ◽

2021 ◽

Vol 2 (2) ◽

pp. 49-53

Author(s):

Budi Soepriyanto

Keyword(s):

Stock Prices ◽

Stock Price ◽

Nearest Neighbor ◽

Naive Bayes ◽

High Accuracy ◽

Naïve Bayes ◽

Share Price ◽

Classification Methods ◽

K Nearest Neighbor ◽

Bayes Methods

Abstract: Buying and selling shares is a widely performed transaction today, especially through the many online trading services available on the market. Trading shares profitably requires skill or knowledge. To help market participants predict whether shares bought now will be profitable in the future, this research predicts stock prices using two classification methods, K-Nearest Neighbor and Naïve Bayes. The stock price data cover one month at minute-level resolution, totalling 39065 records. The highest result was obtained with Naïve Bayes, with an accuracy of 69.38%, followed by the K-Nearest Neighbor method with K = 5 at 67.25%. Based on these results, it can be concluded that the K-Nearest Neighbor and Naïve Bayes methods do not yet achieve high accuracy for share price prediction, so they could be combined with other methods or used with other predictor variables.
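
For readers unfamiliar with the K-Nearest Neighbor method with K = 5 referred to above, a minimal sketch follows. The two numeric features, the "up"/"down" labels, and the sample points are hypothetical and do not come from the paper's dataset.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=5):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature samples with a price-direction label.
train_points = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15), (0.20, 0.20),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),
]
train_labels = ["down", "down", "down", "down", "up", "up", "up"]

print(knn_predict(train_points, train_labels, (0.25, 0.20)))
print(knn_predict(train_points, train_labels, (0.80, 0.80)))
```

An odd K with two classes avoids tied votes, which is one common reason for choosing a value such as 5.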

ANALYSIS OF MACHINE LEARNING METHODS FOR PREDICTIONS OF STOCK EXCHANGE SHARE PRICES

Scientific Journal of Astana IT University ◽

10.37943/aitu.2021.47.22.009 ◽

2021 ◽

pp. 94-100

Author(s):

V. Serbin ◽

U. Zhenisserov

Keyword(s):

Machine Learning ◽

Stock Market ◽

Financial Markets ◽

Stock Prices ◽

Stock Price ◽

Learning Algorithms ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Price Patterns

Since the stock market is one of the most important areas for investors, stock market price trend prediction remains a hot subject for researchers in both financial and technical fields. Lately, a lot of work has been done on machine learning algorithms for analyzing price patterns and predicting stock prices and index changes, and machine-learning methods are currently receiving a lot of attention for predicting prices in financial markets. The main goal of the current research is to improve and develop a system for predicting future prices in financial markets with higher accuracy using machine-learning methods. Precisely predicting stock market returns is a very difficult task due to the volatile and non-linear nature of financial stock markets. With the advent of artificial intelligence and machine learning, forecasting methods have become more effective at predicting stock prices. In this article, we look at the machine learning techniques that have been used in stock trading to predict price changes before an actual rise or fall in the stock price occurs. In particular, the article discusses in detail the use of support vector machines, linear regression, prediction using decision stumps, and classification using the nearest neighbor algorithm, along with the advantages and disadvantages of each method. The paper introduces parameters and variables that can be used to recognize stock price patterns that might be useful in future stock forecasting, and shows how boosting can be combined with other learning algorithms to improve the accuracy of such forecasting systems.
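
A decision stump, one of the techniques the abstract above mentions, is a decision tree with a single split, and it is commonly used as the weak learner in boosting. The sketch below fits one by exhaustive search over thresholds. The feature values, the +1/-1 labels, and the function names are assumptions for illustration, not the article's implementation.

```python
def stump_predict(value, threshold, polarity):
    """Predict `polarity` above the threshold and `-polarity` otherwise."""
    return polarity if value > threshold else -polarity


def fit_stump(values, labels):
    """Return (errors, threshold, polarity) with the fewest training errors.

    `labels` must be +1 or -1. Candidate thresholds are the observed values.
    """
    best = None
    for threshold in sorted(set(values)):
        for polarity in (1, -1):
            errors = sum(
                1
                for v, y in zip(values, labels)
                if stump_predict(v, threshold, polarity) != y
            )
            # Keep the first candidate that strictly improves on the best.
            if best is None or errors < best[0]:
                best = (errors, threshold, polarity)
    return best


# Hypothetical feature (e.g. a daily return) and next-day direction labels.
values = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
labels = [-1, -1, -1, 1, 1, -1]

errors, threshold, polarity = fit_stump(values, labels)
print(errors, threshold, polarity)
```

A single stump cannot separate this data perfectly, which is the situation boosting addresses by combining many such weak rules with different weights.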

Stock Market Prices Prediction using Random Forest and Extra Tree Regression

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c4314.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 1224-1228

Keyword(s):

Machine Learning ◽

Stock Market ◽

Stock Prices ◽

Stock Price ◽

Stock Exchange ◽

Research Area ◽

Support Vector ◽

Market Prices ◽

Machine Learning Approach ◽

Innovative Work

Prediction of stock prices is nowadays an active and interesting research area in the financial and academic sectors for understanding the scale of economies. There is no significant set of rules for estimating and predicting the scale of a share on the stock exchange. Many techniques exist, such as technical, fundamental, statistical, and time series analysis, that help with the prediction process, but none of these methods has proved to be a reliable and accurate tool for estimating stock exchange or share market scales. In this paper we attempt innovative work using a Machine Learning approach to predict or track the behaviour of the stock market Sensex. Linear Regression, Support Vector Regression, Decision Tree, Random Forest Regressor, and Extra Tree Regressor are the Machine Learning models implemented to predict stock prices and to describe the exchange of securities between buyers and sellers. We predicted the price of the stock based on the closing value and stock price. We compare the accuracy of each model, and the algorithm with the highest accuracy is finally considered the better algorithm for predicting stock prices. Because the share market is an uncertain domain, the conditions that will occur cannot be predicted with certainty; the main aim of this paper is to approach this task technically by applying Machine Learning algorithms to predict stock prices.
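
As a simple picture of the linear regression model listed above, the sketch below fits a straight line that predicts the next closing value from the current one using ordinary least squares. The price series is made up, and using only the previous close as the single input is an assumption for this example; the abstract does not specify the paper's exact feature set.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical closing prices; each close is used to predict the next one.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
xs, ys = closes[:-1], closes[1:]

slope, intercept = fit_line(xs, ys)
next_close = slope * closes[-1] + intercept
print(round(next_close, 2))
```

Real price series are far noisier than this toy sequence, which is why the abstract compares the linear model against tree-based regressors.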

Predicting Stock Exchange using Supervised Learning Algorithms

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a4144.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 4081-4090

Keyword(s):

Machine Learning ◽

Random Forest ◽

Stock Market ◽

Supervised Learning ◽

Nearest Neighbor ◽

Market Price ◽

Real Life ◽

Support Vector ◽

K Nearest Neighbor ◽

Future Value

The stock market price trend is one of the brightest areas in the fields of computer science, economics, finance, administration, etc. The stock market forecast is an attempt to determine the future value of the equity traded on a financial transaction with another financial system. The current work describes the prediction of a stock using Machine Learning. The adoption of machine learning and artificial intelligence techniques to predict stock prices is a growing trend. More and more researchers invest their time every day in devising techniques that can further improve the accuracy of stock prediction models. This paper is mainly concerned with finding the best model to predict the stock market value. While considering the various techniques and variables that can be taken into account, we identified five models based on supervised learning techniques, i.e., Support Vector Machine (SVM), Random Forest, K-Nearest Neighbor (KNN), Bernoulli Naïve Bayes. The empirical results show that SVC performs best for large datasets, while Random Forest and Naïve Bayes are best for small datasets. Successful stock prediction will be a great asset for stock market institutions and will provide real-life solutions to the problems that stock investors face.

Review on Techniques for Plant Leaf Classification and Recognition

Computers ◽

10.3390/computers8040077 ◽

2019 ◽

Vol 8 (4) ◽

pp. 77 ◽

Cited By ~ 8

Author(s):

Muhammad Azfar Firdaus Azlah ◽

Lee Suan Chua ◽

Fakhrul Razan Rahmad ◽

Farah Izana Abdullah ◽

Sharifah Rafidah Wan Alwi

Keyword(s):

Neural Network ◽

Machine Learning ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Probabilistic Neural Network ◽

Machine Learning Algorithms ◽

Support Vector ◽

Plant Systematics ◽

K Nearest Neighbor ◽

Plant Leaf

In plant systematics, plants can be classified and recognized based on their reproductive system (flowers) and leaf morphology. Neural networks are among the most popular machine learning algorithms for plant leaf classification. The commonly used neural networks are artificial neural network (ANN), probabilistic neural network (PNN), convolutional neural network (CNN), k-nearest neighbor (KNN), and support vector machine (SVM), and some studies even used combined techniques to improve accuracy. The use of several varying preprocessing techniques and characteristic parameters in feature extraction appeared to improve the performance of plant leaf classification. The findings of previous studies are critically compared in terms of their accuracy based on the applied neural network techniques. This paper aims to review and analyze the implementation and performance of various methodologies for plant classification. Each technique has its advantages and limitations in leaf pattern recognition. The quality of leaf images plays an important role, and therefore a reliable leaf database must be used to establish the machine learning algorithm prior to leaf recognition and validation.

Detecting Types of Sleep Apneas Through Nonlinear Features of Electrocardiogram and Electroencephalogram Signals

Frontiers in Biomedical Technologies ◽

10.18502/fbt.v8i3.7115 ◽

2021 ◽

Author(s):

Shaghayegh Saghafi ◽

Fereidoun Nowshiravan Rahatabad ◽

Keivan Maghooli

Keyword(s):

Sleep Apnea ◽

Nearest Neighbor ◽

Principal Component ◽

Majority Voting ◽

Support Vector ◽

Svm Classifier ◽

Common Disease ◽

K Nearest Neighbor ◽

Linear Discriminant ◽

Nonlinear Features

Purpose: Sleep apnea is a common disease among women and, mainly, men. The most dangerous complication of this disorder is heart stroke. Other complications include insufficient sleep and the resulting daytime tiredness and illness, which affect the individual's activities during the day and disrupt their life. Therefore, identifying this disease is important. Materials and Methods: We used Electroencephalogram (EEG) and Electrocardiogram (ECG) channels from the data of 25 patients with sleep apnea. For each type of sleep apnea, 8 nonlinear-like features were extracted: fractal dimension, correlation dimension, certainty, recurrence rate, mean diagonal lines, the entropy of recurrence quantification analysis, sample entropy, and Shannon entropy. Then, feature matrices were sorted using principal component analysis in the order of linear combination of features, and 20 features were selected, normalized using common methods, and fed to different classifiers. Two classification schemes, 5-class and 2-class, were assessed. In the 5-class scheme, three classifiers were used: the support vector machine, k-nearest neighbor, and multilayer perceptron. Results: The results showed that the highest mean validity, accuracy, sensitivity, and specificity for the SVM classifier were 88.45%, 88.35%, 88.33%, and 88.32%, respectively. In the 2-class approach, in addition to the mentioned classifiers, linear discriminant analysis, Bayes, and majority voting were used, and each class was considered against all other classes. The highest average validity, accuracy, sensitivity, and specificity using majority rule voting were 94.35%, 94.30%, 94.32%, and 94.15%, respectively. Conclusion: When the results of classifiers are combined with the majority voting method, the validity of identifying the classes increases. The average validity for this method was 94.42%, which was higher than in several other studies. It is recommended that databases with a larger sample size be used, which would increase the reliability of the proposed analysis method. Moreover, using novel deep-learning-based methods could help obtain better results.
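
Of the nonlinear features listed above, Shannon entropy is the simplest to illustrate. The sketch below quantises a signal into equal-width amplitude bins and computes the entropy of the resulting histogram in bits. The bin count, the toy signal, and the equal-width binning scheme are assumptions for illustration; the abstract does not describe how the authors estimated entropy from the EEG and ECG channels.

```python
import math
from collections import Counter


def quantise(signal, n_bins):
    """Map each sample to an equal-width bin index between min and max."""
    lo, hi = min(signal), max(signal)
    if hi == lo:
        return [0] * len(signal)
    width = (hi - lo) / n_bins
    # The maximum sample would fall just past the last bin, so clamp it.
    return [min(int((x - lo) / width), n_bins - 1) for x in signal]


def shannon_entropy(symbols):
    """Shannon entropy in bits of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical short signal segment.
signal = [0.0, 1.0, 2.0, 3.0]
print(shannon_entropy(quantise(signal, n_bins=2)))
```

A constant signal gives zero entropy, while a signal spread evenly over the bins gives the maximum, log2 of the number of bins.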
