Real-Time Sentiment Analysis of Twitter Streaming data for Stock Prediction

Machine Learning Models for Stock Prediction Using Real-Time Streaming Data

Learning and Analytics in Intelligent Systems - Biologically Inspired Techniques in Many-Criteria Decision Making ◽

10.1007/978-3-030-39033-4_10 ◽

2020 ◽

pp. 101-108

Author(s):

Monalisa Jena ◽

Ranjan Kumar Behera ◽

Santanu Kumar Rath

Keyword(s):

Machine Learning ◽

Real Time ◽

Streaming Data ◽

Learning Models ◽

Stock Prediction ◽

Machine Learning Models

Predicting Coronavirus Pandemic in Real-Time Using Machine Learning and Big Data Streaming System

Complexity ◽

10.1155/2020/6688912 ◽

2020 ◽

Vol 2020 ◽

pp. 1-10

Author(s):

Xiongwei Zhang ◽

Hager Saleh ◽

Eman M. G. Younis ◽

Radhya Sahal ◽

Abdelmgeid A. Ali

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Sentiment Analysis ◽

Real Time ◽

Machine Learning Algorithms ◽

Streaming Data ◽

Support Vector ◽

Real Time System ◽

Online Prediction ◽

Analysis Prediction

Twitter is a virtual social network where people share their posts and opinions about current events, such as the coronavirus pandemic. It is considered the most significant streaming data source for machine learning research in terms of analysis, prediction, knowledge extraction, and opinions. Sentiment analysis is a text analysis method that has gained further significance with the emergence of social networks. This paper therefore introduces a real-time system for sentiment prediction on Twitter streaming data for tweets about the coronavirus pandemic. The proposed system aims to find the machine learning model that achieves the best performance for coronavirus sentiment prediction and then to use it in real time. The system has two components: an offline sentiment analysis component and an online prediction pipeline. For the offline component, a dataset of historical tweets was collected between 23/01/2020 and 01/06/2020 and filtered by the #COVID-19 and #Coronavirus hashtags. Two feature extraction methods for textual data, n-gram and TF-IDF, were used to extract the dataset’s essential features. Then, five standard machine learning algorithms were applied and compared (decision tree, logistic regression, k-nearest neighbors, random forest, and support vector machine) to select the best model for the online prediction component. The online prediction pipeline was developed using the Twitter Streaming API, Apache Kafka, and Apache Spark. The experimental results indicate that the random forest (RF) model using the unigram feature extraction method achieved the best performance, and it is used for sentiment prediction on Twitter streaming data about the coronavirus.
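The n-gram and TF-IDF feature extraction mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ngrams` and `tfidf`, the whitespace tokenizer, the unsmoothed IDF formula, and the toy tweets are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=1):
    """Compute TF-IDF weights per document for n-gram features.

    tf  = occurrences of the term in the document / terms in the document
    idf = log(number of documents / documents containing the term)
    (Assumed textbook variant; libraries often add smoothing.)
    """
    tokenized = [ngrams(doc.lower().split(), n) for doc in docs]
    doc_freq = Counter()
    for terms in tokenized:
        doc_freq.update(set(terms))  # count each term once per document
    weights = []
    for terms in tokenized:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for short documents
        weights.append({
            term: (count / total) * math.log(len(docs) / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy tweets, not taken from the paper's dataset.
tweets = [
    "stay home stay safe",
    "vaccine news today",
    "stay positive today",
]
unigram_weights = tfidf(tweets, n=1)
bigram_weights = tfidf(tweets, n=2)
```

The resulting per-document weight dictionaries are the kind of feature vectors that a classifier such as a random forest would then be trained on; that training step is omitted here.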

Comparative Study of Real Time Machine Learning Models for Stock Prediction through Streaming Data

JUCS - Journal of Universal Computer Science ◽

10.3897/jucs.2020.059 ◽

2020 ◽

Vol 26 (9) ◽

pp. 1128-1147

Author(s):

Ranjan Behera ◽

Sushree Das ◽

Santanu Rath ◽

Sanjay Misra ◽

Robertas Damasevicius

Keyword(s):

Machine Learning ◽

Real Time ◽

Historical Data ◽

Streaming Data ◽

Support Vector ◽

Learning Models ◽

Stock Prediction ◽

The Real ◽

Lambda Architecture ◽

Machine Learning Models

Stock prediction is one of the emerging applications in the field of data science, helping companies to make better strategic decisions. Machine learning models play a vital role in prediction. In this paper, we propose various machine learning models that predict stock prices from real-time streaming data. Streaming data is a potential source for real-time prediction; it is a continuous flow of data carrying information from various sources such as social networking websites, server logs, mobile phone applications, and trading floors. We adopt the distributed platform Spark to analyze streaming data collected from two different sources, presented as two case studies in this paper. The first case study is based on stock prediction from historical data collected from Google finance websites through NodeJs, and the second is based on sentiment analysis of Twitter data collected through the Twitter API available in the Stanford NLP package. Several studies have developed models for stock prediction based on static data. In this work, an effort has been made to develop scalable, fault-tolerant models for stock prediction from real-time streaming data. The proposed model is based on a distributed architecture known as the Lambda architecture. An extensive comparison is made between the actual and predicted output of the different machine learning models. Support vector regression is found to have better accuracy than the other models. The historical data is used as the ground truth for validation.

Opinion Mining with Real Time Ontology Streaming Data

International Journal of Psychosocial Rehabilitation ◽

10.37200/ijpr/v23i1/pr190244 ◽

2019 ◽

Vol 23 (1) ◽

pp. 346-357

Author(s):

Vithya G ◽

Naren J ◽

Varun V

Keyword(s):

Real Time ◽

Opinion Mining ◽

Streaming Data

DAViS: a unified solution for data collection, analyzation, and visualization in real-time stock market prediction

Financial Innovation ◽

10.1186/s40854-021-00269-7 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Suppawong Tuarob ◽

Poom Wettayakorn ◽

Ponpat Phetchai ◽

Siripong Traivijitkhun ◽

Sunghoon Lim ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Data Collection ◽

Stock Market ◽

Real Time ◽

Stock Prices ◽

Contextual Information ◽

Online News ◽

Stock Prediction ◽

Stock Market Prediction

The explosion of online information with the recent advent of digital technology in information processing, information storing, information sharing, natural language processing, and text mining techniques has enabled stock investors to uncover market movement and volatility from heterogeneous content. For example, a typical stock market investor reads the news, explores market sentiment, and analyzes technical details in order to make a sound decision prior to purchasing or selling a particular company’s stock. However, capturing a dynamic stock market trend is challenging owing to high fluctuation and the non-stationary nature of the stock market. Although existing studies have attempted to enhance stock prediction, few have provided a complete decision-support system for investors to retrieve real-time data from multiple sources and extract insightful information for sound decision-making. To address the above challenge, we propose a unified solution for data collection, analysis, and visualization in real-time stock market prediction to retrieve and process relevant financial data from news articles, social media, and company technical information. We aim to provide not only useful information for stock investors but also meaningful visualization that enables investors to effectively interpret storyline events affecting stock prices. Specifically, we utilize an ensemble stacking of diversified machine-learning-based estimators and innovative contextual feature engineering to predict the next day’s stock prices. Experiment results show that our proposed stock forecasting method outperforms a traditional baseline with an average mean absolute percentage error of 0.93. Our findings confirm that leveraging an ensemble scheme of machine learning methods with contextual information improves stock prediction performance. Finally, our study could be further extended to a wide variety of innovative financial applications that seek to incorporate external insight from contextual information such as large-scale online news articles and social media data.
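The mean absolute percentage error (MAPE) quoted in this abstract is a standard forecasting metric. A minimal sketch of how it is computed follows. The function name `mape`, the choice to express the result in percent, and the toy prices are assumptions for illustration; the abstract does not state the unit of its reported value or give the underlying data.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent.

    MAPE = 100 / n * sum(|actual_i - predicted_i| / |actual_i|)
    Undefined when any actual value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)


# Hypothetical closing prices and next-day forecasts, for illustration only.
closing_prices = [100.0, 102.0, 101.0]
forecasts = [101.0, 101.0, 102.0]
error = mape(closing_prices, forecasts)
```

Because each error is divided by the actual price, MAPE is scale-free, which makes it convenient for comparing forecasts across stocks trading at very different price levels.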

Stream Data Load Prediction for Resource Scaling Using Online Support Vector Regression

Algorithms ◽

10.3390/a12020037 ◽

2019 ◽

Vol 12 (2) ◽

pp. 37 ◽

Cited By ~ 3

Author(s):

Zhigang Hu ◽

Hui Kang ◽

Meiguang Zheng

Keyword(s):

Support Vector Regression ◽

Real Time ◽

Virtual Machines ◽

Time Window ◽

Performance Model ◽

Streaming Data ◽

Support Vector ◽

Load Prediction ◽

Stream Data ◽

Online Support Vector Regression

A distributed data stream processing system handles real-time, changeable, and sudden streaming data loads. Elastic resource allocation in such a system has become a fundamental and challenging problem, because a fixed strategy results in either wasted resources or reduced QoS (quality of service). Spark Streaming is an emerging system developed for real-time stream data analytics using a micro-batch approach. In this paper, first, we propose an improved SVR (support vector regression) based stream data load prediction scheme. Then, we design a Spark-based maximum sustainable throughput of time window (MSTW) performance model to find the optimized number of virtual machines. Finally, we present a resource scaling algorithm, TWRES (time window resource elasticity scaling algorithm), with the MSTW constraint and streaming data load prediction. The evaluation results show that TWRES can improve resource utilization and mitigate SLA (service level agreement) violations.

Study of data locality for real-time biomedical signal processing of streaming data on Cell Broadband Engine

Proceedings of the IEEE SoutheastCon 2010 (SoutheastCon) ◽

10.1109/secon.2010.5453907 ◽

2010 ◽

Cited By ~ 3

Author(s):

Ashish Panday ◽

Bharat Joshi ◽

Arun Ravindran ◽

Jongho Byun ◽

Hitten Zaveri

Keyword(s):

Signal Processing ◽

Real Time ◽

Data Locality ◽

Streaming Data ◽

Biomedical Signal Processing ◽

Biomedical Signal ◽

Cell Broadband Engine

Innovative Web Service Grammar for Streaming Computing in Real-Time Technology

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b7306.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 1589-1595

Keyword(s):

Real Time ◽

Research Work ◽

Parallel Flow ◽

Streaming Data ◽

Advanced Technologies ◽

Query Result ◽

Different Types ◽

Pros And Cons ◽

Event Flow ◽

Streaming Computing

The purpose of this work is to develop a UJSON web technology with a C# application to analyze student data in real time. It executes continuous queries on JSON streaming data using advanced technologies for parallel streaming computing, suitable for solving analytic problems and calculating metrics in real time. The management information system developed in this research is designed for filtering an event flow, building an event flow as a query result, grouping and aggregating events, and creating window semantics. To test the proposed work, several queries were selected that implement aggregation with different types of semantic windows (Steps, Slides). Testing was done locally and on education Moodle clusters. Four configurations were used, with 2, 4, 8, and 16 computing nodes. Based on the obtained results, scalability is noticeable as the number of nodes increases. The updated functions of the proposed UJSON could improve the construction of parallel flow systems and data processing. The developed approach is based on modern, advanced parallel flow technologies for output calculations and takes into account the pros and cons of the various current approaches.
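The windowed aggregation named in this abstract can be sketched as follows. This is an illustration only, not the UJSON system: it assumes that "Steps" corresponds to non-overlapping (tumbling) windows and "Slides" to overlapping (sliding) windows, that events are `(timestamp, value)` pairs with non-negative integer timestamps, and that the aggregate is an average. The function names are hypothetical.

```python
def tumbling_windows(events, size):
    """Average values over non-overlapping windows of `size` time units."""
    windows = {}
    for ts, value in events:
        start = (ts // size) * size  # window that this timestamp falls into
        windows.setdefault(start, []).append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(windows.items())}


def sliding_windows(events, size, slide):
    """Average values over windows [start, start + size) that advance by `slide`."""
    if not events:
        return {}
    last_ts = max(ts for ts, _ in events)
    result = {}
    start = 0
    while start <= last_ts:
        vals = [v for ts, v in events if start <= ts < start + size]
        if vals:  # skip windows that contain no events
            result[start] = sum(vals) / len(vals)
        start += slide
    return result


# Hypothetical event stream: (timestamp, value) pairs.
events = [(0, 10), (1, 20), (2, 30), (3, 40), (5, 60)]
tumbling = tumbling_windows(events, size=2)
sliding = sliding_windows(events, size=2, slide=1)
```

With a tumbling window each event contributes to exactly one aggregate, while with a sliding window an event can contribute to several overlapping aggregates, which smooths the output at the cost of more computation per event.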

Real time Sentiment Analysis of Tweets using Apache Spark and Scala

ACS Journal for Science and Engineering ◽

10.34293/acsjse.v1i2.9 ◽

2021 ◽

Vol 1 (2) ◽

pp. 9-15

Author(s):

V Mareeswari ◽

Sunita S Patil ◽

Ramanan G

Keyword(s):

Sentiment Analysis ◽

Real Time ◽

Ad Hoc ◽

Apache Spark ◽

Data Streaming ◽

Real Time Processing ◽

Open Source Data ◽

Textual Data ◽

Bayes Algorithm ◽

Processing Platform

Sentiment analysis is increasingly a field of focus, since user experience carries great weight both for business growth and for research. Sentimental expressions refer to a person's emotions or feelings about a particular topic or issue. In this project, sentiment evaluation is performed on tweets from Twitter with the help of the Apache Spark framework, an open-source data streaming and processing platform, using both real-time processing and an ad hoc run. The textual data is preprocessed to improve feature extraction, resulting in greater accuracy. The approach is validated by comparing its results with those of other processes when the Naive Bayes algorithm is used.
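The Naive Bayes algorithm referred to in this abstract can be illustrated with a small multinomial classifier. This is a sketch under stated assumptions, not the paper's Spark and Scala implementation: the function names `train_naive_bayes` and `predict`, the whitespace tokenizer, the add-one smoothing, and the toy training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_texts):
    """Collect per-label word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labeled_texts:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled examples, for illustration only.
training = [
    ("great product love it", "positive"),
    ("love this great service", "positive"),
    ("terrible experience hate it", "negative"),
    ("hate this awful service", "negative"),
]
model = train_naive_bayes(training)
label = predict(model, "love it")
```

In a streaming setting the model would typically be trained offline and `predict` applied to each incoming tweet after the same preprocessing used in training.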

Sentiment Analysis on Weibo Platform for Stock Prediction

Communications in Computer and Information Science - Artificial Intelligence and Security ◽

10.1007/978-981-15-8083-3_29 ◽

2020 ◽

pp. 323-333

Author(s):

Wanting Zhao ◽

Fan Wu ◽

Zhongqi Fu ◽

Zesen Wang ◽

Xiaoqi Zhang

Keyword(s):

Sentiment Analysis ◽

Stock Prediction
