Forecasting Oil Price Trends with Sentiment of Online News Articles

2017 ◽  
Vol 34 (02) ◽  
pp. 1740019
Author(s):  
Jian Li ◽  
Zhenjing Xu ◽  
Huijuan Xu ◽  
Ling Tang ◽  
Lean Yu

With the rapid development of the Internet and big data technologies, a wealth of online data (including news releases) can facilitate forecasting oil price trends. Accordingly, this study introduces sentiment analysis, a useful big data analysis tool, to extract the relevant information from online news articles and formulates a sentiment-based oil price trend prediction method. The proposed method comprises three main steps: sentiment analysis, relationship investigation and trend prediction. In sentiment analysis, the sentiment (or tone) is extracted using a dictionary-based approach to capture the relevant online information concerning oil markets and their driving factors. In relationship investigation, Granger causality analysis is conducted to explore whether and how the sentiment affects the oil price. In trend prediction, the sentiment is used as an important independent variable, and several popular forecasting models, e.g., logistic regression, support vector machine, decision tree and back propagation neural network, are applied. With crude oil futures prices of the West Texas Intermediate (WTI) and news articles from Thomson Reuters as study samples, the empirical results statistically support the strong predictive power of sentiment for oil price trends and hence the effectiveness of the proposed method.
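
The dictionary-based sentiment step described above can be illustrated with a minimal sketch. The word lists and the scoring formula (positive hits minus negative hits, divided by total hits) are assumptions chosen for illustration; the abstract does not state which dictionary or formula the study uses.

```python
import re

# Hypothetical mini-lexicon for illustration only; the study's actual
# dictionary is not given in the abstract.
POSITIVE = {"gain", "rise", "surge", "recovery", "growth"}
NEGATIVE = {"fall", "drop", "glut", "crisis", "decline"}


def sentiment_score(text: str) -> float:
    """Return a tone score in [-1, 1]: (pos - neg) / (pos + neg)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    total = pos + neg
    # Articles with no lexicon hits are treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total


headlines = [
    "Oil prices surge as demand recovery continues",
    "Crude prices fall on supply glut",
    "OPEC meets in Vienna",
]
scores = [sentiment_score(h) for h in headlines]
```

A daily series of such scores could then serve as the sentiment variable in the causality test and the trend classifiers, although how the study aggregates scores over time is not specified here.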

2018 ◽  
Vol 34 (3) ◽  
pp. 569-581
Author(s):  
Sujata Rani ◽  
Parteek Kumar

This article presents an innovative approach to sentiment analysis (SA). The proposed system handles Romanized or abbreviated text and spelling variations when performing sentiment analysis. A training data set of 3,000 movie reviews and tweets was manually labeled by native speakers of Hindi into three classes, i.e. positive, negative, and neutral. The system uses the WEKA (Waikato Environment for Knowledge Analysis) tool to convert these string data into numerical matrices and applies three machine learning techniques, i.e. Naive Bayes (NB), J48, and support vector machine (SVM). The proposed system was tested on 100 movie reviews and tweets; SVM performed best among the classifiers, with an accuracy of 68% for movie reviews and 82% for tweets. The results of the proposed system are very promising and can be used in emerging applications such as SA of product reviews and social media analysis. Additionally, the proposed system can be used for other cultural or social benefits, such as predicting and countering riots.


2020 ◽  
Vol 17 (3) ◽  
pp. 39-55
Author(s):  
Chuanmin Mi ◽  
Xiaoyan Ruan ◽  
Lin Xiao

With the rapid development of information technology, microblog sentiment analysis (MSA) has become a popular research topic extensively examined in the literature. Microblogging messages are usually short and unstructured and contain little information, which creates a significant challenge for the application of traditional content-based methods. In this study, the authors propose a novel method, MSA-USSR, in which user similarity information and interaction-based social relations information are combined to build sentiment relationships between microblogging data. They use these microblog–microblog sentiment relations to train the sentiment polarity classifier. Two Sina-Weibo datasets were used to verify the proposed model. The experimental results show that the proposed method achieves better sentiment classification accuracy and F1-score than the content-based support vector machine (SVM) method and the state-of-the-art supervised model known as SANT.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Babacar Gaye ◽  
Dezheng Zhang ◽  
Aziguli Wulamu

With the rapid development of the Internet and of big data analysis technology, data mining has played a positive role in promoting industry and academia. Classification is an important problem in data mining. This paper explores the background and theory of support vector machines (SVM) in data mining classification algorithms and analyzes and summarizes the research status of various improved SVM methods. According to the scale and characteristics of the data, different solution spaces are selected, and the solution of the dual problem is transformed into the classification surface of the original space to improve the algorithm speed. Research Process. By incorporating fuzzy membership into multicore learning, it is found that the time complexity of the original problem is determined by the dimension and the time complexity of the dual problem is determined by the quantity of samples; since the dimension and the quantity together constitute the scale of the data, different solution spaces can be chosen according to the scale and characteristics of the data. The algorithm speed can be improved by transforming the solution of the dual problem into the classification surface of the original space. Conclusion. By improving the calculation rate of traditional machine learning algorithms, it is concluded that the accuracy of the fit between the predicted data and the actual values is as high as 98%, which allows traditional machine learning algorithms to meet the requirements of the big data era. The approach can be widely used in the context of big data.
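
For reference, the contrast between the original (primal) and dual problems can be made concrete with the standard linear soft-margin SVM. The formulation below is the textbook one and is an assumption about what the abstract refers to; the paper's fuzzy-membership variant is not reproduced here. With $n$ training samples $x_i \in \mathbb{R}^d$ and labels $y_i \in \{-1, +1\}$, the primal problem optimizes over $w \in \mathbb{R}^d$, so its size grows with the dimension $d$, whereas the dual optimizes over $\alpha \in \mathbb{R}^n$, so its size grows with the sample count $n$.

```latex
% Primal (original space): d + 1 + n variables, dominated by w in R^d
\min_{w,\, b,\, \xi} \quad \frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \xi_i
\quad \text{s.t.} \quad y_i \left( w^\top x_i + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0

% Dual: n variables, one per training sample
\max_{\alpha} \quad \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j \, y_i y_j \, x_i^\top x_j
\quad \text{s.t.} \quad 0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0

% Mapping a dual solution back to the separating surface in the original space
w = \sum_{i=1}^{n} \alpha_i \, y_i \, x_i
```

The last line is the step that turns a dual solution into a classification surface in the original space for the linear case.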


The main objective of this paper is to analyze reviews of e-commerce products in social media big data. The analysis gives online shoppers helpful results about product quality and gives businesses decision-making support by showing which products customers like and buy most. The work covers all features or opinion words available in tweets, such as capitalized words, sequences of repeated letters, emoji, slang words, exclamatory words, intensifiers, modifiers, conjunction words and negation words. Existing work has considered only two or three features when performing sentiment analysis with natural language processing (NLP) techniques. In the proposed work, familiar machine learning classification models, namely Multinomial Naïve Bayes, Support Vector Machine, Decision Tree Classifier, and Random Forest Classifier, are used for sentiment classification. The sentiment classification serves as a decision support system for both customers and businesses.


2020 ◽  
Vol 39 (4) ◽  
pp. 5635-5647
Author(s):  
Xueling Nie

Big data is characterized by rapid data flow, massive data scale, dynamic data systems, and varied data types, and its value has become increasingly apparent in improving innovation and entrepreneurship data analysis, trend prediction, and decision support. In this paper, the authors analyze economic function data and entrepreneurship based on machine learning. The support vector machine is very sensitive to the choice of parameters, and parameters obtained using a genetic algorithm greatly improve the accuracy of the model prediction. When the genetic algorithm is used to find parameters, the cross-validation (CV) method is used for verification. Big data technologies and platforms can provide strong data support for establishing entrepreneurship education, integrate various types of innovation and entrepreneurship data, and improve the quality of data collection. At the same time, big data mining and analysis can accurately determine market demand hotspots and innovation and entrepreneurship trends, and promote scientific planning of innovation and entrepreneurship strategies. The research results show that this research model can be applied to actual projects in the future and can help investors better understand changes in the market economy.
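
The cross-validation check used to score candidate parameters can be sketched as follows. This is a generic k-fold split written in plain Python, not the authors' implementation; `k_fold_indices`, `cross_val_score`, and the `evaluate` callback are hypothetical names introduced for illustration.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first n_samples % k folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        stop = start + base + (1 if fold < extra else 0)
        val_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, val_idx
        start = stop


def cross_val_score(evaluate, n_samples: int, k: int = 5) -> float:
    """Average a caller-supplied evaluate(train_idx, val_idx) over k folds."""
    scores = [evaluate(tr, va) for tr, va in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(10, 3))
```

In a genetic-algorithm search, `evaluate` would train a model with one candidate parameter set on `train_idx` and return its accuracy on `val_idx`, so that the averaged score acts as the fitness of that candidate.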


Author(s):  
Haifeng Hu ◽  
Junhui Zheng

With the rapid development of China's economy in recent years, student numbers have gradually expanded, which has led to many new problems, including shortfalls in the quality and quantity of teachers and insufficient teaching facilities. The assessment of teaching quality is one of the most important aspects of teaching management and receives attention from every university. It has therefore become a current focus of research on university teaching. At the same time, traditional methods of teaching quality assessment cannot cope with big data in the field of education. As a new technology, cloud computing provides broad scope for developing new models of hardware environment construction, software resource development, network teaching implementation and personal knowledge management. To deal effectively with the challenges of big data processing in the field of education, this paper proposes a GA-SVM teaching quality assessment algorithm based on MapReduce. First, through the design of a map function and a reduce function, the paper parallelizes the GA-SVM algorithm and the selection of its main parameters. Second, the paper uses a genetic algorithm to optimize the penalty coefficient and kernel parameters of the SVM, which addresses the difficulty of determining the SVM parameters. In addition, the sensitivity of the search is improved through a logarithmic transformation, which speeds up the convergence of the GA model. Finally, the parallel algorithm and the serial algorithm are compared on the Hadoop platform. The experimental results show that the MapReduce-based GA-SVM is suitable for teaching quality assessment in a big data environment.
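
The idea of searching SVM parameters in logarithmic space can be sketched with a small serial genetic algorithm. Everything below is an illustrative assumption rather than the paper's method: the search bounds, the selection, crossover and mutation operators, and in particular `toy_fitness`, which stands in for the cross-validated SVM accuracy that a real run would compute. The MapReduce parallelization is not shown.

```python
import math
import random

# Search ranges for log10(C) and log10(gamma); illustrative assumptions.
LOG_BOUNDS = [(-2.0, 4.0), (-5.0, 1.0)]


def decode(genes):
    """Map log10-encoded genes back to (C, gamma)."""
    return tuple(10.0 ** g for g in genes)


def toy_fitness(genes):
    """Hypothetical stand-in for cross-validated SVM accuracy.

    Peaks at C = 100 and gamma = 0.01, i.e. genes (2, -2).
    """
    return math.exp(-((genes[0] - 2.0) ** 2 + (genes[1] + 2.0) ** 2))


def clip(value, low, high):
    return max(low, min(high, value))


def run_ga(fitness, pop_size=20, generations=40, sigma=0.3, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(lo, hi) for lo, hi in LOG_BOUNDS] for _ in range(pop_size)
    ]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 2]  # truncation selection with elitism
        children = []
        while len(elite) + len(children) < pop_size:
            mother, father = rng.sample(elite, 2)
            # Uniform crossover followed by Gaussian mutation in log space.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [
                clip(g + rng.gauss(0.0, sigma), lo, hi)
                for g, (lo, hi) in zip(child, LOG_BOUNDS)
            ]
            children.append(child)
        population = elite + children
    best = max(population, key=fitness)
    return best, history


best_genes, history = run_ga(toy_fitness)
best_C, best_gamma = decode(best_genes)
```

Because the penalty coefficient and kernel parameter typically range over several orders of magnitude, a fixed mutation step in log space corresponds to a fixed relative change in the parameter, which is one plausible reading of how a logarithmic transformation improves search sensitivity.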


Author(s):  
Mohammed Ibrahim Al-mashhadani ◽  
Kilan M. Hussein ◽  
Enas Tariq Khudir ◽  
Muhammad Ilyas

Nowadays, sentiment analysis plays a vital role in many real-life applications for the automatic prediction of human activities, especially on online social networks (OSNs). Over the last decade, research on opinion mining and sentiment analysis has therefore grown along with the increasing volume of online reviews available on social media networks such as Facebook. Sentiment analysis is a research problem within the data mining domain. It is a text mining process used to determine subjective attitudes, such as sentiment, from written texts, and it has become a main research interest in natural language processing and data mining. The main task in sentiment analysis is to classify the sentiment or emotion that end users express in their texts on OSNs. A number of research methods have already been designed for sentiment analysis, and many factors, such as accuracy, efficiency and speed, are used to evaluate their effectiveness. The MapReduce framework from the big data domain has recently been used with many data mining methods to reduce execution time and improve efficiency. Sentiment analysis of Facebook OSN messages is very challenging compared with other sentiment analysis tasks because of the misspellings and slang words present in the Twitter dataset. In this paper, different recently presented solutions are discussed in detail. A new approach to sentiment analysis is then proposed, based on hybrid feature extraction methods and a multi-class Support Vector Machine (SVM). These algorithms are designed using big data techniques to optimize the performance of sentiment analysis.


Big Data ◽  
2016 ◽  
pp. 1309-1325
Author(s):  
Andrew Lukyamuzi ◽  
John Ngubiri ◽  
Washington Okori

Food insecurity is a global challenge affecting millions of people, especially those in the least developed regions. Famine predictions are carried out to estimate when food shortages are most likely to happen. The traditional data sets used for predicting food insecurity, such as household information, price trends, crop production trends and biophysical data, are both labor intensive and expensive to acquire. Current trends are towards harnessing big data to study various phenomena such as sentiment analysis and stock markets. Big data is said to be easier to obtain than traditional datasets. This study shows that phone message archives and telephone conversations, as big datasets, have potential for predicting food crises. This is timely given the current massive penetration of mobile technology, through which the necessary data can be gathered to foster studies such as this. Computation techniques such as Naïve Bayes, Artificial Networks and Support Vector Machines are prospective candidates for this strategy. Areas of concern are highlighted that must be addressed if the strategy is to work in a nation like Uganda. Future work points at exploring this approach experimentally.

