Combining Sentiment Analysis with Socialization Bias in Social Networks for Stock Market Trend Prediction

Author(s):  
Jiajia Li ◽  
Phayung Meesad

Because information is indirectly related to stock trends, information such as comments and tweets can be used for stock trend prediction. When text data are classified, feature sparsity arises during the conversion of tweets to word vectors. Another problem is that the average sentiment score is an unreliable indicator of one day’s sentiment. This is mainly caused by the imbalance between the numbers of positive and negative items within one day, which produces a large bias between sentiment and stock trend. In addition, information acquires social attributes when it is created and diffused in social networks, and the bias reflecting people’s beliefs in social networks becomes socialization bias. To solve these problems, this work proposes a sentiment analysis based prediction model and an inverse bias algorithm. Instead of applying sentiment analysis to add sentiment-related features, this work uses SentiWordNet to give an additional weight to the selected features, and applies two kinds of sentiment analysis to invert the socialization bias. To label tweets into sentiment-related groups and help find socialization bias, this work also proposes an extended wordlist based on a semi-supervised Naïve Bayes classification algorithm. After the socialization bias was inverted, stock trends were used to label the example sets. Different classification algorithms were compared in this work. The proposed model with a linear SVM achieved an accuracy of 90.33% at its best performance.
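The idea of giving selected features an additional sentiment weight can be illustrated with a minimal sketch. The abstract does not state the weighting rule, so the formula below (term count scaled by 1 + |pos − neg|) and the tiny `TOY_LEXICON`, which only stands in for SentiWordNet, are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical mini-lexicon standing in for SentiWordNet:
# word -> (positive score, negative score), each in [0, 1].
TOY_LEXICON = {
    "gain": (0.75, 0.0),
    "loss": (0.0, 0.625),
    "strong": (0.5, 0.0),
}

def sentiment_weighted_features(tokens, lexicon=TOY_LEXICON):
    """Return term counts scaled by 1 + |pos - neg| (an assumed weighting rule)."""
    features = {}
    for word, count in Counter(tokens).items():
        # Words missing from the lexicon keep their plain count.
        pos, neg = lexicon.get(word, (0.0, 0.0))
        features[word] = count * (1.0 + abs(pos - neg))
    return features

weights = sentiment_weighted_features("strong gain today gain".split())
```

Words with a strong polarity in the lexicon end up with larger feature values, while neutral words are left unchanged, so no extra sentiment columns are added to the feature set.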

2018 ◽  
Vol 45 (6) ◽  
pp. 736-755 ◽  
Author(s):  
Ayoub Bagheri

A crucial task in sentiment analysis is aspect detection: the step of selecting the aspects on which opinions are expressed. This step precedes the step of determining whether the opinions on aspects are positive or negative. This article proposes a novel probabilistic generative topic model for aspect-based sentiment analysis which is able to discover the latent structure of a large collection of review documents. The proposed joint sentiment-aspect detection model (SAM) is a generative topic model that incorporates the structure of review sentences for detecting aspects and sentiments simultaneously. The intuitions behind SAM are generating documents from latent single- and multi-word topics, modelling the word distribution for each topic, and learning the prior distribution over topics in the sentences of documents. SAM introduces word status so that the model can decide when to sample from a bigram distribution or a unigram distribution, and integrates all these components into one combined model for aspect-based sentiment analysis. We evaluate SAM both qualitatively and quantitatively to show that the model is indeed able to perform the task effectively and improves significantly over standard joint sentiment-aspect models. The proposed model can easily be transferred between domains or languages and can detect the polarity of text data at various levels. However, for the quantitative analysis, we mainly focus on presenting the results for document-level sentiment classification.


2021 ◽  
Vol 11 (22) ◽  
pp. 10774
Author(s):  
Hongchan Li ◽  
Yu Ma ◽  
Zishuai Ma ◽  
Haodong Zhu

With the rapid increase of public opinion data, Weibo text sentiment analysis plays an increasingly significant role in monitoring online public opinion. Due to the sparseness and high dimensionality of text data and the complex semantics of natural language, sentiment analysis tasks face tremendous challenges. To solve these problems, this paper proposes a new model based on BERT and deep learning for Weibo text sentiment analysis. Specifically, BERT first represents the text with dynamic word vectors, and a processed sentiment dictionary enhances the sentiment features of the vectors; a BiLSTM then extracts the contextual features of the text, and the resulting vector representation is weighted by an attention mechanism. After weighting, a CNN extracts the important local sentiment features in the text, and finally the processed sentiment feature representation is classified. A comparative experiment was conducted on a Weibo text dataset collected during the COVID-19 epidemic; the results showed that the performance of the proposed model was significantly improved compared with other similar models.
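The attention-weighting step in this pipeline can be sketched in isolation. The abstract does not specify the attention form, so the dot-product scoring against a hypothetical `query` vector below is an assumption, and the plain lists only stand in for BiLSTM hidden states.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weight(hidden_states, query):
    """Scale each hidden state by its softmax-normalised dot product with query.

    Dot-product scoring is an assumed choice; the paper may use another form.
    """
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    weighted = [[w * h for h in state] for w, state in zip(weights, hidden_states)]
    return weights, weighted

# Toy stand-ins for three BiLSTM hidden states and a query vector.
states = [[1.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
weights, weighted_states = attention_weight(states, query=[0.0, 1.0])
```

In the described architecture the weighted states would then be passed to a CNN for local feature extraction; that stage is not shown here.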


2021 ◽  
pp. 1-17
Author(s):  
M. Mohamed Iqbal ◽  
K. Latha

Link prediction plays a predominant role in complex network analysis. It aims to determine the probability that future links will appear, based on available information. Existing classical similarity-index-based link prediction models assume that all neighbor nodes have a similar effect on link probability. However, in real-world networks the influence of common neighbor nodes residing in different communities may vary. In this paper, a novel community information-based link prediction model is proposed in which every neighboring node’s community information (community centrality) is considered when predicting the link between a given node pair. In the proposed model, the given social network graph is divided into communities, and a community centrality is calculated for every derived community based on the basic graph centrality measures of degree, closeness, and betweenness. Afterward, new community centrality-based similarity indices are introduced, in which the computed community centralities are applied to nine existing basic similarity indices. The empirical analysis on 13 real-world social network datasets shows that the proposed model yields a prediction accuracy of 97%, better than existing models. Moreover, the proposed model is parallelized efficiently to work on large complex networks using the Spark GraphX parallel graph processing technique, and it attains a lower execution time of 250 seconds.
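To make the contrast with classical indices concrete, the sketch below compares the plain common-neighbors index with a community-weighted variant. The weighting rule (summing the centrality of each common neighbor's community) and the toy `centrality` values are assumptions for illustration, not the indices defined in the paper.

```python
def common_neighbors_score(adj, x, y):
    """Classical index: every common neighbor contributes equally."""
    return len(adj[x] & adj[y])

def community_weighted_score(adj, community, centrality, x, y):
    """Assumed variant: each common neighbor contributes the centrality
    of the community it belongs to."""
    return sum(centrality[community[z]] for z in adj[x] & adj[y])

# Toy undirected graph as adjacency sets.
adj = {
    "a": {"c", "d"},
    "b": {"c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
}
community = {"a": 0, "b": 0, "c": 0, "d": 1}  # hypothetical community labels
centrality = {0: 0.75, 1: 0.25}               # hypothetical community centralities

plain = common_neighbors_score(adj, "a", "b")
weighted = community_weighted_score(adj, community, centrality, "a", "b")
```

Here both common neighbors count as 1 in the classical index, whereas the weighted variant lets the neighbor from the more central community contribute more to the score.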


2020 ◽  
Author(s):  
Pathikkumar Patel ◽  
Bhargav Lad ◽  
Jinan Fiaidhi

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.


2019 ◽  
Vol 13 (1) ◽  
pp. 20-27 ◽  
Author(s):  
Srishty Jindal ◽  
Kamlesh Sharma

Background: With the tremendous increase in the use of social networking sites for sharing emotions, views, preferences, etc., a huge volume of data and text is available on the internet. This creates the need to understand the text and analyse the data to determine the exact intent behind it for the greater good. This process of understanding the text and data involves many analytical methods, several phases and multiple techniques. Efficient use of these techniques is important for an effective and relevant understanding of the text/data. This analysis can in turn be very helpful in e-commerce for targeting audiences, in social media monitoring for anticipating foul elements in society and taking proactive action to avoid unethical and illegal activities, and in business analytics, market positioning, etc. Method: The goal is to understand the basic steps involved in analysing text data, which can be helpful in determining the sentiments behind them. This review provides a detailed description of the steps involved in sentiment analysis along with recent research. Patents related to sentiment analysis and classification are reviewed to throw some light on the work done in the field. Results: Sentiment analysis determines the polarity behind the text data/review. This analysis helps in increasing business revenue, in e-health, and in determining the behaviour of a person. Conclusion: This study helps in understanding the basic steps involved in natural language understanding. At each step there are multiple techniques that can be applied to the data. Different classifiers provide variable accuracy depending upon the data set and classification technique used.


2021 ◽  
pp. 1-17
Author(s):  
J. Shobana ◽  
M. Murali

Text sentiment analysis is the process of predicting whether a segment of text has opinionated or objective content and analyzing the polarity of the text’s sentiment. Understanding the needs and behavior of the target customer plays a vital role in the success of a business, so sentiment analysis helps the marketer improve the quality of the product and helps the shopper buy the correct product. Due to its automatic learning capability, deep learning is a current research interest in natural language processing. A skip-gram architecture is used in the proposed model for better extraction of the semantic relationships and contextual information of words. However, the main contribution of this work is an Adaptive Particle Swarm Optimization (APSO) algorithm based LSTM for sentiment analysis. LSTM is used in the proposed model for understanding complex patterns in textual data. To improve the performance of the LSTM, its weight parameters are tuned by the Adaptive PSO algorithm. The opposition-based learning (OBL) method combined with the PSO algorithm forms the Adaptive Particle Swarm Optimization (APSO) classifier, which assists the LSTM in selecting optimal weights in fewer iterations. The ability of APSO-LSTM to adjust attributes such as optimal weights and learning rates, combined with good hyperparameter choices, leads to improved accuracy and reduced losses. Extensive experiments conducted on four datasets showed that the proposed APSO-LSTM model achieved higher accuracy than classical methods such as traditional LSTM, ANN, and SVM. According to simulation results, the proposed model outperforms other existing models.
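The combination of opposition-based learning with PSO can be sketched on a toy problem. This is only an illustration under stated assumptions: the `sphere` objective stands in for the LSTM training loss, OBL is applied only at initialisation, and the coefficients `w`, `c1`, `c2` are conventional values rather than the paper's settings.

```python
import random

def sphere(x):
    """Toy objective standing in for an LSTM loss (assumption)."""
    return sum(v * v for v in x)

def opposite(x, lo, hi):
    """Opposition-based learning: reflect each coordinate within [lo, hi]."""
    return [lo + hi - v for v in x]

def obl_pso(objective, dim=2, n_particles=10, iters=50, lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    # OBL initialisation: sample candidates, add their opposites,
    # and keep the fittest half as the starting swarm.
    candidates = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    candidates += [opposite(c, lo, hi) for c in candidates]
    positions = [c[:] for c in sorted(candidates, key=objective)[:n_particles]]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    global_best = min(personal_best, key=objective)[:]
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients (assumed)
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                velocities[i][d] = (w * velocities[i][d]
                                    + c1 * r1 * (personal_best[i][d] - positions[i][d])
                                    + c2 * r2 * (global_best[d] - positions[i][d]))
                # Move the particle and clamp it to the search bounds.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            if objective(positions[i]) < objective(personal_best[i]):
                personal_best[i] = positions[i][:]
                if objective(personal_best[i]) < objective(global_best):
                    global_best = personal_best[i][:]
    return global_best, objective(global_best)

best_position, best_value = obl_pso(sphere)
```

In an APSO-LSTM setting, each particle would instead encode a candidate set of network weights or hyperparameters, and the objective would be a validation loss.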


Agronomy ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 1307
Author(s):  
Haoriqin Wang ◽  
Huaji Zhu ◽  
Huarui Wu ◽  
Xiaomin Wang ◽  
Xiao Han ◽  
...  

In the question-and-answer (Q&A) communities of the “China Agricultural Technology Extension Information Platform”, thousands of rice-related Chinese questions are newly added every day. The rapid detection of questions with the same semantics is the key to the success of a rice-related intelligent Q&A system. To allow the fast and automatic detection of semantically identical rice-related questions, we propose a new method based on the Coattention-DenseGRU (Gated Recurrent Unit). According to the characteristics of rice-related questions, we applied Word2vec with the TF-IDF (Term Frequency–Inverse Document Frequency) method to process and analyze the text data, and compared it with the Word2vec, GloVe, and TF-IDF methods. Combined with the agricultural word segmentation dictionary, Word2vec with the TF-IDF method effectively solved the problem of high-dimensional and sparse data in the rice-related text. Each network layer employed the connection information of features and the hidden features of all previous recursive layers. To alleviate the growth in feature vector size caused by dense splicing, an autoencoder was used after dense concatenation. The experimental results show that rice-related question similarity matching based on Coattention-DenseGRU can improve the utilization of text features, reduce the loss of features, and achieve fast and accurate similarity matching on the rice-related question dataset. The precision and F1 values of the proposed model were 96.3% and 96.9%, respectively. Compared with seven other question similarity matching models, the proposed method achieves state-of-the-art results on our rice-related question dataset.
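One common way to combine Word2vec with TF-IDF is to average word vectors using TF-IDF weights. The sketch below assumes that reading; the abstract does not give the exact formula, and the two-dimensional `embeddings` and English tokens are toy stand-ins for trained Word2vec vectors and segmented Chinese questions.

```python
import math
from collections import Counter

def inverse_document_frequency(term, documents):
    """Unsmoothed IDF: log(N / document frequency); 0.0 for unseen terms."""
    df = sum(1 for doc in documents if term in doc)
    return math.log(len(documents) / df) if df else 0.0

def tfidf_weighted_vector(tokens, documents, embeddings):
    """Average the word vectors of tokens, weighting each by tf * idf."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for term, tf in Counter(tokens).items():
        if term not in embeddings:
            continue  # skip out-of-vocabulary words
        weight = tf * inverse_document_frequency(term, documents)
        weight_sum += weight
        for i, value in enumerate(embeddings[term]):
            total[i] += weight * value
    if weight_sum == 0.0:
        return total  # every term carried zero weight
    return [v / weight_sum for v in total]

# Toy corpus and embeddings (hypothetical values).
documents = [["rice", "blast", "disease"], ["rice", "yield"], ["rice", "blast", "control"]]
embeddings = {"rice": [1.0, 0.0], "blast": [0.0, 1.0], "disease": [1.0, 1.0]}
question_vector = tfidf_weighted_vector(documents[0], documents, embeddings)
```

Because "rice" occurs in every toy document, its IDF is zero and it does not influence the question vector, which reflects how the weighting suppresses terms that cannot distinguish one question from another.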


2021 ◽  
pp. 1-13
Author(s):  
C S Pavan Kumar ◽  
L D Dhinesh Babu

Sentiment analysis is widely used to retrieve the hidden sentiments in medical discussions on online social networking platforms such as Twitter, Facebook, and Instagram. People often convey their feelings about their medical problems on social media platforms. Practitioners and health care workers have started to observe these discussions to assess the impact of health-related issues among the people. This helps in providing better care to improve the quality of life. Dementia is a serious disease in western countries like the United States of America and the United Kingdom, and the respective governments are providing facilities to the affected people. There is much chatter on social media platforms concerning patients’ care, healthy measures to follow to avoid the disease, and checking for early indications. This chatter has to be carefully monitored to help officials take necessary precautions for the betterment of the affected. A novel feature engineering architecture that involves feature-split for sentiment analysis of medical chatter over online social networks is proposed, with a pipeline that can be used with any machine learning model. The proposed model uses a fuzzy membership function to refine the outputs. The sentiment score obtained by the machine learning model is subjected to fuzzification and defuzzification using the trapezoid membership function and the center of sums method, respectively. Three datasets are considered for comparison of the proposed and the regular model. The proposed approach delivered better results than the normal approach and proved to be an effective approach for sentiment analysis of medical discussions over online social networks.
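The fuzzification and defuzzification step can be sketched as follows. The trapezoid breakpoints in `SETS`, the [0, 1] score range, and the discretised form of the center of sums method are assumptions chosen for illustration; the abstract does not give the paper's parameters.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], flat on [b, c], falls on [c, d]."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)

# Hypothetical fuzzy sets over a sentiment score in [0, 1].
SETS = {
    "negative": (0.0, 0.0, 0.2, 0.4),
    "neutral": (0.3, 0.45, 0.55, 0.7),
    "positive": (0.6, 0.8, 1.0, 1.0),
}

def fuzzify(score, sets=SETS):
    """Map a crisp score to a membership degree in each fuzzy set."""
    return {name: trapezoid(score, *params) for name, params in sets.items()}

def center_of_sums(strengths, sets=SETS, steps=100):
    """Discretised center of sums: each set is clipped at its strength and
    the clipped memberships are summed, so overlapping areas count twice."""
    numerator = denominator = 0.0
    for i in range(steps + 1):
        x = i / steps
        total = sum(min(strengths[name], trapezoid(x, *params))
                    for name, params in sets.items())
        numerator += x * total
        denominator += total
    return numerator / denominator if denominator else 0.5

def refine(score):
    """Fuzzify a model's sentiment score, then defuzzify it back to a crisp value."""
    return center_of_sums(fuzzify(score))

refined_neutral = refine(0.5)
refined_positive = refine(0.9)
```

A score that falls where two sets overlap activates both, and the defuzzified value then lies between the centroids of the two clipped sets.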


2013 ◽  
Vol 427-429 ◽  
pp. 2614-2617
Author(s):  
Qing Xi Peng

Online reviews, as a new textual domain, offer a unique proposition for sentiment analysis. Their short document length suggests that any sentiment they contain is compact and explicit. Although supervised methods have obtained good results, a large corpus must be used for training beforehand. Recently, topic models have been introduced for the simultaneous analysis of topic and sentiment in a document. However, the LDA model assumes that, given the parameters, the words in the document are all independent. This is obviously not the case, because the words in the document express the sentiment of the author. This paper proposes a model to solve the problem. We assume that the sentiments are related to the topics in the documents. A sentiment layer is added to the LDA model to improve it. Experimental results on the dataset demonstrate the advantage of the proposed model.
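The independence assumption mentioned here can be written out explicitly. The first expression is the standard LDA likelihood of a document's words given the topic proportions θ and topic-word parameters β. The second is one possible way to add a sentiment layer that depends on the topic; its form and the symbol π are assumptions for illustration, since the abstract does not give the model's equations.

```latex
% Standard LDA: words are conditionally independent given \theta and \beta.
p(\mathbf{w} \mid \theta, \beta)
  = \prod_{n=1}^{N} \sum_{z_n} p(z_n \mid \theta)\, p(w_n \mid z_n, \beta)

% One possible sentiment-layer extension (assumed form, with sentiment s_n
% drawn conditionally on the topic z_n via hypothetical parameters \pi):
p(\mathbf{w} \mid \theta, \pi, \beta)
  = \prod_{n=1}^{N} \sum_{z_n} \sum_{s_n}
    p(z_n \mid \theta)\, p(s_n \mid z_n, \pi)\, p(w_n \mid z_n, s_n, \beta)
```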


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Mingli Wang ◽  
Huikuan Gu ◽  
Jiang Hu ◽  
Jian Liang ◽  
Sisi Xu ◽  
...  

Background and purpose: To explore whether a highly refined dose volume histogram (DVH) prediction model can improve the accuracy and reliability of knowledge-based volumetric modulated arc therapy (VMAT) planning for cervical cancer. Methods and materials: The proposed model underwent repeated refining through progressive training until the training samples increased from an initial 25 prior plans to 100 cases. The estimated DVHs derived from the prediction models of different runs of training were compared in 35 new cervical cancer patients to analyze the effect of such an interactive plan and model evolution method. The reliability and efficiency of knowledge-based planning (KBP) using this highly refined model in improving the consistency and quality of the VMAT plans were also evaluated. Results: The prediction ability was reinforced with the increased number of refinements in terms of normal tissue sparing. With enhanced prediction accuracy, more than 60% of automatic plan-6 (AP-6) plans (22/35) can be directly approved for clinical treatment without any manual revision. The plan quality scores for clinically approved plans (CPs) and manual plans (MPs) were on average 89.02 ± 4.83 and 86.48 ± 3.92 (p < 0.001). Knowledge-based planning significantly reduced the Dmean and V18 Gy for kidney (L/R), and the Dmean, V30 Gy, and V40 Gy for bladder, rectum, and femoral head (L/R). Conclusion: The proposed model evolution method provides a practical way for KBP to enhance its prediction ability with minimal human intervention. This highly refined prediction model can better guide KBP in improving the consistency and quality of the VMAT plans.
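The DVH metrics reported above (Dmean and volume-at-dose values such as V18 Gy) can be computed from per-voxel doses as in the sketch below. It assumes equal voxel volumes and uses made-up `kidney_doses`; it illustrates the metric definitions only, not the prediction model in the paper.

```python
def mean_dose(doses):
    """Dmean: average dose over all voxels (equal voxel volumes assumed)."""
    return sum(doses) / len(doses)

def volume_at_dose(doses, threshold_gy):
    """V_x: percentage of the structure receiving at least threshold_gy."""
    return 100.0 * sum(1 for d in doses if d >= threshold_gy) / len(doses)

def cumulative_dvh(doses, bin_width_gy=1.0):
    """Cumulative DVH as (dose level, volume %) pairs from 0 Gy up to the max dose."""
    n_levels = int(max(doses) / bin_width_gy) + 1
    return [(i * bin_width_gy, volume_at_dose(doses, i * bin_width_gy))
            for i in range(n_levels + 1)]

# Hypothetical per-voxel doses in Gy for one structure.
kidney_doses = [5.0, 10.0, 15.0, 20.0, 25.0]
d_mean = mean_dose(kidney_doses)
v18 = volume_at_dose(kidney_doses, 18.0)
dvh = cumulative_dvh(kidney_doses, bin_width_gy=5.0)
```

A lower Dmean or V18 Gy for an organ at risk corresponds to a cumulative DVH curve that falls off at lower doses, which is the sense in which the abstract reports improved normal tissue sparing.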

