redBERT: A Topic Discovery and Deep Sentiment Classification Model on COVID-19 Online Discussions Using BERT NLP Model

2021 ◽  
Vol 12 (3) ◽  
pp. 32-47
Author(s):  
Chaitanya Pandey

A natural language processing (NLP) method was used to uncover the issues and sentiments surrounding COVID-19 on social media and to gain a deeper understanding of fluctuating public opinion in situations of wide-scale panic, with the aim of guiding improved decision making. To this end, a sentiment analyser was created for the automated extraction of COVID-19-related discussions based on topic modelling. Moreover, the BERT model was used for the sentiment classification of COVID-19 Reddit comments. These findings shed light on the importance of studying trends and using computational techniques to assess the human psyche in times of distress.


2021 ◽  
Vol 297 ◽  
pp. 01071
Author(s):  
Sifi Fatima-Zahrae ◽  
Sabbar Wafae ◽  
El Mzabi Amal

Sentiment classification is one of the most active research areas in Natural Language Processing (NLP). It aims to detect the sentiment polarity and class of a given opinion, which requires extracting a large number of aspects. However, extracting aspects takes considerable human effort and time. The Latent Dirichlet Allocation (LDA) method has recently emerged to reduce this effort. In this paper, an efficient preprocessing method for sentiment classification is presented and used to analyze users’ comments on the Twitter social network. For this purpose, different text preprocessing techniques were applied to the dataset to obtain acceptably standardized text. Latent Dirichlet Allocation was then applied to the data obtained from this fast and accurate preprocessing phase. Different sentiment analysis methods were implemented, and their results were compared and evaluated. The experimental results show that combining the proposed preprocessing method with Latent Dirichlet Allocation gives acceptable results compared to other basic methods.


2020 ◽  
Vol 10 (14) ◽  
pp. 4711
Author(s):  
Zongmin Li ◽  
Qi Zhang ◽  
Yuhong Wang ◽  
Shihang Wang

One prominent dark side of online information behavior is the spreading of rumors. Feature analysis and crowd identification of social media rumor refuters based on machine learning methods can shed light on the rumor refutation process. This paper analyzed the association between user features and rumor refuting behavior in five main rumor categories: economics, society, disaster, politics, and military. Natural language processing (NLP) techniques were applied to quantify each user’s sentiment tendency and recent interests. Those results were then combined with other personalized features to train an XGBoost classification model that can identify potential refuters. Information from 58,807 Sina Weibo users (including their 646,877 microblogs) for the five anti-rumor microblog categories was collected for model training and feature analysis. The results revealed that there were significant differences between rumor stiflers and refuters, as well as between refuters for different categories. Refuters tended to be more active on social media, and a large proportion of them gathered in more developed regions. Tweeting history was a vital reference as well, and refuters showed higher interest in topics related to the rumor refuting message. Meanwhile, features such as gender, age, user labels and sentiment tendency also varied among refuters across categories.


Sentiment classification is one of the best-known and most popular domains of machine learning and natural language processing. An algorithm is developed to understand the opinion of an entity in a way similar to human beings, and this research article presents such an algorithm. Concepts from natural language processing are used for text representation, and a novel word embedding model is then proposed for effective classification of the data. TF-IDF and Common BoW representation models were considered for representing the text data, and the importance of these models is discussed in the respective sections. The proposed model is tested using IMDB datasets, with 50% of the data used for training and 50% for testing, and three random shufflings of the datasets used for evaluation.
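
As a minimal sketch of the two text representations named in the abstract above, the following pure-Python code builds bag-of-words counts and TF-IDF weights for a toy corpus. The whitespace tokenizer, the smoothed IDF formula, and the example reviews are assumptions made for illustration; the abstract does not specify any of them, and this is not the proposed word embedding model.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each raw count by a smoothed inverse document frequency.

    idf(t) = ln((1 + N) / (1 + df(t))) + 1 is one common variant, assumed here.
    """
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log((1 + n_docs) / (1 + df)) + 1 for df in doc_freq]
    weighted = [[c * w for c, w in zip(vec, idf)] for vec in counts]
    return vocab, weighted


reviews = ["great movie great cast", "boring movie"]  # toy data, not from IMDB
vocab, bow = bag_of_words(reviews)
_, weights = tf_idf(reviews)
```

In this toy corpus "movie" occurs in both reviews, so its IDF factor is the minimum of 1, while "great" occurs in only one review and is weighted up. That is the sense in which TF-IDF emphasizes distinctive terms over plain counts.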


Author(s):  
Hamed Jelodar ◽  
Yongli Wang ◽  
Rita Orji ◽  
Hucheng Huang

Internet forums and public social media, such as online healthcare forums, provide a convenient channel for users (people/patients) concerned about health issues to discuss and share information with each other. In late December 2019, an outbreak of a novel coronavirus (infection from which results in the disease named COVID-19) was reported, and, due to the rapid spread of the virus in other parts of the world, the World Health Organization declared a state of emergency. In this paper, we used automated extraction of COVID-19–related discussions from social media and a natural language processing (NLP) method based on topic modeling to uncover various issues related to COVID-19 from public opinions. Moreover, we also investigated how to use an LSTM recurrent neural network for sentiment classification of COVID-19 comments. Our findings shed light on the importance of using public opinions and suitable computational techniques to understand issues surrounding COVID-19 and to guide related decision-making.


2020 ◽  
pp. 1-11
Author(s):  
Hailong Yu ◽  
Yannan Ji ◽  
Qinglin Li

Due to the diversity of text expressions, it is difficult for text sentiment classification algorithms based on semantic understanding to establish a perfect sentiment dictionary and sentence matching template, which strongly limits such algorithms. In particular, they have difficulty classifying student sentiments. Based on this, this paper analyzes a student sentiment classification model built on a neural network algorithm and uses the student group as an example to explore the application of neural network models in sentiment classification. Moreover, a regularization method is added to the loss function of the LSTM so that the output at any time is related to the output at the previous time. In addition, the sentimental drift distribution of sentimental words on each sentimental label is added to the regularizer, and the sentimental information is merged with the two-way LSTM to allow the model to choose forward or reverse. Finally, the performance of the proposed model is verified through experiments. The research shows that the proposed model has better overall performance than the traditional model and can meet the practical needs of student sentiment classification.


2020 ◽  
Vol 309 ◽  
pp. 03015
Author(s):  
Wenbin Liu ◽  
Bojian Wen ◽  
Shang Gao ◽  
Jiesheng Zheng ◽  
Yinlong Zheng

Text classification is a common application in natural language processing. We propose a multi-label text classification model based on ELMo and an attention mechanism. It helps address two problems in the sentiment classification task: power supply related text follows no grammar or writing convention, and the sentiment-related information is dispersed throughout the text. Firstly, we use pre-trained word embedding vectors to extract features of text from the Internet. Secondly, the analyzed deep information features are weighted according to the attention mechanism. Finally, an improved ELMo model, in which we replace the LSTM module with a GRU module, is used to characterize the text and classify the information. The experimental results on Kaggle’s toxic comment classification data set show that the accuracy of sentiment classification is as high as 98%.


2019 ◽  
Vol 13 ◽  
pp. 174830261984576
Author(s):  
Ningjia Qiu ◽  
Zhuorui Shen ◽  
Xiaojuan Hu ◽  
Peng Wang

Memory limitation and slow training speed are two important problems in sentiment analysis. In this paper, we propose a sentiment classification model based on online learning to improve the training speed of sentiment classification. First, combining the adaptive learning rate adjustment of the Adadelta algorithm with the Adam algorithm's ability to avoid frequent jitter in the later stage of training, we present a novel Adamdelta algorithm. It solves the problem that the learning rate of the traditional follow the regularized leader (FTRL)-Proximal online learning algorithm vanishes as the number of training iterations increases. Moreover, we obtain an optimized logistic regression (LR) model and apply it to online-learning sentiment classification. Finally, we compare the proposed algorithm with five similar models on the IMDb movie review dataset. Experimental results show that the improved algorithm classifies better and can effectively improve the precision and recall of the classifier.
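
For context, the baseline that the abstract above improves on can be sketched as follows. This is a minimal pure-Python version of standard per-coordinate FTRL-Proximal logistic regression, not the authors' Adamdelta variant. The hyperparameter values, the sparse-dict feature encoding, the `learning_rate` helper, and the toy stream are all assumptions made for illustration. The sketch shows the per-coordinate learning rate alpha / (beta + sqrt(n_i)) shrinking as squared gradients accumulate in n_i, which appears to be the decay the abstract refers to.

```python
import math


class FTRLProximal:
    """Standard per-coordinate FTRL-Proximal for logistic regression (a sketch)."""

    def __init__(self, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-feature accumulated adjusted gradients
        self.n = {}  # per-feature accumulated squared gradients

    def _weight(self, i):
        """Closed-form weight for feature i from its z and n statistics."""
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0  # the L1 threshold keeps the weight exactly zero
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n.get(i, 0.0))) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x):
        """x is a sparse example of the form {feature_name: value}."""
        score = sum(self._weight(i) * v for i, v in x.items())
        score = max(min(score, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, x, y):
        """One online step on example x with label y in {0, 1}."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss w.r.t. weight i
            n_old = self.n.get(i, 0.0)
            sigma = (math.sqrt(n_old + g * g) - math.sqrt(n_old)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * self._weight(i)
            self.n[i] = n_old + g * g

    def learning_rate(self, i):
        """Effective per-coordinate learning rate (hypothetical helper for inspection)."""
        return self.alpha / (self.beta + math.sqrt(self.n.get(i, 0.0)))


model = FTRLProximal()
stream = [({"good": 1.0}, 1), ({"bad": 1.0}, 0)] * 50  # toy data, not IMDb
rate_before = model.learning_rate("good")
for x, y in stream:
    model.update(x, y)
rate_after = model.learning_rate("good")
```

Because n_i only ever grows, `rate_after` is smaller than `rate_before` for every feature that has received gradient. The abstract's stated motivation is to counter this effect by borrowing ideas from Adadelta and Adam.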


2021 ◽  
Vol 11 (20) ◽  
pp. 9689
Author(s):  
Yerka Freire-Vidal ◽  
Eduardo Graells-Garrido ◽  
Francisco Rowe

Understanding public opinion towards immigrants is key to preventing acts of violence, discrimination and abuse. Traditional data sources, such as surveys, provide rich insights into the formation of such attitudes; yet, they are costly and offer limited temporal granularity, providing only a partial understanding of the dynamics of attitudes towards immigrants. Leveraging Twitter data and natural language processing, we propose a framework to measure attitudes towards immigration in online discussions. Grounded in theories of social psychology, the proposed framework enables the classification of users into profile stances of positive and negative attitudes towards immigrants and the characterisation of these profiles by quantitatively summarising users’ content and temporal stance trends. We use a Twitter sample composed of 36 K users and 160 K tweets discussing the topic in 2017, when the immigrant population in the country recorded an increase by a factor of four from 2010. We found that the negative attitude group of users is smaller than the positive group, and that both attitudes have different distributions of the volume of content. Both types of attitudes show fluctuations over time that seem to be influenced by news events related to immigration. Accounts with negative attitudes use arguments of labour competition and stricter regulation of immigration. In contrast, accounts with positive attitudes reflect arguments in support of immigrants’ human and civil rights. The framework and its application can inform policy makers about how people feel about immigration, with possible implications for policy communication and the design of interventions to improve negative attitudes.


Author(s):  
Jianzhou Feng ◽  
Jinman Cui ◽  
Qikai Wei ◽  
Zhengji Zhou ◽  
Yuxiong Wang

Text classification is a research hotspot in the field of natural language processing. Existing text classification models based on supervised learning, especially deep learning models, have made great progress on public datasets. But most of these methods rely on a large amount of training data, and the coverage of these datasets is limited. In a legal intelligent question-answering system, accurate classification of legal consulting questions is a necessary prerequisite for intelligent question answering. However, sufficient annotated data is lacking and the cost of labeling is high, so traditional supervised learning methods perform poorly under sparse labeling. In response to these problems, we construct a few-shot legal consulting questions dataset and propose a prototypical networks model based on multi-attention. For instances of the same category, this model first highlights the key features in the instances as much as possible through instance-dimension level attention. Then it classifies legal consulting questions using prototypical networks. Experimental results show that our model achieves state-of-the-art results compared with baseline models. The code and dataset are released on https://github.com/cjm0824/MAPN.
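
The prototypical-network classification step mentioned in the abstract above can be sketched as follows. This is a minimal illustration of the generic technique (class prototypes as mean support embeddings, queries assigned by a softmax over negative squared distances), not the authors' multi-attention model. The embeddings are assumed to be precomputed, and the two-dimensional vectors and the class names are made up for the example.

```python
import math
from collections import defaultdict


def prototypes(support):
    """Average the support embeddings of each class into one prototype vector."""
    grouped = defaultdict(list)
    for label, vec in support:
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query, protos):
    """Softmax over negative squared distances; return (best_label, probabilities)."""
    logits = {label: -squared_distance(query, p) for label, p in protos.items()}
    top = max(logits.values())  # subtract the max for numerical stability
    exps = {label: math.exp(v - top) for label, v in logits.items()}
    total = sum(exps.values())
    probs = {label: e / total for label, e in exps.items()}
    return max(probs, key=probs.get), probs


# Hypothetical few-shot support set: two labelled embeddings per class.
support_set = [
    ("contract", [1.0, 0.0]), ("contract", [0.8, 0.2]),
    ("labor", [0.0, 1.0]), ("labor", [0.2, 0.8]),
]
protos = prototypes(support_set)
label, probs = classify([0.9, 0.1], protos)
```

Only the prototypes depend on the labelled examples, so a new category can be added from a handful of support instances without retraining a classifier head, which is why the approach suits sparse labeling.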

