redBERT: A Topic Discovery and Deep Sentiment Classification Model on COVID-19 Online Discussions Using BERT NLP Model

Mapping Intimacies ◽

10.1101/2021.03.02.21252747 ◽

2021 ◽

Author(s):

Chaitanya Pandey

Keyword(s):

Language Processing ◽

Online Discussions ◽

Sentiment Classification ◽

Classification Model ◽

Computational Techniques ◽

Topic Modelling ◽

Wide Scale ◽

Human Psyche ◽

Shed Light

A Natural Language Processing (NLP) method was used to uncover various issues and sentiments surrounding COVID-19 from social media and to gain a deeper understanding of fluctuating public opinion in situations of wide-scale panic. To guide improved decision making, a sentiment analyser was created for the automated extraction of COVID-19 related discussions based on topic modelling. Moreover, the BERT model was used for the sentiment classification of COVID-19 Reddit comments. These findings shed light on the importance of studying trends and using computational techniques to assess the human psyche in times of distress.

redBERT

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2021070103 ◽

2021 ◽

Vol 12 (3) ◽

pp. 32-47

Author(s):

Chaitanya Pandey

Keyword(s):

Decision Making ◽

Social Media ◽

Language Processing ◽

Computational Techniques ◽

Topic Modelling ◽

Automated Extraction ◽

Wide Scale ◽

Human Psyche ◽

Shed Light

A natural language processing (NLP) method was used to uncover various issues and sentiments surrounding COVID-19 from social media and to gain a deeper understanding of fluctuating public opinion in situations of wide-scale panic. To guide improved decision making, a sentiment analyser was created for the automated extraction of COVID-19-related discussions based on topic modelling. Moreover, the BERT model was used for the sentiment classification of COVID-19 Reddit comments. These findings shed light on the importance of studying trends and using computational techniques to assess the human psyche in times of distress.

Application of Latent Dirichlet Allocation (LDA) for clustering financial tweets

E3S Web of Conferences ◽

10.1051/e3sconf/202129701071 ◽

2021 ◽

Vol 297 ◽

pp. 01071

Author(s):

Sifi Fatima-Zahrae ◽

Sabbar Wafae ◽

El Mzabi Amal

Keyword(s):

Language Processing ◽

Latent Dirichlet Allocation ◽

Sentiment Classification ◽

Research Areas ◽

Preprocessing Method ◽

Long Time ◽

Standard Text ◽

The Given ◽

Dirichlet Allocation

Sentiment classification is one of the hottest research areas among Natural Language Processing (NLP) topics. While it aims to detect sentiment polarity and classify the given opinion, it requires a large number of aspect extractions. However, extracting aspects takes human effort and a long time. To reduce this, the Latent Dirichlet Allocation (LDA) method has recently emerged to deal with this issue. In this paper, an efficient preprocessing method for sentiment classification is presented and used for analyzing users’ comments on the Twitter social network. For this purpose, different text preprocessing techniques have been used on the dataset to achieve an acceptable standard text. Latent Dirichlet Allocation has been applied to the obtained data after this fast and accurate preprocessing phase. The implementations of different sentiment analysis methods and their results have been compared and evaluated. The experimental results show that the combined use of the preprocessing method of this paper and Latent Dirichlet Allocation yields acceptable results compared to other basic methods.

Social Media Rumor Refuter Feature Analysis and Crowd Identification Based on XGBoost and NLP

Applied Sciences ◽

10.3390/app10144711 ◽

2020 ◽

Vol 10 (14) ◽

pp. 4711 ◽

Cited By ~ 1

Author(s):

Zongmin Li ◽

Qi Zhang ◽

Yuhong Wang ◽

Shihang Wang

Keyword(s):

Social Media ◽

Language Processing ◽

Classification Model ◽

Dark Side ◽

Feature Analysis ◽

Online Information ◽

Sina Weibo ◽

Machine Learning Methods ◽

Model Training ◽

Shed Light

One prominent dark side of online information behavior is the spreading of rumors. The feature analysis and crowd identification of social media rumor refuters based on machine learning methods can shed light on the rumor refutation process. This paper analyzed the association between user features and rumor refuting behavior in five main rumor categories: economics, society, disaster, politics, and military. Natural language processing (NLP) techniques were applied to quantify the user’s sentiment tendency and recent interests. Then, those results were combined with other personalized features to train an XGBoost classification model, so that potential refuters could be identified. Information from 58,807 Sina Weibo users (including their 646,877 microblogs) for the five anti-rumor microblog categories was collected for model training and feature analysis. The results revealed that there were significant differences between rumor stiflers and refuters, as well as between refuters for different categories. Refuters tended to be more active on social media and a large proportion of them gathered in more developed regions. Tweeting history was a vital reference as well, and refuters showed higher interest in topics related to the rumor refuting message. Meanwhile, features such as gender, age, user labels and sentiment tendency also varied among refuters across categories.

Concept of TF-IDF, Common Bag of Word and Word Embedding for Effective Sentiment Classification

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f4582.049620 ◽

2020 ◽

Vol 9 (4) ◽

pp. 2198-2201

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Sentiment Classification ◽

Word Embedding ◽

Text Representation ◽

Human Beings ◽

Text Data

Sentiment classification is one of the best-known and most popular domains of machine learning and natural language processing. An algorithm is developed to understand the opinion of an entity in a way similar to human beings. This research article presents such an algorithm. Concepts of natural language processing are considered for text representation. Later, a novel word embedding model is proposed for effective classification of the data. TF-IDF and Common BoW representation models were considered for the representation of text data. The importance of these models is discussed in the respective sections. The proposed model is tested using IMDB datasets. 50% training and 50% testing with three random shufflings of the datasets are used for evaluation of the model.
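
As an illustration of the two text representations named in this abstract, the following is a minimal sketch of bag-of-words counts and TF-IDF weights using only the Python standard library. The whitespace tokenizer, the function names, and the unsmoothed `log(N / df)` form of IDF are assumptions chosen for brevity; the abstract does not specify which variant the authors used.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(doc_tokens, vocabulary):
    """Return raw term counts for one document, ordered by the vocabulary."""
    counts = Counter(doc_tokens)
    return [counts[term] for term in vocabulary]


def tf_idf(docs):
    """Return (vocabulary, vectors) with tf = count / doc length, idf = log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    vocabulary = sorted({t for toks in tokenized for t in toks})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears at least once.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log(n_docs / df[t]) for t in vocabulary}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vectors.append(
            [(counts[t] / total) * idf[t] if total else 0.0 for t in vocabulary]
        )
    return vocabulary, vectors


if __name__ == "__main__":
    vocab, vecs = tf_idf(["good movie", "bad movie"])
    print(vocab)
    print(vecs)
```

A term that occurs in every document (here "movie") receives an IDF of zero, so it contributes nothing to the TF-IDF vector, whereas the plain bag-of-words count would still weight it.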

Deep Sentiment Classification and Topic Discovery on Novel Coronavirus or COVID-19 Online Discussions: NLP Using LSTM Recurrent Neural Network Approach

10.1101/2020.04.22.054973 ◽

2020 ◽

Cited By ~ 3

Author(s):

Hamed Jelodar ◽

Yongli Wang ◽

Rita Orji ◽

Hucheng Huang

Keyword(s):

Neural Network ◽

Social Media ◽

Recurrent Neural Network ◽

Online Discussions ◽

Sentiment Classification ◽

World Health ◽

Computational Techniques ◽

Public Opinions ◽

The World ◽

Novel Coronavirus

Internet forums and public social media, such as online healthcare forums, provide a convenient channel for users (people/patients) concerned about health issues to discuss and share information with each other. In late December 2019, an outbreak of a novel coronavirus (infection from which results in the disease named COVID-19) was reported, and, due to the rapid spread of the virus in other parts of the world, the World Health Organization declared a state of emergency. In this paper, we used automated extraction of COVID-19-related discussions from social media and a natural language processing (NLP) method based on topic modeling to uncover various issues related to COVID-19 from public opinions. Moreover, we also investigate how to use an LSTM recurrent neural network for sentiment classification of COVID-19 comments. Our findings shed light on the importance of using public opinions and suitable computational techniques to understand issues surrounding COVID-19 and to guide related decision-making.

Student sentiment classification model based on GRU neural network and TF-IDF algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189227 ◽

2020 ◽

pp. 1-11

Author(s):

Hailong Yu ◽

Yannan Ji ◽

Qinglin Li

Keyword(s):

Neural Network ◽

Sentiment Classification ◽

Classification Model ◽

Research Model ◽

Student Group ◽

Comprehensive Performance ◽

Sentiment Dictionary ◽

Neural Network Algorithm ◽

Sentence Matching

Due to the diversity of text expressions, it is difficult for text sentiment classification algorithms based on semantic understanding to establish a perfect sentiment dictionary and sentence matching template, which strongly limits such algorithms. In particular, they have certain difficulties in classifying student sentiments. Based on this, this paper analyzes the student sentiment classification model using a neural network algorithm and uses the student group as an example to explore the application of neural network models in sentiment classification. Moreover, a regularization method is added to the loss function of the LSTM so that the output at any time is related to the output at the previous time. In addition, the sentimental drift distribution of sentimental words on each sentimental label is added to the regularizer, and the sentimental information is merged with the two-way LSTM to allow the model to choose forward or reverse. Finally, to verify the research model, the performance of the proposed model is evaluated experimentally. The research shows that the proposed model has better comprehensive performance than the traditional model and can meet the actual needs of students’ sentiment classification.

A multi-label text classification model based on ELMo and attention

MATEC Web of Conferences ◽

10.1051/matecconf/202030903015 ◽

2020 ◽

Vol 309 ◽

pp. 03015

Author(s):

Wenbin Liu ◽

Bojian Wen ◽

Shang Gao ◽

Jiesheng Zheng ◽

Yinlong Zheng

Keyword(s):

Language Processing ◽

Text Classification ◽

Attention Mechanism ◽

Sentiment Classification ◽

Classification Model ◽

Data Set ◽

Model Based ◽

Related Information ◽

Related Text ◽

Common Application

Text classification is a common application in natural language processing. We propose a multi-label text classification model based on ELMo and an attention mechanism, which helps address two problems in the sentiment classification task: power supply related text follows no grammar or writing convention, and the sentiment related information is dispersed across the text. First, we use pre-trained word embedding vectors to extract the features of text from the Internet. Second, the analyzed deep information features are weighted according to the attention mechanism. Finally, an improved ELMo model, in which we replace the LSTM module with a GRU module, is used to characterize the text, and the information is classified. The experimental results on Kaggle’s toxic comment classification data set show that the accuracy of sentiment classification is as high as 98%.
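
The step in which features are "weighted according to the attention mechanism" can be illustrated with a generic dot-product attention pooling over per-token feature vectors. This is only a sketch under stated assumptions: the abstract does not give the scoring function, so the dot product against a `context` vector, and the names `attention_pool` and `context`, are hypothetical choices and not the authors' architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, context):
    """Weight each token vector by softmax(dot(token, context)) and sum them.

    hidden_states: list of equal-length feature vectors, one per token.
    context: a vector of the same length (learned in a real model; assumed given here).
    Returns (weights, pooled_vector).
    """
    scores = [sum(h * c for h, c in zip(state, context)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[i] for w, state in zip(weights, hidden_states))
        for i in range(dim)
    ]
    return weights, pooled


if __name__ == "__main__":
    tokens = [[1.0, 0.0], [0.0, 1.0]]
    print(attention_pool(tokens, [0.0, 0.0]))   # equal scores give equal weights
    print(attention_pool(tokens, [10.0, 0.0]))  # the first token dominates
```

Because the weights sum to one, tokens that score highly against the context vector dominate the pooled representation, which is how dispersed sentiment cues can be emphasised over uninformative tokens.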

A novel sentiment classification model based on online learning

Journal of Algorithms & Computational Technology ◽

10.1177/1748302619845764 ◽

2019 ◽

Vol 13 ◽

pp. 174830261984576

Author(s):

Ningjia Qiu ◽

Zhuorui Shen ◽

Xiaojuan Hu ◽

Peng Wang

Keyword(s):

Experimental Data ◽

Online Learning ◽

Learning Algorithm ◽

Learning Rate ◽

Sentiment Classification ◽

Classification Model ◽

Model Based ◽

Adaptive Adjustment ◽

Online Learning Algorithm

Memory limitation and slow training speed are two important problems in sentiment analysis. In this paper, we propose a sentiment classification model based on online learning to improve the training speed of sentiment classification. First, combining the adaptive learning rate adjustment of the Adadelta algorithm with the Adam algorithm's ability to avoid frequent jitter in the later stage of training, we present a novel Adamdelta algorithm. It solves the problem that the learning rate of the traditional follow-the-regularized-leader (FTRL)-Proximal online learning algorithm vanishes as the number of training iterations increases. Moreover, we obtain an optimized logistic regression (LR) model and apply it to sentiment classification with online learning. Finally, we compare the proposed algorithm with five similar models on the IMDb movie review dataset. Experimental results show that the improved algorithm achieves better classification performance and can effectively improve the precision and recall of the classifier.
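
For orientation, the sketch below shows the standard Adadelta update applied to online logistic regression, one example at a time. It is not the Adamdelta algorithm proposed in the paper, whose details the abstract does not give; the function names, the state layout, and the defaults `rho=0.95` and `eps=1e-6` are assumptions for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logistic_gradient(weights, features, label):
    """Gradient of the log loss for a single (features, label) example."""
    prediction = sigmoid(sum(w * x for w, x in zip(weights, features)))
    return [(prediction - label) * x for x in features]


def adadelta_step(weights, grad, state, rho=0.95, eps=1e-6):
    """Apply one in-place Adadelta update; state holds the two running averages."""
    for i, g in enumerate(grad):
        # Decaying average of squared gradients.
        state["g2"][i] = rho * state["g2"][i] + (1 - rho) * g * g
        # Step size is the ratio of RMS(previous updates) to RMS(gradients).
        delta = -math.sqrt(state["dx2"][i] + eps) / math.sqrt(state["g2"][i] + eps) * g
        # Decaying average of squared updates.
        state["dx2"][i] = rho * state["dx2"][i] + (1 - rho) * delta * delta
        weights[i] += delta
    return weights


def train_online(stream, n_features):
    """Consume (features, label) pairs one at a time, as in online learning."""
    weights = [0.0] * n_features
    state = {"g2": [0.0] * n_features, "dx2": [0.0] * n_features}
    for features, label in stream:
        grad = logistic_gradient(weights, features, label)
        adadelta_step(weights, grad, state)
    return weights


if __name__ == "__main__":
    # Toy stream: a positive feature value indicates the positive class.
    toy_stream = [([1.0], 1), ([-1.0], 0)] * 50
    print(train_online(toy_stream, 1))
```

Because both running averages decay rather than accumulate, the effective step size does not shrink towards zero merely because many examples have been seen, which is the property the abstract contrasts with FTRL-Proximal.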

A Framework to Understand Attitudes towards Immigration through Twitter

Applied Sciences ◽

10.3390/app11209689 ◽

2021 ◽

Vol 11 (20) ◽

pp. 9689

Author(s):

Yerka Freire-Vidal ◽

Eduardo Graells-Garrido ◽

Francisco Rowe

Keyword(s):

Civil Rights ◽

Language Processing ◽

Negative Attitude ◽

Online Discussions ◽

Negative Attitudes ◽

Policy Makers ◽

Positive Attitudes ◽

Policy Communication ◽

Temporal Granularity

Understanding public opinion towards immigrants is key to preventing acts of violence, discrimination and abuse. Traditional data sources, such as surveys, provide rich insights into the formation of such attitudes; yet, they are costly and offer limited temporal granularity, providing only a partial understanding of the dynamics of attitudes towards immigrants. Leveraging Twitter data and natural language processing, we propose a framework to measure attitudes towards immigration in online discussions. Grounded in theories of social psychology, the proposed framework enables the classification of users into profile stances of positive and negative attitudes towards immigrants and the characterisation of these profiles by quantitatively summarising users’ content and temporal stance trends. We use a Twitter sample composed of 36 K users and 160 K tweets discussing the topic in 2017, when the immigrant population in the country recorded an increase by a factor of four from 2010. We found that the negative attitude group of users is smaller than the positive group, and that both attitudes have different distributions of the volume of content. Both types of attitudes show fluctuations over time that seem to be influenced by news events related to immigration. Accounts with negative attitudes use arguments of labour competition and stricter regulation of immigration. In contrast, accounts with positive attitudes reflect arguments in support of immigrants’ human and civil rights. The framework and its application can inform policy makers about how people feel about immigration, with possible implications for policy communication and the design of interventions to improve negative attitudes.

A Classification Model of Legal Consulting Questions Based on Multi-Attention Prototypical Networks

International Journal of Computational Intelligence Systems ◽

10.1007/s44196-021-00053-6 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Jianzhou Feng ◽

Jinman Cui ◽

Qikai Wei ◽

Zhengji Zhou ◽

Yuxiong Wang

Keyword(s):

Supervised Learning ◽

Language Processing ◽

Text Classification ◽

Question Answering ◽

Training Data ◽

Classification Model ◽

Great Progress ◽

Public Datasets ◽

The Cost

Text classification is a research hotspot in the field of natural language processing. Existing text classification models based on supervised learning, especially deep learning models, have made great progress on public datasets. But most of these methods rely on a large amount of training data, and the coverage of these datasets is limited. In a legal intelligent question-answering system, accurate classification of legal consulting questions is a necessary prerequisite for intelligent question answering. However, annotated data are scarce and labeling is costly, which leads to poor performance of traditional supervised learning methods under sparse labeling. In response to the above problems, we construct a few-shot legal consulting questions dataset and propose a prototypical networks model based on multi-attention. For instances of the same category, this model first highlights the key features in the instances as much as possible through instance-dimension level attention. Then it classifies legal consulting questions with prototypical networks. Experimental results show that our model achieves state-of-the-art results compared with baseline models. The code and dataset are released on https://github.com/cjm0824/MAPN.
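
The classification step of a plain prototypical network can be sketched as follows: each class prototype is the mean of that class's support embeddings, and a query is assigned by a softmax over negative squared Euclidean distances. The sketch omits the paper's multi-attention component and assumes the embeddings are already computed; the function names and the toy labels are hypothetical and are not taken from the released MAPN code.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support_set):
    """Map each label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support_set.items()}


def classify(query, prototypes):
    """Return (predicted_label, probabilities) from softmax(-squared distance)."""
    labels = list(prototypes)
    logits = [-squared_distance(query, prototypes[label]) for label in labels]
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    probs = {label: e / total for label, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


if __name__ == "__main__":
    # Toy two-dimensional embeddings with two support examples per class.
    support = {
        "contract": [[1.0, 0.0], [0.8, 0.2]],
        "family": [[0.0, 1.0], [0.2, 0.8]],
    }
    prototypes = build_prototypes(support)
    print(classify([0.7, 0.3], prototypes))
```

Since a prototype needs only a handful of support examples per class, this formulation suits the few-shot setting described in the abstract, where labeled legal questions are scarce.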
