Deep Learning for emotion analysis in Arabic tweets

Abstract: Expressing emotions through text and emojis has become widespread on social media such as Facebook, Instagram, Twitter, Weibo, and LinkedIn. Both organizations and individuals are interested in using social media to analyze people's opinions and extract sentiments and emotions. We propose a model for multilabel emotion classification using a bidirectional long short-term memory (BiLSTM) deep network. It is evaluated on the Arabic tweets dataset provided by SemEval 2018 for the E-c task. Several preprocessing steps are applied, including ARLSTEM with some modifications, replacement of emojis with their text meaning from a manually built lexicon, and feature vector representation using AraVec word embeddings. The novelty of our research is that it examines the effect of hyperparameter tuning on model performance and uses BiLSTM in all of its deep neural network layers. The proposed model achieves performance comparable with state-of-the-art models built with different machine learning and deep learning techniques. The system achieves an improvement of about 9% in validation accuracy over the previous best model for the same task, which used a Support Vector Classifier (SVC), and it outperforms the other deep neural network (UNCCTeam), which is based on fully connected layers, by about 4.4% in micro F1.
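
The micro F1 metric mentioned above pools every (tweet, label) decision before computing precision and recall. The following is a minimal sketch of that computation for multilabel output, not the authors' evaluation code; the function name `micro_f1` and the toy label sets are assumptions made for illustration and are not taken from the SemEval data.

```python
from typing import Sequence, Set


def micro_f1(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F1: counts are pooled over all examples and labels."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # labels predicted and correct
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # labels predicted but wrong
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # gold labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy, made-up emotion labels (hypothetical; a tweet may carry several or none).
gold = [{"joy", "love"}, {"anger"}, set()]
pred = [{"joy"}, {"anger", "fear"}, set()]
score = micro_f1(gold, pred)
```

Because the counts are pooled, frequent emotions weigh more heavily in the score than rare ones, which is one reason micro and macro F1 can differ on imbalanced label sets.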

Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches

Computers ◽

10.3390/computers8010004 ◽

2019 ◽

Vol 8 (1) ◽

pp. 4 ◽

Cited By ~ 4

Author(s):

Jurgita Kapočiūtė-Dzikienė ◽

Robertas Damaševičius ◽

Marcin Woźniak

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Support Vector ◽

Learning Approaches ◽

Full Dataset ◽

Learning Techniques ◽

Long Short Term Memory

We describe sentiment analysis experiments performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial, NBM, and Support Vector Machine, SVM) and deep learning (Long Short-Term Memory, LSTM, and Convolutional Neural Network, CNN) approaches. The traditional machine learning techniques were used with features based on lexical, morphological, and character information. The deep learning approaches were applied on top of two types of word embeddings (Word2Vec continuous bag-of-words with negative sampling, and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full dataset versions. The best deep learning results (an accuracy of 0.706) were achieved on the full dataset with CNN applied on top of the FastText embeddings, with emoticons replaced and diacritics eliminated. The traditional machine learning approaches demonstrated the best performance (an accuracy of 0.735) on the full dataset with the NBM method, with emoticons replaced, diacritics restored, and lemma unigrams as features. Although the traditional machine learning approaches were superior to the deep learning methods, deep learning demonstrated good results when applied to the small datasets.
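
The two preprocessing steps named above, replacing emoticons and eliminating diacritics, can be sketched with the Python standard library. This is only an illustration under stated assumptions: the `EMOTICON_WORDS` lexicon and both function names are hypothetical, and the abstract does not say which replacement tokens or which tooling the authors used.

```python
import unicodedata

# Hypothetical emoticon lexicon, for illustration only.
EMOTICON_WORDS = {":)": "smile", ":(": "sad"}


def replace_emoticons(text: str) -> str:
    """Swap each known emoticon for a word token and normalize whitespace."""
    for emoticon, word in EMOTICON_WORDS.items():
        text = text.replace(emoticon, f" {word} ")
    return " ".join(text.split())


def strip_diacritics(text: str) -> str:
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


cleaned = strip_diacritics(replace_emoticons("Ačiū :)"))
```

Stripping diacritics maps spellings with and without the marks to one form, which shrinks the vocabulary at the cost of merging words that differ only by a diacritic.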

Comparative Analysis of Deep Learning Techniques for the Classification of Hate Speech

NIGERIAN ANNALS OF PURE AND APPLIED SCIENCES ◽

10.46912/napas.227 ◽

2021 ◽

Vol 4 (1) ◽

pp. 121-128

Author(s):

A Iorliam ◽

S Agber ◽

MP Dzungwe ◽

DK Kwaghtyo ◽

S Bum

Keyword(s):

Neural Network ◽

Social Media ◽

Deep Learning ◽

Hate Speech ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Learning Techniques ◽

Or Groups ◽

Long Short Term Memory

Social media provides opportunities for individuals to communicate anonymously and express hateful feelings and opinions from the comfort of their rooms. This anonymity has become a shield for many individuals or groups who use social media to express deep hatred for other individuals or groups, tribes or races, religions, genders, and belief systems. In this study, a comparative analysis is performed using Long Short-Term Memory and Convolutional Neural Network deep learning techniques for hate speech classification. The analysis demonstrates that the Long Short-Term Memory classifier achieved an accuracy of 92.47%, while the Convolutional Neural Network classifier achieved an accuracy of 92.74%. These results show that deep learning techniques can effectively distinguish hate speech from normal speech.

Airline Delay Prediction using Machine Learning and Deep Learning Techniques

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b4047.079220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 1049-1054

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Random Forest ◽

Short Term Memory ◽

Research Work ◽

Machine Learning Algorithms ◽

Support Vector ◽

Maximum Delay ◽

Learning Techniques ◽

Delay Prediction

In this paper, we predict flight delays using different machine learning and deep learning techniques. Such a model makes it easier to predict whether a flight will be delayed. Factors such as ‘WeatherDelay’, ‘NASDelay’, ‘Destination’, and ‘Origin’ play a vital role in the model. For machine learning algorithms including Random Forest, Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), the F1-score, precision, recall, support, and accuracy are reported. In addition, a Long Short-Term Memory (LSTM) RNN architecture is employed. The dataset used is the Bureau of Transportation Statistics (BTS) data for ‘Pittsburgh’. The results obtained from these algorithms are compared. Further, the results are visualized for various airlines to find the maximum delay, and the AUC-ROC curve is plotted for the Random Forest algorithm. The aim of our research is to predict delays so as to minimize losses and increase customer satisfaction.

Deep learning for emotion analysis in Arabic tweets

Journal Of Big Data ◽

10.1186/s40537-021-00523-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Enas A. Hakim Khalil ◽

Enas M. F. El Houby ◽

Hoda Korashy Mohamed

Keyword(s):

Social Media ◽

Deep Learning ◽

Short Term Memory ◽

Classification Problem ◽

Arabic Language ◽

Media Analysis ◽

Support Vector ◽

Human Beings ◽

Emotion Analysis ◽

Filter Noise

Abstract: Currently, expressing feelings through social media requires great consideration as an essential part of our lives; besides sharing ideas and thoughts, we share moments and good memories. Social media such as Facebook, Twitter, Weibo, and LinkedIn are considered rich sources of opinionated text data. Both organizations and individuals are interested in using social media to analyze people's opinions and extract sentiments and emotions. Most studies on social media analysis have classified sentiment into positive, negative, or neutral classes. The challenge in emotion analysis arises because humans can express one or several emotions within one expression. Human beings can recognize these different emotions well; however, this is still not easy for an emotion analysis system. In most cases, the Arabic used on social media is slangy or colloquial, which makes it more challenging to preprocess and to filter noise, since most lemmatization and stemming tools are built for Modern Standard Arabic (MSA). An emotion analysis model has been implemented to categorize emotions. The task is a multiclass and multilabel classification problem. However, few studies have addressed this emotion classification problem in Arabic social media. Nearly the only work is that of SemEval 2018 Task 1, sub-task E-c. Several machine learning approaches have been implemented for this task; only a few studies were based on deep learning. Our model implements a novel multilayer bidirectional long short-term memory (BiLSTM) network trained on top of pre-trained word embedding vectors. This approach has been compared with other models developed for the same task using Support Vector Machines (SVM), random forest (RF), and fully connected neural networks. The proposed model achieved a performance improvement over the best results obtained for this task.

Applying Machine Learning to Identify Anti-Vaccination Tweets during the COVID-19 Pandemic

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18084069 ◽

2021 ◽

Vol 18 (8) ◽

pp. 4069

Author(s):

Quyen G. To ◽

Kien G. To ◽

Van-Anh N. Huynh ◽

Nhung TQ Nguyen ◽

Diep TN Ngo ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Language Processing ◽

Short Term Memory ◽

Model Performance ◽

Vaccine Hesitancy ◽

Support Vector ◽

Short Term ◽

Long Short Term Memory ◽

Use Of Social Media

Anti-vaccination attitudes have been an issue since the development of the first vaccines. The increasing use of social media as a source of health information may contribute to vaccine hesitancy because anti-vaccination content is widely available on social media, including Twitter. Being able to identify anti-vaccination tweets could provide useful information for formulating strategies to reduce anti-vaccination sentiments among different groups. This study aims to evaluate the performance of different natural language processing models in identifying anti-vaccination tweets published during the COVID-19 pandemic. We compared the performance of bidirectional encoder representations from transformers (BERT) and bidirectional long short-term memory networks with pre-trained GloVe embeddings (Bi-LSTM) with classic machine learning methods including support vector machine (SVM) and naïve Bayes (NB). The results show that the performance of the BERT model on the test set was: accuracy = 91.6%, precision = 93.4%, recall = 97.6%, F1 score = 95.5%, and AUC = 84.7%. The Bi-LSTM model showed: accuracy = 89.8%, precision = 44.0%, recall = 47.2%, F1 score = 45.5%, and AUC = 85.8%. SVM with a linear kernel performed at: accuracy = 92.3%, precision = 19.5%, recall = 78.6%, F1 score = 31.2%, and AUC = 85.6%. Complement NB demonstrated: accuracy = 88.8%, precision = 23.0%, recall = 32.8%, F1 score = 27.1%, and AUC = 62.7%. In conclusion, the BERT model outperformed the Bi-LSTM, SVM, and NB models in this task. Moreover, the BERT model achieved excellent performance and can be used to identify anti-vaccination tweets in future studies.
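
The F1 scores quoted above follow from the quoted precision and recall through the harmonic mean, F1 = 2PR / (P + R). The short check below recomputes them from the rounded percentages in the abstract. The function name `f1_from_pr` is an assumption made for illustration, and small differences from the reported values are to be expected because the inputs are already rounded.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F1 %) as quoted in the abstract.
reported = {
    "BERT": (93.4, 97.6, 95.5),
    "Bi-LSTM": (44.0, 47.2, 45.5),
    "SVM (linear)": (19.5, 78.6, 31.2),
    "Complement NB": (23.0, 32.8, 27.1),
}

recomputed = {name: f1_from_pr(p, r) for name, (p, r, _) in reported.items()}

for name, (_, _, f1_reported) in reported.items():
    print(f"{name}: recomputed {recomputed[name]:.2f}%, reported {f1_reported}%")
```

The check also illustrates why accuracy alone can mislead here: the linear SVM has the highest quoted accuracy but a low F1, because its precision is low.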

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽

10.32732/jmo.2020.12.2.84 ◽

2020 ◽

Vol 12 (2) ◽

pp. 84-99

Author(s):

Li-Pang Chen

Keyword(s):

Machine Learning ◽

Stock Price ◽

Short Term Memory ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Short Term ◽

Learning Techniques ◽

Historical Database ◽

Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict the future stock price, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.

Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Military Medicine ◽

10.1093/milmed/usaa418 ◽

2021 ◽

Vol 186 (Supplement_1) ◽

pp. 445-451

Author(s):

Yifei Sun ◽

Navid Rashedi ◽

Vikrant Vaze ◽

Parikshit Shah ◽

Ryan Halter ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Real World ◽

Short Term Memory ◽

Model Performance ◽

Learning Technologies ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Continuous Map

Abstract: Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performance. Materials and Methods: Five classification methods, namely K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory, are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply linear regression to predict the continuous MAP values over the next 60 minutes. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly, with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP 60 minutes in the future with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg. After converting continuous MAP predictions into binary AHE predictions, we achieve 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE 30 minutes in advance with precision and recall above 80% on the large real-world dataset. The predictions of the regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
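
The two evaluation steps described above, scoring continuous MAP predictions by root mean square error and then converting them into binary AHE predictions, can be sketched as follows. This is a simplified illustration, not the study's procedure: the 65 mmHg cutoff `AHE_MAP_THRESHOLD_MMHG`, the function names, and the toy values are all assumptions, since the abstract does not give its AHE definition.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical cutoff for illustration; NOT taken from the abstract.
AHE_MAP_THRESHOLD_MMHG = 65.0


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root mean square error between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def to_ahe_flags(map_values: Sequence[float],
                 threshold: float = AHE_MAP_THRESHOLD_MMHG) -> List[bool]:
    """Flag an episode wherever the MAP value falls below the threshold."""
    return [value < threshold for value in map_values]


def precision_recall(pred: Sequence[bool], true: Sequence[bool]) -> Tuple[float, float]:
    """Precision and recall of binary flags against reference flags."""
    tp = sum(p and t for p, t in zip(pred, true))
    fp = sum(p and not t for p, t in zip(pred, true))
    fn = sum(t and not p for p, t in zip(pred, true))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy, made-up MAP values in mmHg.
predicted_map = [70.0, 60.0, 62.0, 80.0]
observed_map = [72.0, 58.0, 68.0, 79.0]

error = rmse(predicted_map, observed_map)
precision, recall = precision_recall(to_ahe_flags(predicted_map),
                                     to_ahe_flags(observed_map))
```

In the toy data the third prediction falls below the cutoff while the observation does not, so recall stays at 1.0 while precision drops to 0.5, the same direction of trade-off as the 91% recall and 68% precision quoted above.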

Detection of Pavement Maintenance Treatments using Deep-Learning Network

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/03611981211007846 ◽

2021 ◽

pp. 036119812110078

Author(s):

Lu Gao ◽

Yao Yu ◽

Yi Hao Ren ◽

Pan Lu

Keyword(s):

Deep Learning ◽

Measurement Errors ◽

Short Term Memory ◽

Learning Networks ◽

Pavement Maintenance ◽

Time Period ◽

Learning Techniques ◽

Maintenance And Rehabilitation ◽

History Of ◽

Deep Learning Network

Pavement maintenance and rehabilitation (M&R) records are important because they document that M&R treatment is being performed and completed appropriately. Moreover, the development of pavement performance models relies heavily on the quality of the condition data collected and on the M&R records. However, the history of pavement M&R activities is often missing or unavailable to highway agencies for many reasons. Without accurate M&R records, it is difficult to determine whether a condition change between two consecutive inspections is the result of M&R intervention, deterioration, or measurement errors. In this paper, we employed three deep-learning networks, a convolutional neural network (CNN) model, a long short-term memory (LSTM) model, and a CNN-LSTM combination model, to automatically detect whether an M&R treatment was applied to a pavement section during a given time period. Unlike the conventional analysis methods used so far, deep-learning techniques do not require any feature extraction. The maximum accuracy obtained on the test data is 87.5%, using CNN-LSTM.

PlncRNA-HDeep: plant long noncoding RNA prediction using hybrid deep learning based on two encoding styles

BMC Bioinformatics ◽

10.1186/s12859-020-03870-2 ◽

2021 ◽

Vol 22 (S3) ◽

Author(s):

Jun Meng ◽

Qiang Kang ◽

Zheng Chang ◽

Yushi Luan

Keyword(s):

Deep Learning ◽

Noncoding Rna ◽

Nearest Neighbor ◽

Short Term Memory ◽

Biological Activities ◽

Support Vector ◽

Multiple Perspectives ◽

K Nearest Neighbor ◽

Rna Sequences ◽

Deep Learning Model

Abstract: Background: Long noncoding RNAs (lncRNAs) play an important role in regulating biological activities, and their prediction is significant for exploring biological processes. Long short-term memory (LSTM) and convolutional neural network (CNN) models can automatically extract and learn abstract information from encoded RNA sequences, avoiding complex feature engineering. An ensemble model learns information from multiple perspectives and shows better performance than a single model. It is feasible and interesting to treat the RNA sequence as a sentence and as an image to train the LSTM and the CNN respectively, and then to hybridize the trained models to predict lncRNAs. To date, there are various predictors for lncRNAs, but few of them are proposed for plants. A reliable and powerful predictor for plant lncRNAs is necessary. Results: To boost the performance of lncRNA prediction, this paper proposes a hybrid deep learning model based on two encoding styles (PlncRNA-HDeep), which does not require prior knowledge and uses only RNA sequences to train the models for predicting plant lncRNAs. It not only learns diversified information from RNA sequences encoded by p-nucleotide and one-hot encodings, but also takes advantage of lncRNA-LSTM, proposed in our previous study, and CNN. The parameters are adjusted and three hybrid strategies are tested to maximize its performance. Experimental results show that PlncRNA-HDeep is more effective than lncRNA-LSTM and CNN and obtains 97.9% sensitivity, 95.1% precision, 96.5% accuracy, and 96.5% F1 score on the Zea mays dataset, which are better than those of several shallow machine learning methods (support vector machine, random forest, k-nearest neighbor, decision tree, naive Bayes, and logistic regression) and some existing tools (CNCI, PLEK, CPC2, LncADeep, and lncRNAnet). Conclusions: PlncRNA-HDeep is feasible and obtains credible predictive results. It may also provide valuable references for other related research.
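
The two encoding styles named above can be illustrated on a short sequence. The sketch below is not the PlncRNA-HDeep implementation. One-hot encoding over the four bases is standard, but treating the p-nucleotide encoding as overlapping windows of length p is an assumption made here for illustration, as are the function names and the choice to map T to U.

```python
from typing import List

NUCLEOTIDES = "ACGU"


def normalize(sequence: str) -> str:
    """Uppercase the sequence and map DNA-style T to RNA-style U."""
    return sequence.upper().replace("T", "U")


def one_hot(sequence: str) -> List[List[int]]:
    """One row per base, one column per nucleotide in NUCLEOTIDES order."""
    rows = []
    for base in normalize(sequence):
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected base: {base!r}")
        rows.append([int(base == n) for n in NUCLEOTIDES])
    return rows


def p_nucleotide_tokens(sequence: str, p: int = 3) -> List[str]:
    """Overlapping windows of length p (an assumed reading of 'p-nucleotide')."""
    seq = normalize(sequence)
    return [seq[i:i + p] for i in range(len(seq) - p + 1)]


matrix = one_hot("AUG")                    # image-like input, e.g. for a CNN
tokens = p_nucleotide_tokens("AUGGC", 3)   # sentence-like input, e.g. for an LSTM
```

The token list reads like a sentence of short "words", which matches the abstract's description of treating the sequence as a sentence for the LSTM, while the one-hot matrix gives the grid-shaped input that a CNN expects.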

Deep Learning Methods for Classification of Certain Abnormalities in Echocardiography

Electronics ◽

10.3390/electronics10040495 ◽

2021 ◽

Vol 10 (4) ◽

pp. 495

Author(s):

Imayanmosha Wahlang ◽

Arnab Kumar Maji ◽

Goutam Saha ◽

Prasun Chakrabarti ◽

Michal Jasinski ◽

...

Keyword(s):

Deep Learning ◽

Short Term Memory ◽

Support Vector ◽

Variational Autoencoder ◽

Different Types ◽

Static Images ◽

Long Short Term Memory ◽

2D And 3D ◽

Better Than

This article experiments with deep learning methodologies in echocardiogram (echo), a promising and vigorously researched technique in the preponderance field. The paper involves two different kinds of classification in the echo. Firstly, classification into normal (absence of abnormalities) or abnormal (presence of abnormalities) is performed using 2D echo images, 3D Doppler images, and videographic images. Secondly, different types of regurgitation, namely Mitral Regurgitation (MR), Aortic Regurgitation (AR), Tricuspid Regurgitation (TR), and a combination of the three, are classified using videographic echo images. Two deep-learning methodologies are used for these purposes: a Recurrent Neural Network (RNN) based methodology (Long Short Term Memory (LSTM)) and an autoencoder based methodology (Variational AutoEncoder (VAE)). The use of videographic images distinguishes this work from existing work using SVM (Support Vector Machine), and the application of deep-learning methodologies is the first of many in this particular field. It was found that deep-learning methodologies perform better than the SVM methodology in normal or abnormal classification. Overall, VAE performs better on 2D and 3D Doppler images (static images), while LSTM performs better on videographic images.
