Arabic dialect sentiment analysis with ZERO effort. Case study: Algerian dialect

2020 ◽  
Vol 23 (65) ◽  
pp. 124-135
Author(s):  
Imane Guellil ◽  
Marcelo Mendoza ◽  
Faical Azouaou

This paper presents an analytic study showing that it is entirely possible to analyze the sentiment of an Arabic dialect without constructing any resources. The idea is to use the resources dedicated to a given dialect X to analyze the sentiment of another dialect Y. The only condition is that X and Y belong to the same category of dialects. We apply this idea to the Algerian dialect, a Maghrebi Arabic dialect that suffers from a lack of the tools and other resources required for automatic sentiment analysis. For this analysis, we rely on Maghrebi dialect resources and two manually annotated sentiment corpora, one for the Tunisian dialect and one for the Moroccan dialect. We also use a large corpus for the Maghrebi dialect. We use a state-of-the-art system and propose a new deep learning architecture to automatically classify the sentiment of an Arabic dialect (the Algerian dialect). Experimental results show an F1-score of up to 83%, achieved by a Multilayer Perceptron (MLP) with the Tunisian corpus and by Long Short-Term Memory (LSTM) with the combination of the Tunisian and Moroccan corpora. An improvement of 15% over the closest competitor was observed. Ongoing work aims to manually construct an annotated sentiment corpus for the Algerian dialect and compare the results.

2021 ◽  
pp. 016555152110065
Author(s):  
Rahma Alahmary ◽  
Hmood Al-Dossari

Sentiment analysis (SA) aims to extract users’ opinions automatically from their posts and comments. Almost all prior works have used machine learning algorithms. Recently, SA research has shown promising performance using the deep learning approach. However, deep learning is data-hungry and requires large datasets to learn, so data annotation takes more time. In this research, we propose a semiautomatic approach using Naïve Bayes (NB) to annotate a new dataset in order to reduce the human effort and time spent on the annotation process. We created a dataset for training and testing the classifier by collecting Saudi dialect tweets. The accuracy achieved by the NB classifier was 83%. The trained semiautomatic model was used to annotate the new dataset, which was then used to train and test deep learning classifiers for Saudi dialect SA. The three deep learning classifiers tested in this research were convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM). Support vector machine (SVM) was used as the baseline for comparison. Overall, the performance of the deep learning classifiers exceeded that of SVM. CNN reported the highest performance, Bi-LSTM outperformed both LSTM and SVM, and LSTM in turn outperformed SVM. The proposed semiautomatic annotation approach is usable and promising for increasing speed and saving time and effort in the annotation process.
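The semiautomatic annotation idea described in this abstract can be illustrated with a minimal sketch. It assumes a small hand-labelled seed set, a word-count Naïve Bayes model with Laplace smoothing, and a confidence threshold below which items go to a human reviewer. The threshold, the review queue, the class and function names, and the English placeholder tokens are all assumptions made for illustration; the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict_proba(self, doc):
        total_docs = sum(self.class_counts.values())
        log_scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in doc.split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + self.alpha) / denom)
            log_scores[label] = score
        # Normalise the log scores into probabilities (log-sum-exp trick).
        top = max(log_scores.values())
        exp_scores = {lab: math.exp(s - top) for lab, s in log_scores.items()}
        norm = sum(exp_scores.values())
        return {lab: v / norm for lab, v in exp_scores.items()}


def semi_annotate(model, unlabeled, threshold=0.8):
    """Auto-label confident items; queue the rest for manual review.

    The threshold and the review queue are assumptions of this sketch.
    """
    auto, review = [], []
    for doc in unlabeled:
        probs = model.predict_proba(doc)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            auto.append((doc, label))
        else:
            review.append(doc)
    return auto, review


# Hypothetical seed data standing in for manually labelled tweets.
seed_docs = ["good great service", "great happy love",
             "bad awful service", "awful sad hate"]
seed_labels = ["pos", "pos", "neg", "neg"]
model = NaiveBayes().fit(seed_docs, seed_labels)

auto, review = semi_annotate(model, ["great love", "awful hate", "service"])
print(auto)
print(review)
```

In this toy run the two clearly polarised items are labelled automatically, while the ambiguous one is left for a human. The automatically labelled set would then serve as training data for the downstream classifiers.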


Author(s):  
Xingjian Lai ◽  
Huanyi Shui ◽  
Jun Ni

Throughput bottlenecks define and constrain the productivity of a production line. Predicting future bottlenecks supports decision-making on the factory floor, helping to foresee and formulate appropriate actions before production to improve the system throughput in a cost-effective manner. Bottleneck prediction remains a challenging task in the literature. The difficulty lies in the complex dynamics of manufacturing systems. Multiple factors jointly affect bottleneck conditions, such as machine performance, machine degradation, line structure, operator skill level, and product release schedules. These factors affect one another nonlinearly and exhibit long-term temporal dependencies. State-of-the-art research relies on various assumptions to simplify the modeling by reducing the input dimensionality. As a result, those models cannot accurately reflect the complex dynamics of the bottleneck in a manufacturing system. To tackle this problem, this paper proposes a systematic framework to design a two-layer Long Short-Term Memory (LSTM) network tailored to the dynamic bottleneck prediction problem in multi-job manufacturing systems. This neural-network-based approach uses historical high-dimensional factory floor data to predict system bottlenecks dynamically, taking future production planning inputs into account. The model is demonstrated with data from an automotive underbody assembly line. The results show that the proposed method achieves higher prediction accuracy than current state-of-the-art approaches.


2020 ◽  
Vol 11 (2) ◽  
pp. 131
Author(s):  
Josua Manullang ◽  
Albertus Joko Santoso ◽  
Andi Wahju Rahardjo Emanuel

Abstract. Tourist visits to Mount Merbabu National Park (TNGMb) need to be predicted in order to control the number of visitors and preserve the national park. The combination of time series forecasting (TSF) and deep learning methods has become a new alternative for prediction. This case study implements several combinations of TSF methods and Long Short-Term Memory (LSTM) to predict the visits. It covers 18 modelling scenarios to determine the best model, using tourist visit data from 2013 to 2018. The results show that applying the lag time method improves the model's ability to capture patterns in time series data. The error is measured using the root mean square error (RMSE), with the smallest value of 3.7 obtained by the LSTM architecture using seven lags as features and one lag as the label.
Keywords: Tourist Visit, Taman Nasional Gunung Merbabu, Prediction, Recurrent Neural Network, Long-Short Term Memory
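As a rough illustration of the lag-based setup mentioned in this abstract (seven lags as features, one value as the label) and of the RMSE metric, the sketch below builds sliding windows from a series and scores a naive "repeat the last value" forecast. The visit counts, the function names, and the persistence baseline are assumptions for illustration only; the baseline stands in for the LSTM, which is not reproduced here.

```python
import math


def make_lag_windows(series, n_lags=7):
    """Turn a 1-D series into (features, label) pairs.

    Each feature vector holds n_lags consecutive past values and the
    label is the value that immediately follows them.
    """
    features, labels = [], []
    for i in range(n_lags, len(series)):
        features.append(series[i - n_lags:i])
        labels.append(series[i])
    return features, labels


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical visit counts per time step (not the TNGMb data).
visits = [12, 15, 11, 14, 18, 25, 30, 13, 16, 12, 15, 19, 27, 31]

X, y = make_lag_windows(visits, n_lags=7)

# Persistence baseline (an assumption of this sketch): predict that the
# next value equals the most recent lag in the window.
baseline_pred = [window[-1] for window in X]
print(len(X), X[0], y[0])
print(rmse(y, baseline_pred))
```

A trained model would replace `baseline_pred` with its own predictions on the same windows, so that its RMSE can be compared against this simple reference.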


Author(s):  
Anindita Satria Surya ◽  
Musa Partahi Marbun ◽  
K.G.H. Mangunkusumo ◽  
Muhammad Ridwan

Symmetry ◽  
2019 ◽  
Vol 11 (10) ◽  
pp. 1290 ◽  
Author(s):  
Rahman ◽  
Siddiqui

Abstractive text summarization, which generates a summary by paraphrasing a long text, remains a significant open problem in natural language processing. In this paper, we present an abstractive text summarization model, multi-layered attentional peephole convolutional LSTM (long short-term memory) (MAPCoL), that automatically generates a summary from a long text. We optimize the parameters of MAPCoL using central composite design (CCD) in combination with response surface methodology (RSM), which gives the highest accuracy in terms of summary generation. We record the accuracy of MAPCoL on the CNN/DailyMail dataset. We perform a comparative analysis of the accuracy of MAPCoL against that of state-of-the-art models in different experimental settings. MAPCoL also outperforms traditional LSTM-based models with respect to the semantic coherence of the output summary.

