Multi-aspect sentiment analysis on Netflix application using Latent Dirichlet Allocation and Support Vector Machine methods

2021 ◽  
Vol 13 (3) ◽  
pp. 128-133
Author(s):  
Attala Rafid Abelard ◽  
Yuliant Sibaroni

Among the many film streaming platforms that have sprung up, Netflix has the most subscribers. However, not all reviews written by Netflix users are favorable. This study analyzes reviews written on the Google Play Store, using the Latent Dirichlet Allocation (LDA) method to determine which aspects users discuss. Classification with the Support Vector Machine (SVM) method then determines whether each review belongs to the positive or negative class (sentiment analysis). Two scenarios were carried out. The first found that the best number of LDA topics is 40; the second found that applying a filtering process in the preprocessing stage reduces the F1-score. The best performance in LDA and SVM testing was thus obtained with 40 topics and without the filtering process, with a score of 78.15%.

Author(s):  
Jalel Akaichi

In this work, we focus on applying text mining and sentiment analysis techniques to analyze Tunisian users' status updates on Facebook. We aim to extract useful information about their sentiment and behavior, especially during the “Arab Spring” era. To this end, we describe a method for sentiment analysis that uses Support Vector Machine and Naïve Bayes algorithms and applies a combination of more than two features. The output of this work consists, on the one hand, of a sentiment lexicon built from the emoticon and acronym lexicons that we developed from the extracted status updates and, on the other hand, of detailed comparative experiments between the above algorithms, carried out by creating a training model for sentiment classification.


Author(s):  
Lutfi Budi Ilmawan ◽  
Edi Winarko

Google's application store, Google Play, currently provides approximately 1,200,000 mobile applications. This number gives users many options, while application developers have difficulty figuring out how to improve their applications' performance. These problems call for a sentiment analysis application that can process review comments to extract valuable information. The purpose of the system built here is to determine the sentiment polarity of textual application reviews on Google Play, in a way that can be performed from mobile devices. Mobile devices are highly portable, but some of them have limited resources. This is addressed with a client-server system architecture, in which the server performs the heavy tasks (training and classification) while the client is a mobile device that performs only light tasks. With this solution, sentiment analysis can be applied in a mobile environment. The classification method used in the developed application is Naïve Bayes, with a linear Support Vector Machine used for comparison. The Naïve Bayes classifier's accuracy is 83.87%, lower than the 89.49% accuracy of the linear SVM classifier. The use of semantic handling to deal with word synonyms can reduce the accuracy of the classifier. Keywords: sentiment analysis, Google Play, classification, Naïve Bayes, support vector machine


Author(s):  
Daniel Febrian Sengkey ◽  
Agustinus Jacobus ◽  
Fabian Johanes Manoppo

Support vector machine (SVM) is a well-known method for supervised learning in sentiment analysis, and there are many studies on the use of SVM to classify sentiment in lecturer evaluations. SVM has various parameters that can be tuned and kernels that can be chosen to improve classifier accuracy. However, not all options have been explored. Therefore, in this study we compared four SVM kernels (radial, linear, polynomial, and sigmoid) to discover how each kernel influences the accuracy of the classifier. To make a proper assessment, we used our labeled dataset of students' evaluations of their lecturers. The dataset was split into one part for training the classifier and another for testing the model. In addition, we used several different training:testing ratios, with the training proportion ranging from 0.5 to 0.95 in increments of 0.05. Because the dataset was split randomly, the splitting-training-testing process was repeated 1,000 times for each kernel and splitting ratio. At the end of the experiment, we therefore had 40,000 accuracy values. We then applied statistical methods to see whether the differences are significant. Based on the statistical test, we found that in this particular case the linear kernel has significantly higher accuracy than the other kernels. However, there is a tradeoff: the results become more varied as a higher proportion of the data is used for training.
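The figure of 40,000 accuracy values follows directly from the design described above: four kernels, ten training proportions (0.50 through 0.95 in steps of 0.05), and 1,000 random repetitions per combination. A minimal Python sketch of that bookkeeping is given here. The names `KERNELS`, `split_ratios`, and `experiment_grid` are illustrative assumptions, not taken from the paper, and no classifier is actually trained.

```python
from itertools import product

# The four SVM kernels compared in the study.
KERNELS = ["radial", "linear", "polynomial", "sigmoid"]
REPETITIONS = 1000  # random split/train/test repetitions per configuration


def split_ratios(start=50, stop=95, step=5):
    """Training proportions from 0.50 to 0.95 inclusive, in steps of 0.05.

    Integer percentages are used internally to avoid floating-point drift.
    """
    return [pct / 100 for pct in range(start, stop + 1, step)]


def experiment_grid():
    """Return an iterator over every (kernel, ratio, repetition) combination."""
    return product(KERNELS, split_ratios(), range(REPETITIONS))


ratios = split_ratios()
total_runs = sum(1 for _ in experiment_grid())
print(len(KERNELS), "kernels x", len(ratios), "ratios x",
      REPETITIONS, "repetitions =", total_runs)
```

Each element of the grid would correspond to one split, one trained model, and one recorded accuracy, which is why the count of runs equals the count of accuracy values.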


SINERGI ◽  
2020 ◽  
Vol 24 (2) ◽  
pp. 87
Author(s):  
Mona Cindo ◽  
Dian Palupi Rini ◽  
Ermatita Ermatita

With the advancement and growth of social media, a great deal of data is available for research in social mining. Twitter is one microblogging platform that can be used. Many companies use Twitter data to analyze their customers' satisfaction with product quality, while many users use social media to express their daily emotions. This can be developed into research that serves both to improve product quality and to analyze opinion on certain events; such research is often called sentiment analysis or opinion mining. Previous research has identified particularly useful features for sentiment analysis, but its performance is still lacking. Furthermore, it used Support Vector Machine as the classification method, whereas most researchers have found another classification method that is considered more efficient, such as Maximum Entropy. This research therefore used two datasets: general opinion data and airline opinion data. For feature extraction, we employ four feature types: pragmatic, lexical-grams, pos-grams, and sentiment lexical. For classification, we use both Support Vector Machine and Maximum Entropy to find the best result. In the end, the best result is achieved by Maximum Entropy, with 85.8% accuracy on the general opinion data and 92.6% accuracy on the airline opinion data.


2021 ◽  
Vol 8 (6) ◽  
pp. 1265
Author(s):  
Muhammad Alkaff ◽  
Andreyan Rizky Baskara ◽  
Irham Maulani

Lapor! is a service system for conveying the public's aspirations and complaints about Indonesian government services. The government has long used this system to respond to the Indonesian people's problems with bureaucracy. However, the increasing volume of reports, together with sorting carried out by operators who read every complaint that comes in through the system, frequently leads to errors in which operators forward reports to the wrong agencies. A solution is therefore needed that can automatically determine a report's context using Natural Language Processing techniques. This study aims to build automatic classification of reports, based on their topics, to the responsible agencies by combining Latent Dirichlet Allocation (LDA) and Support Vector Machine (SVM). Topic modeling for each report is carried out with LDA, which extracts patterns from the documents and outputs topic distribution values. The SVM then classifies each report's destination agency based on the topic values extracted by LDA. The LDA-SVM model's performance is measured using a confusion matrix, from which accuracy, precision, recall, and F1 score are calculated. Test results using a 70:30 train-test split show that the model performs well, with 79.85% accuracy, 79.98% precision, 72.37% recall, and a 74.67% F1 score.
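The evaluation step in this abstract, deriving accuracy, precision, recall, and F1 from a confusion matrix, can be illustrated with a short Python sketch. It assumes macro averaging over classes, which the abstract does not specify, and the 3-class matrix below is made-up illustrative data, not the paper's results. The function name `macro_metrics` is likewise hypothetical.

```python
def macro_metrics(confusion):
    """Compute accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical confusion matrix for three destination agencies (illustrative only).
example = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]
result = macro_metrics(example)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Under macro averaging, the F1 score is the mean of the per-class F1 values, so it generally differs from the harmonic mean of the averaged precision and averaged recall.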


2018 ◽  
Vol 61 (1) ◽  
pp. 64-76 ◽  
Author(s):  
Susan (Sixue) Jia

Fitness clubs have never ceased searching for quality improvement opportunities to better serve their exercisers, while exercisers have been posting online ratings and reviews of fitness clubs. Studied together, the quantitative ratings and qualitative reviews can provide a comprehensive depiction of exercisers' perception of fitness clubs. However, the typological and dimensional discrepancies between online ratings and reviews have hindered joint study of the two data sets to fully exploit their business value. This study bridges the gap by examining 53,979 pairs of exerciser online ratings and reviews from 100 fitness clubs in Shanghai, China. Using latent Dirichlet allocation (LDA) based text mining, we identified the 17 major topics on which the exercisers were writing. A support vector machine (SVM) classifier was then employed to establish the rating-review relations, with an accuracy rate of up to 86%. Finally, the relative impact of each topic on exerciser satisfaction was computed and compared by introducing virtual reviews. The significance of this study is that it systematically creates a standardized protocol for mining and correlating the massive structured/quantitative and unstructured/qualitative data available online, which is readily transferable to other service and product sectors.


2020 ◽  
Vol 1641 ◽  
pp. 012102
Author(s):  
Hermanto ◽  
Antonius Yadi Kuntoro ◽  
Taufik Asra ◽  
Eri Bayu Pratama ◽  
Lasman Effendi ◽  
...  

2021 ◽  
Vol 5 (4) ◽  
pp. 631-638
Author(s):  
Janu Akrama Wardhana ◽  
Yuliant Sibaroni

During the Covid-19 pandemic, almost all community activities are conducted from home, so video conferencing technology is needed for people to carry out their normal activities. One such video conferencing application is ZOOM Cloud Meetings. Applications are reviewed by their users, and these reviews serve as a reference for new users and let the application's company gauge its performance. However, the reviews are numerous and unstructured. A solution is therefore needed in the form of sentiment analysis, which organizes the application's reviews by classifying them as positive or negative. In this study, aspect-based sentiment analysis was conducted on ZOOM Cloud Meetings app reviews from the Google Play Store. Analysis of the review data yielded three aspects: usability, system, and appearance. Topic modeling used the Latent Dirichlet Allocation (LDA) method, and classification used the Support Vector Machine (SVM). With the best parameters, the research achieved an accuracy of 88.83% for the usability aspect, 91.2% for the system aspect, and 94.78% for the appearance aspect, and 91.61% across all aspects.

