Topic Modeling for Twitter Users Regarding the "Ruangguru" Application

Jurnal ILMU DASAR ◽

10.19184/jid.v21i2.17112 ◽

2020 ◽

Vol 21 (2) ◽

pp. 149

Author(s):

Bagus Wicaksono Arianto ◽

Gangga Anuraga

Keyword(s):

Topic Modeling ◽

Public Perception ◽

Latent Dirichlet Allocation ◽

The Public ◽

Allocation Method ◽

Twitter Account ◽

Twitter Users ◽

A Company ◽

Dirichlet Allocation ◽

Expansion Strategies

PT Ruang Raya Indonesia ("Ruangguru") is the largest and most comprehensive technology company in Indonesia focused on education-based services. In 2019 Ruangguru had 15 million users and 300,000 teachers, with a presence in 32 provinces of Indonesia. The company has prepared a number of expansion strategies to reach a valuation of more than US$1 billion within the next year or two. The purpose of this research is to classify Ruangguru users' opinions about the services provided, using the latent Dirichlet allocation method, so that the results can serve as evaluation material for improving those services. The data are tweets from Twitter users in Indonesia, collected through the Twitter API; the Twitter account studied is @ruangguru. The analysis shows that the public perception expressed by Twitter users, as modeled with latent Dirichlet allocation, forms 28 topics. Keywords: latent Dirichlet allocation, Ruangguru, Twitter.
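
As a rough illustration of what fitting latent Dirichlet allocation to tweets involves, the sketch below implements a small collapsed Gibbs sampler in plain Python. It is not the authors' code: the function names `fit_lda` and `top_words`, the hyperparameters `alpha` and `beta`, the toy tweets, and the choice of two topics are all assumptions made for the example, and the abstract does not say which LDA implementation or inference method was used.

```python
import random

def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling (illustrative)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    # Count tables: topic counts per document, word counts per topic, totals per topic.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # Smoothed topic-word distributions; each row sums to 1.
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + n_words * beta)
         for v in range(n_words)]
        for k in range(n_topics)
    ]
    return vocab, phi

def top_words(vocab, phi, n=3):
    """Return the n most probable words of each topic."""
    return [
        [vocab[v] for v in sorted(range(len(vocab)), key=lambda v: -row[v])[:n]]
        for row in phi
    ]

# Invented, already tokenized toy tweets (not data from the study).
tweets = [
    ["video", "lesson", "math", "clear"],
    ["math", "lesson", "teacher", "clear"],
    ["subscription", "price", "discount", "promo"],
    ["promo", "discount", "price", "expensive"],
]
vocab, phi = fit_lda(tweets, n_topics=2)
for k, words in enumerate(top_words(vocab, phi)):
    print(f"topic {k}: {words}")
```

In practice the number of topics (28 in the study) is usually chosen by comparing models, and real tweets need cleaning and tokenization before a step like this.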


Analysis and Visualization Latent Topic on COVID-19 Vaccine Tweet use two-stage topic modeling (Preprint)

10.2196/preprints.30290 ◽

2021 ◽

Author(s):

Faizah Faizah ◽

Bor-Shen Lin

Keyword(s):

Topic Modeling ◽

Public Perception ◽

Latent Dirichlet Allocation ◽

World Health ◽

Two Stage ◽

The Public ◽

Global Pandemic ◽

Difficult Time ◽

Latent Topic ◽

Latent Topics

BACKGROUND The World Health Organization (WHO) declared COVID-19 a global pandemic on January 30, 2020. However, the pandemic is not over yet; in the first quarter of 2021, some countries faced a third wave. During this difficult time, the development of COVID-19 vaccines accelerated rapidly. Understanding the public perception of COVID-19 vaccines from data collected on social media can widen the perspective on the state of the global pandemic. OBJECTIVE This study explores and analyzes the latent topics in COVID-19 vaccine tweets posted by individuals from various countries by using two-stage topic modeling. METHODS A two-stage topic modeling analysis was proposed to investigate people's reactions in five countries. The first stage is latent Dirichlet allocation (LDA), which produces latent topics with their corresponding term distributions and helps investigators understand the main issues or opinions. The second stage performs agglomerative clustering on the latent topics based on Hellinger distance, merging close topics hierarchically into topic clusters so that they can be visualized in either tree or graph views. RESULTS In general, the topics discussed regarding the COVID-19 vaccine are similar across the five countries. Topic themes such as "first vaccine" and "vaccine effect" dominate the public discussion. Notably, some countries have distinctive topic themes, such as "politician opinion" and "stay home" in Canada, "emergency" in India, and "blood clots" in the United Kingdom. The analysis also shows which COVID-19 vaccine is the most popular and is gaining more public interest. CONCLUSIONS With LDA and hierarchical clustering, two-stage topic modeling is powerful for visualizing latent topics and understanding the public perception of the COVID-19 vaccine.
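
The second stage described in this abstract can be sketched as follows, assuming the first stage has already produced topic-word distributions. The Hellinger distance is the standard definition; the average-linkage rule, the stopping `threshold`, the function names, and the four toy topics are assumptions for illustration, since the abstract does not specify the linkage criterion or any cut-off.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions over the same terms."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) / 2
    )

def agglomerate(topics, threshold):
    """Merge the closest topic clusters until the closest pair exceeds threshold.

    Uses average linkage (an assumption). Returns the final clusters and the
    merge history, which is what a tree view would be drawn from.
    """
    clusters = [[i] for i in range(len(topics))]
    merges = []

    def linkage(a, b):
        # Mean pairwise Hellinger distance between members of two clusters.
        total = sum(hellinger(topics[i], topics[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        merges.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters, merges

# Invented topic-word distributions over a four-term vocabulary.
topics = [
    [0.70, 0.20, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.05, 0.05, 0.60, 0.30],
    [0.05, 0.05, 0.20, 0.70],
]
clusters, merges = agglomerate(topics, threshold=0.5)
print(clusters)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

The Hellinger distance is bounded between 0 and 1, which makes a fixed threshold easier to interpret than it would be for an unbounded divergence.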


Public Perception of the COVID-19 Pandemic on Twitter: Sentiment Analysis and Topic Modeling Study (Preprint)

10.2196/preprints.21978 ◽

2020 ◽

Author(s):

Sakun Boon-Itt ◽

Yukolpat Skunkan

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Topic Modeling ◽

Public Perception ◽

English Language ◽

Latent Dirichlet Allocation ◽

Public Awareness ◽

Good Communication ◽

Three Stages ◽

Twitter Users

BACKGROUND COVID-19 is a scientifically and medically novel disease that is not fully understood because it has yet to be consistently and deeply studied. Among the gaps in research on the COVID-19 outbreak, there is a lack of sufficient infoveillance data. OBJECTIVE The aim of this study was to increase understanding of public awareness of COVID-19 pandemic trends and uncover meaningful themes of concern posted by Twitter users in the English language during the pandemic. METHODS Data mining was conducted on Twitter to collect a total of 107,990 tweets related to COVID-19 between December 13 and March 9, 2020. The analyses included frequency of keywords, sentiment analysis, and topic modeling to identify and explore discussion topics over time. A natural language processing approach and the latent Dirichlet allocation algorithm were used to identify the most common tweet topics as well as to categorize clusters and identify themes based on the keyword analysis. RESULTS The results indicate three main aspects of public awareness and concern regarding the COVID-19 pandemic. First, the trend of the spread and symptoms of COVID-19 can be divided into three stages. Second, the results of the sentiment analysis showed that people have a negative outlook toward COVID-19. Third, based on topic modeling, the themes relating to COVID-19 and the outbreak were divided into three categories: the COVID-19 pandemic emergency, how to control COVID-19, and reports on COVID-19. CONCLUSIONS Sentiment analysis and topic modeling can produce useful information about the trends in the discussion of the COVID-19 pandemic on social media as well as alternative perspectives to investigate the COVID-19 crisis, which has created considerable public awareness. This study shows that Twitter is a good communication channel for understanding both public concern and public awareness about COVID-19. 
These findings can help health departments communicate information to alleviate specific public concerns about the disease.


Public Perception of the COVID-19 Pandemic on Twitter: Sentiment Analysis and Topic Modeling Study

JMIR Public Health and Surveillance ◽

10.2196/21978 ◽

2020 ◽

Vol 6 (4) ◽

pp. e21978

Author(s):

Sakun Boon-Itt ◽

Yukolpat Skunkan

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Topic Modeling ◽

Public Perception ◽

English Language ◽

Latent Dirichlet Allocation ◽

Public Awareness ◽

Good Communication ◽

Three Stages ◽

Twitter Users

Background COVID-19 is a scientifically and medically novel disease that is not fully understood because it has yet to be consistently and deeply studied. Among the gaps in research on the COVID-19 outbreak, there is a lack of sufficient infoveillance data. Objective The aim of this study was to increase understanding of public awareness of COVID-19 pandemic trends and uncover meaningful themes of concern posted by Twitter users in the English language during the pandemic. Methods Data mining was conducted on Twitter to collect a total of 107,990 tweets related to COVID-19 between December 13 and March 9, 2020. The analyses included frequency of keywords, sentiment analysis, and topic modeling to identify and explore discussion topics over time. A natural language processing approach and the latent Dirichlet allocation algorithm were used to identify the most common tweet topics as well as to categorize clusters and identify themes based on the keyword analysis. Results The results indicate three main aspects of public awareness and concern regarding the COVID-19 pandemic. First, the trend of the spread and symptoms of COVID-19 can be divided into three stages. Second, the results of the sentiment analysis showed that people have a negative outlook toward COVID-19. Third, based on topic modeling, the themes relating to COVID-19 and the outbreak were divided into three categories: the COVID-19 pandemic emergency, how to control COVID-19, and reports on COVID-19. Conclusions Sentiment analysis and topic modeling can produce useful information about the trends in the discussion of the COVID-19 pandemic on social media as well as alternative perspectives to investigate the COVID-19 crisis, which has created considerable public awareness. This study shows that Twitter is a good communication channel for understanding both public concern and public awareness about COVID-19. 
These findings can help health departments communicate information to alleviate specific public concerns about the disease.


Analysis of Health Research Topics in Indonesia Using the LDA (Latent Dirichlet Allocation) Topic Modeling Method

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v4i2.1821 ◽

2020 ◽

Vol 4 (2) ◽

pp. 336-344

Author(s):

Yoga Sahria ◽

Dhomas Hatta Fudholi

Keyword(s):

Health Research ◽

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Research Trend ◽

The Public ◽

Know How ◽

The Government ◽

Python Programming ◽

Dirichlet Allocation

The need for health research, and for developing and implementing its results, is currently increasing among researchers, the government, academics, and the general public. One way to identify trends in health research is topic modeling. The method used in this research is LDA (Latent Dirichlet Allocation) topic modeling. The purpose of this research is to examine how LDA topic modeling analyzes the topics of health research in Indonesia drawn from Sinta journals, and to determine the coherence value of each topic in the resulting model. In addition, the results are intended to serve as a reference for conducting health research in Indonesia based on the modeled topics. The study was developed with Anaconda3 Python programming language tools and uses an available LDA library to obtain the topic model. The results were evaluated by respondents who are medical workers, health researchers, and academics. Of these respondents, 94.1% rated the topic modeling as very good and 5.9% as good.
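
The abstract mentions a coherence value per topic without naming the measure. As one possible reading, the sketch below computes the UMass coherence, which scores a topic by how often its top words co-occur in the same documents. The choice of UMass, the function name `umass_coherence`, and the toy documents are assumptions, not details taken from the study.

```python
import math

def umass_coherence(top_words, docs, eps=1.0):
    """UMass coherence of one topic, given its top words ordered by probability.

    Assumes every top word occurs in at least one document, so the
    document frequency in the denominator is never zero.
    """
    doc_sets = [set(doc) for doc in docs]

    def doc_freq(*words):
        # Number of documents that contain all of the given words.
        return sum(all(w in s for w in words) for s in doc_sets)

    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            joint = doc_freq(top_words[i], top_words[j])
            score += math.log((joint + eps) / doc_freq(top_words[j]))
    return score

# Invented, already tokenized toy documents.
docs = [
    ["vaccine", "dose", "clinic"],
    ["vaccine", "dose", "side", "effect"],
    ["hospital", "nurse", "clinic"],
    ["journal", "research", "health"],
]
coherent = umass_coherence(["vaccine", "dose"], docs)
mixed = umass_coherence(["vaccine", "journal"], docs)
print(coherent, mixed)
```

Higher (less negative) values indicate that a topic's top words tend to appear together, which is why the first word pair scores above the second here.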


Innovation in an Emerging Market: A Bibliometric and Latent Dirichlet Allocation Based Topic Modeling Study

2020 International Conference on Decision Aid Sciences and Application (DASA) ◽

10.1109/dasa51403.2020.9317278 ◽

2020 ◽

Author(s):

Mohd Faiz Hilmi ◽

Yanti Mustapha ◽

Mohammad Tasyriq Che Omar

Keyword(s):

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Emerging Market ◽

Modeling Study ◽

Dirichlet Allocation


Spam Diffusion in Social Networking Media using Latent Dirichlet Allocation

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i7898.1081219 ◽

2019 ◽

Vol 8 (12) ◽

pp. 881-885

Keyword(s):

Online Social Networks ◽

Topic Modeling ◽

Information Diffusion ◽

Latent Dirichlet Allocation ◽

Good Accuracy ◽

Ground Truth ◽

Online Social Media ◽

Diffusion Dynamics ◽

Dirichlet Allocation

Just as web spam has been a major threat to almost every aspect of the World Wide Web, social spam, especially in information diffusion, poses a serious threat to the utility of online social media. To combat this challenge, the significance and impact of such entities and content should be analyzed critically. To address this issue, this work uses Twitter as a case study, models the content of the information through topic modeling, and couples it with user-oriented features to handle it with good accuracy. Latent Dirichlet Allocation (LDA), a widely used topic modeling technique, is applied to capture the latent topics in the tweet documents. The contribution of this work is twofold: constructing a dataset that serves as the ground truth for analyzing the diffusion dynamics of spam and non-spam information, and analyzing the effect of topics on diffusibility. Exhaustive experiments clearly reveal the variation in topics shared by spam and non-spam tweets. The rise in popularity of online social networks (OSNs) attracts not only legitimate users but also spammers. Legitimate users use OSN services for good purposes, i.e., maintaining relations with friends and colleagues, sharing information of interest, and increasing the reach of their business through advertising.


Urban Crisis Detection Technique: A Spatial and Data Driven Approach Based on Latent Dirichlet Allocation (LDA) Topic Modeling

Construction Research Congress 2018 ◽

10.1061/9780784481271.025 ◽

2018 ◽

Cited By ~ 6

Author(s):

Yan Wang ◽

John E. Taylor

Keyword(s):

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Detection Technique ◽

Data Driven ◽

Urban Crisis ◽

Data Driven Approach ◽

Dirichlet Allocation


Topic Modelling Twitter Data with Latent Dirichlet Allocation Method

2019 International Conference on Electrical Engineering and Computer Science (ICECOS) ◽

10.1109/icecos47637.2019.8984523 ◽

2019 ◽

Cited By ~ 1

Author(s):

Edi Surya Negara ◽

Dendi Triadi ◽

Ria Andryani

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Modelling ◽

Twitter Data ◽

Allocation Method ◽

Dirichlet Allocation


The surveillance of a supreme audit institution on related party transactions

Journal of Public Budgeting Accounting & Financial Management ◽

10.1108/jpbafm-12-2019-0181 ◽

2020 ◽

Vol 32 (4) ◽

pp. 577-603

Author(s):

Gustavo Cesário ◽

Ricardo Lopes Cardoso ◽

Renato Santos Aranha

Keyword(s):

Public Sector ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Conflicts Of Interest ◽

International Standards ◽

Content Type ◽

Related Party Transactions ◽

The Public ◽

Audit Reports ◽

Dirichlet Allocation

Purpose This paper aims to analyse how the supreme audit institution (SAI) monitors related party transactions (RPTs) in the Brazilian public sector. It considers definitions and disclosure policies of RPTs by international accounting and auditing standards and their evolution since 1980. Design/methodology/approach Based on archival research on international standards and using an interpretive approach, the authors investigated definitions and disclosure policies. Using a topic model based on latent Dirichlet allocation, the authors performed a content analysis on over 59,000 SAI decisions to assess how the SAI monitors RPTs. Findings The SAI investigates nepotism (a kind of RPT) and conflicts of interest up to eight times more frequently than related parties. Brazilian laws prevent nepotism and conflicts of interest, but not RPTs in general. Indeed, Brazilian public-sector accounting standards have not converged towards IPSAS 20, and ISSAI 1550 does not adjust auditing procedures to suit the public sector. Research limitations/implications The SAI follows a legalistic auditing approach, indicating a need for regulation of related public-sector parties to improve surveillance. In addition to Brazil, other code law countries might face similar circumstances. Originality/value Public-sector RPTs are an under-investigated field, calling for attention by academics and standard-setters. Text mining and latent Dirichlet allocation, while mature techniques, are underexplored in accounting and auditing studies. Additionally, the Python script created to analyse the audit reports is available at Mendeley Data and may be used to perform similar analyses with minor adaptations.


Topic modeling for expert finding using latent Dirichlet allocation

Wiley Interdisciplinary Reviews Data Mining and Knowledge Discovery ◽

10.1002/widm.1102 ◽

2013 ◽

Vol 3 (5) ◽

pp. 346-353 ◽

Cited By ~ 11

Author(s):

Saeedeh Momtazi ◽

Felix Naumann

Keyword(s):

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Expert Finding ◽

Dirichlet Allocation
