Topic analysis of Indonesian K-pop YouTube channel content using Latent Dirichlet Allocation

Teknologi ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 16-25
Author(s):  
Alfrida Rahmawati ◽  
Najla Lailin Nikmah ◽  
Reynaldi Drajat Ageng Perwira ◽  
Nur Aini Rakhmawati ◽  
...  

The development of digital technology has brought new media, one of which is YouTube, now one of the most widely used applications among internet users worldwide. The growth of the audience, known as viewers, is also supported by the contribution of content creators, or YouTubers, from Indonesia. As the number of viewers grows, the demand for trending content also grows at a surprising speed, and one such topic is K-pop. In this study, the authors wanted to identify the dominant topics that K-pop YouTubers most often upload, in order to support content creators. This research was conducted using the Latent Dirichlet Allocation method. The analysis was carried out after applying text mining to 2563 videos from 10 K-pop YouTuber accounts with more than 100,000 subscribers. The optimal number of topics was determined by examining the perplexity and topic coherence values. The results are the top 5 topics that make up the content of the uploaded videos: reactions to dance covers, album unboxings and reviews, riddles about K-pop dances, joint vlogs discussing covers, and reactions to the vocals in K-pop songs.
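As a rough illustration of the model-selection step described above (choosing the topic count by perplexity and topic coherence), the sketch below uses gensim; the file name and tokenisation are assumptions for illustration, not the authors' actual pipeline.

```python
# Illustrative sketch only: selecting the number of LDA topics by perplexity
# and topic coherence. "kpop_videos.txt" and the tokenisation are assumptions.
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

# One pre-tokenised document (video title/description) per line.
docs = [line.split() for line in open("kpop_videos.txt", encoding="utf-8")]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(doc) for doc in docs]

for k in range(2, 11):
    lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=k,
                   passes=10, random_state=42)
    # log_perplexity returns a per-word likelihood bound; c_v coherence is
    # computed against the original texts.
    perp = lda.log_perplexity(corpus)
    coh = CoherenceModel(model=lda, texts=docs, dictionary=dictionary,
                         coherence="c_v").get_coherence()
    print(f"k={k}  log-perplexity bound={perp:.3f}  coherence={coh:.3f}")
```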

2019 ◽  
Vol 9 (24) ◽  
pp. 5496 ◽  
Author(s):  
Wafa Shafqat ◽  
Yung-Cheol Byun

The accelerated growth of internet users and internet applications, primarily e-business, has accustomed people to writing comments and reviews about the products they receive. These reviews are remarkably competent to shape customers’ decisions. However, in crowdfunding, where investors finance innovative ideas in exchange for rewards or products, the comments of investors are often ignored. These comments can play a markedly significant role in helping crowdfunding platforms battle the bitter challenge of fraudulent activities. We take advantage of language modeling techniques and aim to merge them with neural networks to identify hidden discussion patterns in the comments. Our objective is to design a language-modeling-based neural network architecture in which a Recurrent Neural Network (RNN) with Long Short-Term Memory (LSTM) is used to predict discussion trends, i.e., towards either scam or non-scam. The LSTM layers are fed with the latent topic distribution learned from a pre-trained Latent Dirichlet Allocation (LDA) model. To optimize the recommendations, we used Particle Swarm Optimization (PSO) as a baseline algorithm. This module helps investors find secure projects to invest in (with the highest chances of delivery) within their preferred categories. We used prediction accuracy, the optimal number of identified topics, and the number of epochs as performance-evaluation metrics for the proposed approach. We compared our results with simple Neural Networks (NNs) and NN-LDA on these performance metrics. The strengths of both integrated models suggest that the proposed model can play a substantial role in a better understanding of crowdfunding comments.
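A minimal sketch, under assumptions, of the core idea: each comment is represented by its LDA topic distribution, and the sequence of such vectors for a project is fed to an LSTM that outputs a scam/non-scam prediction. The class name, dimensions, and framework (PyTorch) are illustrative choices, not the paper's implementation.

```python
# Minimal sketch (assumptions, not the paper's architecture): an LSTM over a
# sequence of per-comment LDA topic distributions, classifying scam vs. non-scam.
import torch
import torch.nn as nn

class TopicLSTM(nn.Module):
    def __init__(self, num_topics=20, hidden_size=64):
        super().__init__()
        # Each timestep is the K-dimensional topic distribution of one comment.
        self.lstm = nn.LSTM(input_size=num_topics, hidden_size=hidden_size,
                            batch_first=True)
        self.classifier = nn.Linear(hidden_size, 2)   # scam / non-scam logits

    def forward(self, topic_seq):                      # (batch, n_comments, num_topics)
        _, (h_n, _) = self.lstm(topic_seq)
        return self.classifier(h_n[-1])                # (batch, 2)

# Example: 4 projects, 30 comments each, 20 LDA topics per comment.
model = TopicLSTM(num_topics=20)
logits = model(torch.rand(4, 30, 20))
```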


Author(s):  
Małgorzata Molęda-Zdziech

The aim of the study is to analyse the role of media in shaping the identity of the modern individual. I narrow my focus to the postmodern approaches of A. Giddens, M. Castells and M. Maffesoli. In their work, these authors connect changes taking place in the world of media with changes at the level of individual identity. Based on the work of M. Maffesoli, I reconstruct the ideal type of postmodern individual identity, homo creator. I then describe mediality as a postmodern value and a component of postmodern identity. The study presents the results of a 2014 TNS Connected Life research report prepared on a sample of 55,000 internet users from around the world. The results illustrate habits in the use of traditional and new media.


2021 ◽  
Vol 13 (19) ◽  
pp. 10856
Author(s):  
I-Cheng Chang ◽  
Tai-Kuei Yu ◽  
Yu-Jie Chang ◽  
Tai-Yi Yu

Facing the big data wave, this study applied artificial intelligence to organize knowledge and find a feasible process that can play a crucial role in supplying innovative value in environmental education. Intelligent agents and natural language processing (NLP) are two key areas leading the trend in artificial intelligence; this research adopted NLP to analyze the research topics of environmental education research journals in the Web of Science (WoS) database during 2011–2020 and to interpret the categories and characteristics of the abstracts of environmental education papers. The corpus was selected from the abstracts and keywords of research journal papers, which were analyzed with text mining, cluster analysis, latent Dirichlet allocation (LDA), and co-word analysis methods. The classification of feature words was determined and reviewed by domain experts, and the associated TF-IDF weights were calculated for the subsequent cluster analysis, which combined hierarchical clustering and K-means analysis. Hierarchical clustering and LDA set the number of required categories at seven, and K-means cluster analysis classified the documents into seven categories. This study utilized co-word analysis to check the suitability of the K-means classification, analyzed the terms with high TF-IDF weights in the distinct K-means groups, and examined the terms of the different topics with the LDA technique. A comparison of the results demonstrated that most categories recognized with the K-means and LDA methods were the same and shared similar words; however, two categories differed slightly. The involvement of field experts supported the consistency and correctness of the classified topics and documents.
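The clustering portion of such a pipeline could look roughly like the sketch below (scikit-learn, with a placeholder corpus standing in for the WoS abstracts); it illustrates TF-IDF weighting followed by Ward hierarchical clustering and K-means with seven clusters, and is not the study's actual code.

```python
# Illustrative sketch: TF-IDF features, Ward hierarchical clustering, and
# K-means with k=7. `abstracts` is a placeholder for the WoS abstracts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import AgglomerativeClustering, KMeans

abstracts = [f"placeholder environmental education abstract {i}" for i in range(20)]

tfidf = TfidfVectorizer(stop_words="english", max_features=5000)
X = tfidf.fit_transform(abstracts)

# Ward linkage needs dense Euclidean input; inspecting the hierarchy helps
# suggest a plausible number of categories.
hier = AgglomerativeClustering(n_clusters=7, linkage="ward").fit(X.toarray())

# K-means with the seven categories indicated by the hierarchy and by LDA.
kmeans = KMeans(n_clusters=7, random_state=42, n_init=10).fit(X)
print(kmeans.labels_)
```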


2020 ◽  
Vol 12 (16) ◽  
pp. 6673 ◽  
Author(s):  
Kiattipoom Kiatkawsin ◽  
Ian Sutherland ◽  
Jin-Young Kim

Airbnb has emerged as a platform where unique accommodation options can be found. Due to the uniqueness of each accommodation unit and host combination, each listing offers a one-of-a-kind experience. As consumers increasingly rely on the text reviews of other customers, managers are also increasingly gaining insight from customer reviews. Thus, the present study aimed to extract those insights from reviews using latent Dirichlet allocation, an unsupervised type of topic modeling that extracts latent discussion topics from text data. Findings from Hong Kong’s 185,695 and Singapore’s 93,571 Airbnb reviews, two long-term rival destinations, were compared. Hong Kong produced 12 topics in total, which can be categorized into four distinct groups, whereas Singapore’s optimal number of topics was only five. The topics produced for both destinations covered the same range of attributes, but Hong Kong’s 12 topics provide a greater degree of precision for formulating managerial recommendations. While many topics are similar to established hotel attributes, topics related to the host and to listing management are unique to the Airbnb experience. The findings also revealed keywords used when evaluating the experience that provide more insight beyond typical numeric ratings.
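As a hedged sketch of the kind of output this analysis yields, the snippet below fits a separate gensim LDA model to each destination's tokenised reviews and lists each topic's top keywords; the tiny inline corpora are placeholders, not the study's data.

```python
# Illustrative sketch: separate LDA models for Hong Kong (12 topics) and
# Singapore (5 topics) reviews, printing each topic's top keywords.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

def topic_keywords(tokenised_reviews, num_topics, topn=8):
    dictionary = Dictionary(tokenised_reviews)
    corpus = [dictionary.doc2bow(r) for r in tokenised_reviews]
    lda = LdaModel(corpus, id2word=dictionary, num_topics=num_topics, random_state=1)
    return [[w for w, _ in lda.show_topic(t, topn=topn)] for t in range(num_topics)]

# Placeholder reviews; the real corpora hold 185,695 (HK) and 93,571 (SG) reviews.
reviews_hk = [["great", "host", "mtr", "station"], ["clean", "room", "quiet", "host"]]
reviews_sg = [["clean", "condo", "pool"], ["near", "mrt", "host", "helpful"]]

for words in topic_keywords(reviews_hk, num_topics=12):
    print("HK:", words)
for words in topic_keywords(reviews_sg, num_topics=5):
    print("SG:", words)
```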


2020 ◽  
pp. 016555152095467
Author(s):  
Xian Cheng ◽  
Qiang Cao ◽  
Stephen Shaoyi Liao

The unprecedented outbreak of COVID-19 is one of the most serious global threats to public health in this century. During this crisis, specialists in information science could play key roles in supporting the efforts of scientists in the health and medical community to combat COVID-19. In this article, we demonstrate that information specialists can support the health and medical community by applying text mining with the latent Dirichlet allocation procedure to produce an overview of a mass of coronavirus literature. This overview presents the generic research themes of the coronavirus diseases COVID-19, MERS and SARS, reveals the representative literature for each main research theme, and displays a network visualisation to explore the overlap, similarity and difference among these themes. The overview can help the health and medical communities extract useful information and interrelationships from coronavirus-related studies.
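One way such a theme-overlap network could be built (an assumed approach for illustration, not necessarily the article's) is to compare LDA topic-word distributions pairwise and link themes whose similarity exceeds a threshold:

```python
# Assumed illustration: link research themes whose LDA topic-word distributions
# have cosine similarity above an arbitrary threshold.
import numpy as np
import networkx as nx

# Hypothetical topic-word matrix (7 themes x 500 vocabulary terms, rows sum to 1);
# in practice this would come from the fitted LDA model, e.g. lda.get_topics().
rng = np.random.default_rng(0)
topic_word = rng.dirichlet(np.ones(500), size=7)

norms = np.linalg.norm(topic_word, axis=1)
sim = (topic_word @ topic_word.T) / np.outer(norms, norms)

graph = nx.Graph()
for i in range(sim.shape[0]):
    for j in range(i + 1, sim.shape[0]):
        if sim[i, j] > 0.2:            # arbitrary cut-off for "overlapping" themes
            graph.add_edge(f"theme_{i}", f"theme_{j}", weight=float(sim[i, j]))
print(graph.edges(data=True))
```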


Author(s):  
Bambang Subeno ◽  
Retno Kusumaningrum ◽  
Farikhin Farikhin

Latent Dirichlet Allocation (LDA) is a probabilistic model for grouping hidden topics in documents into a predefined number of topics. If the number of topics K is determined incorrectly, words will correlate only weakly with topics; a number of topics that is too large or too small causes inaccuracies when grouping topics during the formation of training models. This study aims to determine the optimal number of corpus topics in the LDA method using the maximum likelihood and Minimum Description Length (MDL) approaches. The experiments use Indonesian news articles: the numbers of documents are 25, 50, 90, and 600, and the corresponding numbers of words are 3898, 7760, 13005, and 4365. The results show that the maximum likelihood and MDL approaches yield the same optimal number of topics. The optimal number of topics is influenced by the alpha and beta parameters. In addition, the number of documents does not affect computation time, but the number of words does: the computation times for these datasets are 2.9721, 6.49637, 13.2967, and 3.7152 seconds, respectively. The optimal number of LDA topics was then used in a classification model. This experiment shows that the highest average accuracy is 61%, obtained with alpha 0.1 and beta 0.001.
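As a simplified, assumed illustration of how an MDL-style criterion trades model fit against complexity (not the authors' implementation; the log-likelihoods and vocabulary size below are hypothetical), each candidate K can be scored as minus the log-likelihood plus a parameter-count penalty, and the K with the lowest score wins:

```python
# Hypothetical illustration of a two-part MDL criterion for choosing K:
# MDL(K) = -log L(K) + (p(K) / 2) * log(N), with p(K) the free parameters of a
# K-topic LDA (K*(V-1) topic-word + D*(K-1) document-topic probabilities).
import math

def mdl_score(log_likelihood, k, vocab_size, num_docs, num_tokens):
    num_params = k * (vocab_size - 1) + num_docs * (k - 1)
    return -log_likelihood + 0.5 * num_params * math.log(num_tokens)

# Hypothetical log-likelihoods for candidate topic counts; lowest MDL wins.
for k, ll in [(5, -31000.0), (10, -29500.0), (15, -29200.0)]:
    print(k, round(mdl_score(ll, k, vocab_size=1200, num_docs=600, num_tokens=4365)))
```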

