An overview of literature on COVID-19, MERS and SARS: Using text mining and latent Dirichlet allocation

The unprecedented outbreak of COVID-19 is one of the most serious global threats to public health in this century. During this crisis, information science specialists can play a key role in supporting the efforts of health and medical scientists to combat COVID-19. In this article, we demonstrate that information specialists can support the health and medical community by applying text mining with latent Dirichlet allocation to produce an overview of the large body of coronavirus literature. This overview presents the general research themes of the coronavirus diseases COVID-19, MERS and SARS, identifies the representative literature for each main research theme, and provides a network visualisation for exploring the overlaps, similarities and differences among these themes. The overview can help the health and medical communities extract useful information and interrelationships from coronavirus-related studies.

Looking beyond the stars: A description of text mining technique to extract latent dimensions from online product reviews

International Journal of Market Research ◽

10.1177/1470785319863619 ◽

2019 ◽

Vol 62 (2) ◽

pp. 195-215

Author(s):

Frederik Situmeang ◽

Nelleke de Boer ◽

Austin Zhang

Keyword(s):

Text Mining ◽

Customer Satisfaction ◽

Research Methodology ◽

Latent Dirichlet Allocation ◽

Online Reviews ◽

Product Reviews ◽

Marketing Literature ◽

Mining Technique ◽

Online Product Reviews ◽

The Relationship

The purpose of this study is to contribute to marketing literature and practice by describing a research methodology for identifying latent dimensions of customer satisfaction in product reviews and by examining the relationship between these dimensions and customer satisfaction. Previous research on product reviews has relied largely on quantitative ratings, either stars or review scores. Advanced text mining techniques provide the opportunity to extract meaning from online customer reviews. By analyzing 51,110 online reviews of 1,610 restaurants with latent Dirichlet allocation, this study uncovers 30 latent dimensions that are determinants of customer satisfaction. Furthermore, this study develops measures of sentiment and innovativeness as moderators of the effect of these latent attributes on satisfaction.

Applying Text Mining, Clustering Analysis, and Latent Dirichlet Allocation Techniques for Topic Classification of Environmental Education Journals

Sustainability ◽

10.3390/su131910856 ◽

2021 ◽

Vol 13 (19) ◽

pp. 10856

Author(s):

I-Cheng Chang ◽

Tai-Kuei Yu ◽

Yu-Jie Chang ◽

Tai-Yi Yu

Keyword(s):

Artificial Intelligence ◽

Cluster Analysis ◽

Text Mining ◽

Environmental Education ◽

Hierarchical Clustering ◽

Language Processing ◽

Latent Dirichlet Allocation ◽

Word Analysis ◽

Dirichlet Allocation

In response to the big data wave, this study applied artificial intelligence to extract knowledge and to identify a feasible process for supplying innovative value in environmental education. Intelligent agents and natural language processing (NLP) are two key areas leading current trends in artificial intelligence; this research adopted NLP to analyze the research topics of environmental education journals in the Web of Science (WoS) database during 2011–2020 and to interpret the categories and characteristics of the abstracts of environmental education papers. The corpus was drawn from the abstracts and keywords of journal papers and analyzed with text mining, cluster analysis, latent Dirichlet allocation (LDA), and co-word analysis. Domain experts determined and reviewed the classification of feature words, and the associated TF-IDF weights were calculated for the subsequent cluster analysis, which combined hierarchical clustering and K-means. Hierarchical clustering and LDA set the number of required categories at seven, and the K-means cluster analysis classified all documents into seven categories. The study used co-word analysis to check the suitability of the K-means classification, analyzed the terms with high TF-IDF weights for the distinct K-means groups, and examined the terms for the different topics with LDA. A comparison of the results showed that most categories identified by the K-means and LDA methods were the same and shared similar words; two categories, however, differed slightly. The involvement of field experts supported the consistency and correctness of the classified topics and documents.
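The TF-IDF weighting mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not state which TF-IDF variant the study used, so the formula below (length-normalised term frequency times log(N / document frequency)) is an assumption, and the function name `tf_idf` and the toy documents are invented for illustration only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: (count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy "abstracts", invented for illustration (not the study's corpus).
docs = [
    "students learn outdoor education".split(),
    "teachers learn climate education".split(),
    "students study climate policy".split(),
]
weights = tf_idf(docs)

# Terms sorted by weight for the first document, most distinctive first.
top_terms = sorted(weights[0], key=weights[0].get, reverse=True)
```

In this toy corpus, "outdoor" occurs in a single document and therefore outweighs "education", which occurs in two. Vectors of such weights are the kind of input that hierarchical clustering or K-means would then group.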

Analisis topik konten channel YouTube K-pop Indonesia menggunakan Latent Dirichlet Allocation [Topic analysis of Indonesian K-pop YouTube channel content using Latent Dirichlet Allocation]

Teknologi ◽

10.26594/teknologi.v11i1.2155 ◽

2021 ◽

Vol 11 (1) ◽

pp. 16-25

Author(s):

Alfrida Rahmawati ◽

Najla Lailin Nikmah ◽

Reynaldi Drajat Ageng Perwira ◽

Nur Aini Rakhmawati ◽

...

Keyword(s):

Text Mining ◽

New Media ◽

Digital Technology ◽

Latent Dirichlet Allocation ◽

Optimal Number ◽

Allocation Method ◽

Internet Users ◽

The World ◽

Dirichlet Allocation

The development of digital technology has brought new media, one of which is YouTube, now one of the most widely used applications among internet users worldwide. The growth of its audience, known as viewers, is also supported by the contributions of content creators, known as YouTubers, from Indonesia. As the number of viewers grows, their demand for trending content also grows at a surprising speed; one such trending topic is K-pop. In this study, the authors wanted to identify the dominant topics that K-pop YouTubers often upload, in order to support content creators. The research was conducted using the latent Dirichlet allocation method. The analysis was carried out after applying text mining to 2,563 videos from 10 K-pop YouTuber accounts with more than 100,000 subscribers. The optimal number of topics was determined from the perplexity and topic coherence values. The result is the top 5 topics that form the content of the uploaded videos: reactions to dance covers, album unboxings and reviews, riddles based on K-pop dances, vlogs discussing covers together, and reactions to the sound of K-pop songs.
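As a rough illustration of the perplexity criterion mentioned in this abstract, the sketch below computes per-word perplexity from a topic model's document-topic and topic-word probabilities. It is not the authors' procedure: the function name `perplexity`, the toy vocabulary, and all probabilities are assumptions made up for the example, and in practice the value is usually computed on held-out text and weighed against topic coherence.

```python
import math

def perplexity(docs, theta, phi):
    """Per-word perplexity of tokenized docs under a topic model.

    theta[d][k] = P(topic k | document d)
    phi[k][w]   = P(word w | topic k)
    """
    log_lik = 0.0
    n_tokens = 0
    for d, doc in enumerate(docs):
        for word in doc:
            # Marginal word probability: mix topic-word probabilities
            # using the document's topic proportions.
            p = sum(theta[d][k] * phi[k][word] for k in range(len(phi)))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Toy data, invented for illustration.
vocab = ["dance", "cover", "album", "unboxing"]
docs = [["dance", "cover", "dance"], ["album", "unboxing", "album"]]

# One "topic" that spreads probability evenly over the vocabulary.
uniform = perplexity(docs, [[1.0], [1.0]], [{w: 0.25 for w in vocab}])

# Two topics that each concentrate on one pair of words.
phi = [
    {"dance": 0.45, "cover": 0.45, "album": 0.05, "unboxing": 0.05},
    {"dance": 0.05, "cover": 0.05, "album": 0.45, "unboxing": 0.45},
]
theta = [[0.9, 0.1], [0.1, 0.9]]
two_topic = perplexity(docs, theta, phi)
```

The uniform model's perplexity equals the vocabulary size (4), while the two-topic model scores lower because it assigns higher probability to the observed words. Comparing such values across candidate topic counts is one way to shortlist a model.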

Business intelligence in banking: A literature analysis from 2002 to 2013 using text mining and latent Dirichlet allocation

Expert Systems with Applications ◽

10.1016/j.eswa.2014.09.024 ◽

2015 ◽

Vol 42 (3) ◽

pp. 1314-1324 ◽

Cited By ~ 122

Author(s):

Sérgio Moro ◽

Paulo Cortez ◽

Paulo Rita

Keyword(s):

Text Mining ◽

Business Intelligence ◽

Latent Dirichlet Allocation ◽

Literature Analysis ◽

Dirichlet Allocation

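Análisis de la evolución temática de la investigación sobre Información y Documentación en español en la base de datos LISA mediante modelado temático (1978-2019) [Analysis of the thematic evolution of research on Information and Documentation in Spanish in the LISA database using topic modeling (1978-2019)]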

El profesional de la información ◽

10.3145/epi.2020.jul.27 ◽

2020 ◽

Cited By ~ 1

Author(s):

Francisco-Javier García-Marco ◽

Carlos G. Figuerola ◽

María Pinto

Keyword(s):

Information Technologies ◽

Latent Dirichlet Allocation ◽

Information Science ◽

Library And Information Science ◽

International Context ◽

Global Trends ◽

International Trends ◽

Editorial Decisions ◽

Selection Of ◽

Dirichlet Allocation

The thematic evolution of LIS research in Spanish between 1978 and 2019 is analyzed within the international context. To this end, relevant bibliographic references were retrieved from the Library and Information Science Abstracts (LISA) database, and their titles and abstracts were processed with latent Dirichlet allocation (LDA), a statistical topic modeling technique. Nineteen thematic sets were found, analyzed, labeled, and systematized into four main areas: processes, information technologies, libraries, and specialized documentation. The results for Spanish were then compared with international results obtained previously using the same methodology and the same source. In conclusion, LIS literature in Spanish largely follows international trends: over the last 50 years, the thematic focus of research has shifted from libraries and information organizations to users and the development of specific systems and solutions. However, LIS research in Spanish also shows distinct characteristics: the importance of bibliometric research and biomedical documentation; a sustained interest in strictly library-focused research; and a certain delay in addressing technological, legal, and educational aspects. Although the selection of articles in LISA depends on editorial decisions, applying LDA to the literature in Spanish produced results consistent with studies of global international trends, studies of other similar sources, and the observations of earlier research.

A Narrative Analysis by Text Mining Technique Using Key Graph: Similarity and Difference of a View of Oral Health and Oral Risk Cognition between Japanese Living People and Dentists

2014 IEEE International Conference on Data Mining Workshop ◽

10.1109/icdmw.2014.86 ◽

2014 ◽

Author(s):

Fukiko Kobayashi ◽

Yumiko Nara

Keyword(s):

Oral Health ◽

Text Mining ◽

Narrative Analysis ◽

Graph Similarity ◽

Mining Technique ◽

Similarity And Difference

Implementasi LDA untuk Pengelompokan Topik Tweet Akun Bot Twitter bertagar #covid-19 [Implementation of LDA for topic clustering of tweets from Twitter bot accounts tagged #covid-19]

CogITo Smart Journal ◽

10.31154/cogito.v7i1.299.170-181 ◽

2021 ◽

Vol 7 (1) ◽

pp. 170

Author(s):

Muhammad Alif Noor Febriansyach ◽

Faza Rashif ◽

Goldio Ihza Perwira Nirvana ◽

Nur Aini Rakhmawati

Keyword(s):

Text Mining ◽

Latent Dirichlet Allocation ◽

Dirichlet Allocation

Twitter is a social medium undergoing rapid growth, because users can interact with one another from computers or mobile devices. Trending hashtags change quickly according to how intensely users discuss a given subject. This makes Twitter well suited to chatting about current issues, one of which is COVID-19. It also opens the possibility that some parties exploit this status to produce news that steers public opinion on COVID-19, whether accurate news or unsourced news that can spread quickly. In this study, the authors wanted to identify the kinds of topics discussed by bot accounts spreading information under the hashtag #covid19. The research was conducted using the Latent Dirichlet Allocation (LDA) method. The analysis was carried out after applying text mining to 162 tweets from 62 Twitter bot accounts. The optimal number of topics was determined from the perplexity and topic coherence values. The result is the top 5 topics: the current condition and impact of the pandemic, appeals to keep physical distance in order to stay healthy, the development of the spread of COVID-19 in Indonesia, vaccination taking place in several regions of Indonesia, and ways of coping with COVID-19. Keywords: Covid-19, Twitter, bot accounts, LDA

Discovery of factors affecting tourists' fine dining experiences at five-star hotel restaurants in Istanbul

British Food Journal ◽

10.1108/bfj-02-2021-0138 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Semra Aktas-Polat ◽

Serkan Polat

Keyword(s):

Text Mining ◽

Food Quality ◽

Design Methodology ◽

Latent Dirichlet Allocation ◽

Content Type ◽

Factors Affecting ◽

Customer Delight ◽

Dirichlet Allocation

Purpose: The purpose of this study is to discover the factors affecting customer delight, satisfaction and dissatisfaction in fine dining experiences (FDEs). Design/methodology/approach: A total of 2,585 online user-generated reviews on TripAdvisor for 46 five-star hotel restaurants operating in Istanbul were analyzed with the latent Dirichlet allocation (LDA) algorithm. Findings: LDA created nine, eight and seven topics for delight, satisfaction and dissatisfaction, respectively. The most salient topics for customer delight, satisfaction and dissatisfaction in FDEs are staff (17.3%), view (19%), and food quality (23%), respectively. Originality/value: This study is one of the few to investigate customer delight and satisfaction together. The study shows that FDEs can be analyzed with text mining techniques. Moreover, the study contributes to the literature on customer delight by adding the staff topic as an antecedent.
