Identifying patterns in students’ scientific argumentation: content analysis through text mining using Latent Dirichlet Allocation

2020 ◽  
Vol 68 (5) ◽  
pp. 2185-2214
Author(s):  
Wanli Xing ◽  
Hee-Sun Lee ◽  
Antonette Shibani
2021 ◽  
Vol 13 (19) ◽  
pp. 10856
Author(s):  
I-Cheng Chang ◽  
Tai-Kuei Yu ◽  
Yu-Jie Chang ◽  
Tai-Yi Yu

Facing the big data wave, this study applied artificial intelligence to cite knowledge and to find a feasible process that can supply innovative value in environmental education. Intelligent agents and natural language processing (NLP) are two key areas leading the trend in artificial intelligence; this research adopted NLP to analyze the research topics of environmental education journals in the Web of Science (WoS) database during 2011–2020 and to interpret the categories and characteristics of abstracts of environmental education papers. The corpus was drawn from the abstracts and keywords of journal papers and analyzed with text mining, cluster analysis, latent Dirichlet allocation (LDA), and co-word analysis. Domain experts determined and reviewed the classification of feature words, and the associated TF-IDF weights were calculated for the subsequent cluster analysis, which combined hierarchical clustering and K-means analysis. Hierarchical clustering and LDA set the number of required categories at seven, and the K-means cluster analysis classified the documents into seven categories. The study used co-word analysis to check the suitability of the K-means classification, analyzed the terms with high TF-IDF weights for distinct K-means groups, and examined the terms for different topics with the LDA technique. A comparison of the results showed that most categories recognized by the K-means and LDA methods were the same and shared similar words; however, two categories differed slightly. The involvement of field experts supported the consistency and correctness of the classified topics and documents.
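To make the TF-IDF weighting mentioned in this abstract concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `tfidf`, the toy documents, and the specific variant (term frequency normalized by document length, unsmoothed natural-log IDF) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    Smoothed or sublinear variants are equally common in practice.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Toy "abstracts", invented for illustration only.
toy_docs = [
    "climate education school climate".split(),
    "school curriculum education".split(),
    "biodiversity conservation education".split(),
]
weights = tfidf(toy_docs)
top_term = max(weights[0], key=weights[0].get)
```

In this toy corpus a term that appears in every document, such as "education", receives a weight of zero, while a term concentrated in one document ranks highest there. That is the property that makes such weights usable as features for a later clustering step.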


2021 ◽  
Author(s):  
Shelli Kesler ◽  
Ashley Henneghan ◽  
Whitney Thurman ◽  
Vikram Rao

BACKGROUND: Cancer-related cognitive impairment is a common and significant adverse effect of cancer and its therapies. However, its definition and assessment remain difficult due to limitations of currently available measurement tools. OBJECTIVE: The aim of this study was to provide proof-of-concept for using natural language to examine cognitive effects of cancer. METHODS: We applied Latent Dirichlet Allocation, a topic modeling approach, to 145 public online comments related to cognitive effects of cancer. We supplemented this method with a qualitative content analysis. RESULTS: Latent Dirichlet Allocation revealed two latent topics that we qualitatively interpreted as representing internal and external factors related to cognitive effects. These findings lead us to hypothesize regarding the potential contribution of locus of control to cancer-related cognitive impairment. Qualitative content analysis suggested several major themes including symptoms, emotional/psychological impacts, coping, “chemobrain” is real, change over time, and function. CONCLUSIONS: Our findings indicate that topic modeling of free text responses may be a valuable approach for hypothesis generation in the study of cancer-related cognitive impairment. Future directions in this field include prospective acquisition of free text responses for both subjective and objective assessment of cognitive function in patients with cancer.


Teknologi ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 16-25
Author(s):  
Alfrida Rahmawati ◽  
Najla Lailin Nikmah ◽  
Reynaldi Drajat Ageng Perwira ◽  
Nur Aini Rakhmawati ◽  
...  

The development of digital technology has brought new media, one of which is YouTube, now one of the most widely used applications among internet users worldwide. The growth of the audience, known as viewers, is also supported by the contributions of content creators, known as YouTubers, from Indonesia. As the number of viewers grows, their demand for trending content also grows at surprising speed in one topic in particular, K-pop. In this study, the authors wanted to identify the dominant topics that K-pop YouTubers often upload, in order to support content creators. The research was conducted using the Latent Dirichlet Allocation method. The analysis was carried out after text mining 2563 videos from 10 K-pop YouTuber accounts with more than 100,000 subscribers. The optimal number of topics was determined by examining perplexity and topic coherence values. The results are the top 5 topics that make up the content of the uploaded videos. These topics include reactions to dance covers, unboxing and reviewing albums, riddles from K-pop dances and joint vlogs discussing covers and reactions to the sounds of K-pop songs.
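As an illustration of the perplexity criterion mentioned in this abstract, the sketch below computes perplexity for two hand-built topic models on a toy corpus. Everything here is hypothetical: the function `perplexity`, the toy tokens, and the probability tables are invented for the example and are not taken from the study, which would have used fitted models and, ideally, held-out documents. Topic coherence, the second criterion, is not shown.

```python
import math

def perplexity(docs, doc_topic, topic_word):
    """exp(-total log-likelihood / total tokens) under a topic model.

    docs: list of token lists
    doc_topic[d][k]: p(topic k | document d)
    topic_word[k][w]: p(word w | topic k)
    Raises ValueError if a word has zero probability under the model.
    """
    log_lik, n_tokens = 0.0, 0
    for d, doc in enumerate(docs):
        for word in doc:
            # Marginal word probability: mixture over the document's topics.
            p = sum(theta * topic_word[k].get(word, 0.0)
                    for k, theta in enumerate(doc_topic[d]))
            log_lik += math.log(p)
            n_tokens += 1
    return math.exp(-log_lik / n_tokens)

# Toy corpus, invented for illustration only.
docs = [["dance", "cover"], ["album", "unboxing"]]

# Hypothetical one-topic model: every word equally likely.
one_topic = perplexity(
    docs,
    doc_topic=[[1.0], [1.0]],
    topic_word=[{"dance": 0.25, "cover": 0.25, "album": 0.25, "unboxing": 0.25}],
)

# Hypothetical two-topic model: each document drawn from its own topic.
two_topic = perplexity(
    docs,
    doc_topic=[[1.0, 0.0], [0.0, 1.0]],
    topic_word=[{"dance": 0.5, "cover": 0.5}, {"album": 0.5, "unboxing": 0.5}],
)
```

Lower perplexity means the model assigns higher probability to the observed words, so under this criterion alone the two-topic model would be preferred for the toy corpus. Because perplexity tends to keep falling as topics are added, it is commonly read together with a coherence score, as the abstract describes.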


2020 ◽  
pp. 016555152095467
Author(s):  
Xian Cheng ◽  
Qiang Cao ◽  
Stephen Shaoyi Liao

The unprecedented outbreak of COVID-19 is one of the most serious global threats to public health in this century. During this crisis, specialists in information science could play key roles in supporting the efforts of scientists in the health and medical community to combat COVID-19. In this article, we demonstrate that information specialists can support the health and medical community by applying a text mining technique with the latent Dirichlet allocation procedure to produce an overview of a mass of coronavirus literature. This overview presents the generic research themes of the coronavirus diseases COVID-19, MERS and SARS, reveals the representative literature for each main research theme, and displays a network visualisation for exploring the overlap, similarity and difference among these themes. The overview can help the health and medical communities to extract useful information and interrelationships from coronavirus-related studies.


2021 ◽  
Vol 7 (1) ◽  
pp. 170
Author(s):  
Muhammad Alif Noor Febriansyach ◽  
Faza Rashif ◽  
Goldio Ihza Perwira Nirvana ◽  
Nur Aini Rakhmawati

Twitter is a rapidly growing social medium because users can interact with one another through computers or mobile devices. Trending hashtags change quickly according to how intensely users discuss a given subject, which makes Twitter well suited to chatting about current issues, one of which is COVID-19. This leaves open the possibility that some parties exploit this standing to produce news that steers public opinion on COVID-19, whether accurate news or unsourced news that can spread quickly. In this study, the authors wanted to identify the kinds of topics discussed by bot accounts in spreading information under the hashtag #covid19. The research was conducted using the Latent Dirichlet Allocation (LDA) method. The analysis was carried out after text mining 162 tweets from 62 Twitter bot accounts. The optimal number of topics was determined by examining perplexity and topic coherence values. The results are the top 5 topics: the current condition and impact of the pandemic, appeals to keep one's distance in order to stay healthy, the development of the spread of COVID-19 in Indonesia, vaccination taking place in several regions of Indonesia, and ways of coping with COVID-19. Keywords: Covid-19, Twitter, bot accounts, LDA


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Semra Aktas-Polat ◽  
Serkan Polat

Purpose: The purpose of this study is to discover the factors affecting customer delight, satisfaction and dissatisfaction in fine dining experiences (FDEs). Design/methodology/approach: A total of 2,585 online user-generated reviews on TripAdvisor for 46 five-star hotel restaurants operating in Istanbul were analyzed with the latent Dirichlet allocation (LDA) algorithm. Findings: LDA created nine, eight and seven topics for delight, satisfaction and dissatisfaction, respectively. The most salient topics for customer delight, satisfaction and dissatisfaction in FDEs are staff (17.3%), view (19%), and food quality (23%), respectively. Originality/value: This study is one of the few studies investigating customer delight and satisfaction together. The study shows that FDEs can be analyzed with text mining techniques. Moreover, the study contributes to the literature on customer delight by adding the staff topic as an antecedent.
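The abstract does not say how its topic percentages were derived. One common way to obtain such shares from an LDA model, sketched here as an assumption rather than as the authors' method, is to average each topic's weight across all documents. The function `topic_shares` and the toy distributions are hypothetical.

```python
def topic_shares(doc_topic):
    """Average each topic's weight over all documents.

    doc_topic[d][k] is p(topic k | document d); if every row sums to 1,
    the returned shares also sum to 1.
    """
    n_docs = len(doc_topic)
    n_topics = len(doc_topic[0])
    return [sum(row[k] for row in doc_topic) / n_docs
            for k in range(n_topics)]

# Toy document-topic distributions for four reviews, invented for illustration.
toy_doc_topic = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
]
shares = topic_shares(toy_doc_topic)
most_salient = max(range(len(shares)), key=shares.__getitem__)
```

An alternative convention assigns each review to its single highest-weight topic and reports the fraction of reviews per topic; the two conventions can rank topics differently, so a reported percentage is only interpretable once the convention is known.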


2019 ◽  
Vol 36 (5) ◽  
pp. 655-665 ◽  
Author(s):  
Jurui Zhang

Purpose: This paper aims to investigate customers’ experiences with Airbnb by text-mining customer reviews posted on the platform and comparing the extracted topics from online reviews between Airbnb and the traditional hotel industry using topic modeling. Design/methodology/approach: This research uses text-mining approaches, including content analysis and topic modeling (latent Dirichlet allocation method), to examine 1,026,988 Airbnb guest reviews of 50,933 listings in seven cities in the USA. Findings: The content analysis shows that negative reviews are more authentic and credible than positive reviews on Airbnb and that the occurrence of social words is positively related to positive emotion in reviews, but negatively related to negative emotion in reviews. A comparison of reviews on Airbnb and hotel reviews shows unique topics on Airbnb, namely, “late check-in”, “patio and deck view”, “food in kitchen”, “help from host”, “door lock/key”, “sleep/bed condition” and “host response”. Research limitations/implications: The topic modeling result suggests that Airbnb guests want to get to know and connect with the local community; thus, help from hosts on ways they can authentically experience the local community would be beneficial. In addition, the results suggest that customers emphasize their interaction with hosts; thus, to improve customer satisfaction, Airbnb hosts should interact with guests and respond to guests’ inquiries quickly. Practical implications: Hotel managers should design marketing programs that fulfill customers’ desire for authentic and local experiences. The results also suggest that peer-to-peer accommodation platforms should improve online review systems to facilitate authentic reviews and help guests have a smooth check-in process. Originality/value: This study is one of the first to examine consumer reviews in detail in the sharing economy and compare topics from consumer reviews between Airbnb and hotels.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Farshid Danesh ◽  
Meisam Dastani ◽  
Mohammad Ghorbani

Purpose: The present article's primary purpose is the topic modeling of global coronavirus publications in the last 50 years. Design/methodology/approach: The present study is applied research conducted using text mining. The statistical population is the coronavirus publications collected from the Web of Science Core Collection (1970–2020). The main keywords were extracted from the Medical Subject Heading browser to design the search strategy. Latent Dirichlet allocation and the Python programming language were applied to analyze the data and implement the text mining algorithms of topic modeling. Findings: The findings indicated that SARS, science, protein, MERS, veterinary, cell, human, RNA, medicine and virology are the most important keywords in the global coronavirus publications. Also, eight important topics were identified in the global coronavirus publications by implementing the topic modeling algorithm. The highest numbers of publications were, respectively, on the following topics: “structure and proteomics,” “Cell signaling and immune response,” “clinical presentation and detection,” “Gene sequence and genomics,” “Diagnosis tests,” “vaccine and immune response and outbreak,” “Epidemiology and Transmission” and “gastrointestinal tissue.” Originality/value: The originality of this article can be considered in three ways. First, text mining and Latent Dirichlet allocation were applied to analyzing coronavirus literature for the first time. Second, coronavirus is mentioned as a hot topic of research. Finally, in addition to the retrospective approaches to 50 years of data collection and analysis, the results can be exploited with prospective approaches to strategic planning and macro-policymaking.

