Analysis of Health Research Topics in Indonesia Using the LDA (Latent Dirichlet Allocation) Topic Modeling Method

2020 ◽  
Vol 4 (2) ◽  
pp. 336-344
Author(s):  
Yoga Sahria ◽  
Dhomas Hatta Fudholi

The need for research, development, and the implementation of research results in health is currently increasing among researchers, the government, academics, and even the general public. One way to identify trends in health research is topic modeling. The method used in this research is LDA (Latent Dirichlet Allocation) topic modeling. The purpose of this research is to determine how the LDA topic modeling method models topics across health research in Indonesia drawn from Sinta journals, and to determine the coherence value of each topic in the resulting model. In addition, the results are expected to serve as a reference for conducting health research in Indonesia based on the modeled topics. The research was developed using Anaconda3 Python programming language tools and an available LDA library to obtain the topic model. The results were evaluated by respondents consisting of medical workers, health researchers, and academics. Of these respondents, 94.1% rated the topic modeling as very good and 5.9% rated it as good.
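The abstract above reports a coherence value per topic but does not say which coherence measure was used. As an illustration only, the following is a minimal sketch of one common choice, a UMass-style coherence averaged over word pairs. The function name `umass_coherence`, the smoothing constant `eps`, and the toy documents are assumptions made for this example, not details taken from the paper.

```python
import math


def umass_coherence(top_words, documents, eps=1.0):
    """UMass-style coherence for one topic, averaged over word pairs.

    top_words: the topic's top words, ordered from most to least probable.
    documents: tokenized documents (lists of words).
    eps: smoothing constant added to co-document counts (assumed value).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents that contain every given word.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score, pairs = 0.0, 0
    for i in range(1, len(top_words)):
        for j in range(i):
            denom = doc_freq(top_words[j])
            if denom == 0:
                continue  # skip words that never occur in the corpus
            joint = doc_freq(top_words[i], top_words[j])
            score += math.log((joint + eps) / denom)
            pairs += 1
    return score / pairs if pairs else 0.0


# Hypothetical toy corpus, for illustration only.
toy_docs = [
    ["vaccine", "immunization", "child"],
    ["vaccine", "immunization", "dose"],
    ["diabetes", "insulin", "sugar"],
    ["diabetes", "insulin", "diet"],
]
coherent = umass_coherence(["vaccine", "immunization"], toy_docs)
mixed = umass_coherence(["vaccine", "insulin"], toy_docs)
```

In this toy corpus, words that appear together in the same documents score higher than words that never co-occur, which is the intuition behind using coherence to judge topic quality.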

2020 ◽  
Vol 32 (4) ◽  
pp. 577-603
Author(s):  
Gustavo Cesário ◽  
Ricardo Lopes Cardoso ◽  
Renato Santos Aranha

Purpose: This paper aims to analyse how the supreme audit institution (SAI) monitors related party transactions (RPTs) in the Brazilian public sector. It considers definitions and disclosure policies of RPTs by international accounting and auditing standards and their evolution since 1980. Design/methodology/approach: Based on archival research on international standards and using an interpretive approach, the authors investigated definitions and disclosure policies. Using a topic model based on latent Dirichlet allocation, the authors performed a content analysis on over 59,000 SAI decisions to assess how the SAI monitors RPTs. Findings: The SAI investigates nepotism (a kind of RPT) and conflicts of interest up to eight times more frequently than related parties. Brazilian laws prevent nepotism and conflicts of interest, but not RPTs in general. Indeed, Brazilian public-sector accounting standards have not converged towards IPSAS 20, and ISSAI 1550 does not adjust auditing procedures to suit the public sector. Research limitations/implications: The SAI follows a legalistic auditing approach, indicating a need for regulation of related public-sector parties to improve surveillance. In addition to Brazil, other code law countries might face similar circumstances. Originality/value: Public-sector RPTs are an under-investigated field, calling for attention by academics and standard-setters. Text mining and latent Dirichlet allocation, while mature techniques, are underexplored in accounting and auditing studies. Additionally, the Python script created to analyse the audit reports is available at Mendeley Data and may be used to perform similar analyses with minor adaptations.


Author(s):  
Carlo Schwarz

In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
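To make the two distributions described above concrete, here is a minimal sketch of a collapsed Gibbs sampler for latent Dirichlet allocation. It is written in Python and is not the ldagibbs Stata command or its implementation. The function name `lda_gibbs`, the hyperparameter defaults `alpha` and `beta`, and the iteration count are assumptions chosen for illustration.

```python
import random


def lda_gibbs(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized docs; return (theta, phi, vocab).

    theta[d][k]: probability of topic k in document d.
    phi[k][v]:   probability of vocabulary word v in topic k.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]       # counts per document
    topic_word = [[0] * V for _ in range(num_topics)]  # counts per topic
    topic_total = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][n]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates of the two families of distributions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + V * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return theta, phi, vocab
```

Each row of `theta` is one document's distribution over topics and each row of `phi` is one topic's distribution over words, so every row sums to one. The number of topics is supplied by the user, matching the description above.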


2019 ◽  
Vol 15 (1) ◽  
pp. 83-102 ◽  
Author(s):  
Ahmed Amir Tazibt ◽  
Farida Aoughlis

Purpose: During crises such as accidents or disasters, an enormous volume of information is generated on the Web. Both the public and decision-makers often need to identify relevant and timely content, as soon as it appears online, that can help them understand what is happening and make the right decisions. However, relevant content can be dispersed across document streams. The available information can also contain redundant content published by different sources. Therefore, the need for automatic construction of summaries that aggregate important, non-redundant and non-outdated pieces of information is becoming critical. Design/methodology/approach: The aim of this paper is to present a new temporal summarization approach based on a popular topic model in the information retrieval field, Latent Dirichlet Allocation. The approach consists of filtering documents over streams, extracting relevant parts of information and then using topic modeling to reveal their underlying aspects in order to extract the most relevant and novel pieces of information to be added to the summary. Findings: The performance evaluation of the proposed temporal summarization approach based on Latent Dirichlet Allocation, performed on the TREC Temporal Summarization 2014 framework, clearly demonstrates its effectiveness in providing short and precise summaries of events. Originality/value: Unlike most state-of-the-art approaches, the proposed method determines the importance of the pieces of information to be added to the summaries by relying solely on their representation in the topic space provided by Latent Dirichlet Allocation, without the use of any external source of evidence.
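The abstract does not give the selection rule, so the following is only a hypothetical sketch of how relevance and novelty could be judged purely in topic space. It assumes each candidate sentence already has a topic-proportion vector (for example from an LDA model) and that the event is described by a topic vector as well. The names `cosine` and `select_updates` and both thresholds are invented for this example and are not the authors' method.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_updates(candidates, event_topics,
                   min_relevance=0.5, max_redundancy=0.9):
    """Pick relevant, non-redundant sentences in arrival order.

    candidates: (text, topic_vector) pairs, ordered by time.
    event_topics: topic vector describing the event of interest.
    Both thresholds are assumed values, chosen for illustration.
    """
    kept = []
    for text, topics in candidates:
        # Relevance: the sentence must resemble the event in topic space.
        if cosine(topics, event_topics) < min_relevance:
            continue
        # Novelty: it must not duplicate anything already in the summary.
        if any(cosine(topics, prev) > max_redundancy for _, prev in kept):
            continue
        kept.append((text, topics))
    return [text for text, _ in kept]
```

Because candidates are processed in arrival order, an earlier sentence blocks later near-duplicates, which is one simple way to keep a running summary short.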


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Farshid Danesh ◽  
Meisam Dastani ◽  
Mohammad Ghorbani

Purpose: The primary purpose of the present article is topic modeling of global coronavirus publications over the last 50 years. Design/methodology/approach: The present study is applied research conducted using text mining. The statistical population consists of the coronavirus publications collected from the Web of Science Core Collection (1970–2020). The main keywords were extracted from the Medical Subject Headings browser to design the search strategy. Latent Dirichlet allocation and the Python programming language were applied to analyze the data and implement the text mining algorithms of topic modeling. Findings: The findings indicated that SARS, science, protein, MERS, veterinary, cell, human, RNA, medicine and virology are the most important keywords in the global coronavirus publications. Also, eight important topics were identified in the global coronavirus publications by implementing the topic modeling algorithm. The highest numbers of publications were, respectively, on the following topics: “structure and proteomics,” “Cell signaling and immune response,” “clinical presentation and detection,” “Gene sequence and genomics,” “Diagnosis tests,” “vaccine and immune response and outbreak,” “Epidemiology and Transmission” and “gastrointestinal tissue.” Originality/value: The originality of this article can be considered in three ways. First, text mining and latent Dirichlet allocation were applied to analyzing coronavirus literature for the first time. Second, coronavirus is mentioned as a hot topic of research. Finally, in addition to the retrospective approaches to 50 years of data collection and analysis, the results can be exploited with prospective approaches to strategic planning and macro-policymaking.


2020 ◽  
Vol 21 (2) ◽  
pp. 149
Author(s):  
Bagus Wicaksono Arianto ◽  
Gangga Anuraga

PT Ruang Raya Indonesia ("Ruangguru") is the largest and most comprehensive technology company in Indonesia that focuses on education-based services. In 2019 there were 15 million Ruangguru users and 300.00 teachers who had joined and were present in 32 provinces in Indonesia. It prepared a number of expansion strategies to become a company valued at more than US$1 billion in the next year or two. The purpose of this research is to classify the opinions of Ruangguru users about the services provided, using the latent Dirichlet allocation method, so that the results can serve as evaluation material for improving those services. The data come from a collection of tweets by Twitter users in Indonesia, gathered using the Twitter API. The Twitter account used in this study is @ruangguru. The results of the analysis showed that the public perception of Twitter users, modeled using latent Dirichlet allocation, formed 28 topics.

Keywords: latent dirichlet allocation, ruangguru, twitter.


2021 ◽  
Vol 10 (3) ◽  
pp. 248-257
Author(s):  
Karel Fauzan Hakim ◽  
Pika Silvianti ◽  
Agus Mohammad Soleh

Covid-19 is a very troubling disease in Indonesia. Understanding public opinion is therefore required to find solutions and evaluate the government's performance in handling the pandemic. Twitter can help identify public opinion on significant events. Tweets are high-dimensional, text-based big data that require text sampling and text mining to be processed efficiently and effectively. Stratified random sampling with 20 repetitions was applied, treating days as strata, followed by topic modeling with latent Dirichlet allocation (LDA). This research aims to find out public opinion regarding Covid-19 and its growth over time. It also aims to find out the effects of stratified random sampling on tweet data. The extracted topics are transformed into time-series data, taking into account the variety of patterns formed. The transformation results are then explored and interpreted. This research suggests that discussions related to Covid-19 are divided into four topics by the first model, namely “Vaccine”, “Positive or affected people”, “Health protocol”, and “Indonesia”, and into nine topics by the second model, namely “Vaccine”, “Prayer”, “Health protocol”, “Social aid and corruption”, “Affected people”, “Indonesian economy”, “Work”, “Persuading to wear mask”, and “Willing to watch”. Furthermore, some topics peak whenever a significant event occurs in Indonesia. Finally, this research suggests that 20 repetitions of stratified random sampling could provide good results.
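The sampling step described above, with days as strata and 20 repetitions, can be sketched roughly as follows. The abstract does not state the sampling fraction or how the repetitions were combined, so the `fraction` default, the function names `stratified_sample` and `repeated_samples`, and the `(day, text)` input format are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(tweets, fraction, rng):
    """Draw the same fraction of tweets from every day (stratum).

    tweets: (day, text) pairs. At least one tweet is kept per day.
    """
    by_day = defaultdict(list)
    for day, text in tweets:
        by_day[day].append(text)

    sample = []
    for day in sorted(by_day):
        texts = by_day[day]
        k = min(len(texts), max(1, round(len(texts) * fraction)))
        # Sample without replacement inside the stratum.
        sample.extend((day, text) for text in rng.sample(texts, k))
    return sample


def repeated_samples(tweets, fraction=0.1, repetitions=20, seed=0):
    """Repeat the stratified draw; 20 repetitions follows the abstract."""
    rng = random.Random(seed)
    return [stratified_sample(tweets, fraction, rng)
            for _ in range(repetitions)]
```

Each repetition could then be passed to a topic model separately, so that the stability of the extracted topics across repetitions can be compared.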


Author(s):  
Naveen ◽  
Priti

The Right to Information Act 2005 was passed by the UPA (United Progressive Alliance) Government with a sense of pride. It flaunted the Act as a milestone in India’s democratic journey. It is five years since the RTI was passed; the performance on the implementation front is far from perfect. Consequently, the impact on the attitude, mindset and behaviour patterns of the public authorities and the people is not what it was expected to be. Most people are still not aware of their newly acquired power. Among those who are aware, a major chunk either does not know how to wield it or lacks the guts and gumption to invoke the RTI. A little more stimulation by the Government, NGOs and other enlightened and empowered citizens can augment the benefits of this Act manifold. RTI will help not only in mitigating corruption in public life but also in alleviating poverty, the two monstrous maladies of India.


Author(s):  
Xi Liu ◽  
Yongfeng Yin ◽  
Haifeng Li ◽  
Jiabin Chen ◽  
Chang Liu ◽  
...  

Abstract: Existing intelligent software defect classification approaches do not consider radar characteristics and prior statistical information. Thus, when these approaches are applied to radar software testing and validation, the precision and recall rates of defect classification are poor, which reduces the effectiveness of software defect reuse. To solve this problem, a new intelligent defect classification approach based on the latent Dirichlet allocation (LDA) topic model is proposed for radar software in this paper. The proposed approach includes the defect text segmentation algorithm based on a radar-domain dictionary, the modified LDA model combining radar software requirements, and the top acquisition and classification approach of radar software defects based on the modified LDA model. The proposed approach is applied to typical radar software defects to validate its effectiveness and applicability. The application results illustrate that the prediction precision and recall rates of the proposed approach are improved by up to 15–20% compared with the other defect classification approaches. Thus, the proposed approach can be applied effectively to the segmentation and classification of radar software defects to improve the adequacy of defect identification in radar software.

