Pemodelan topik pada dokumen paten terkait pupuk di Indonesia berbasis Latent Dirichlet Allocation [Topic modeling of fertilizer-related patent documents in Indonesia based on Latent Dirichlet Allocation]

2021, Vol 17 (2), pp. 168-180
Author(s): Aris Yaman, Bagus Sartono, Agus M. Soleh

Introduction. Fertilizer is one of the most important production factors in agriculture, so it is crucial to increase the capacity of fertilizer-related technology. Analysis of patent documents is one way to analyze technological developments, especially in fertilizers. Data Collection Methods. The data used in this research are patent metadata, specifically the titles and abstracts of patent documents in Indonesia retrieved with the keyword "fertilizer" for the 1945-2017 period. Data Analysis. The LDA model can provide a reasonable interpretation of topics modeled from text data. Results and Discussion. The results show that topic models built from patent titles perform better than those built from patent abstracts. The LDA approach can adequately separate the topics of fertilizer patent technology so that they are not open to multiple interpretations. Conclusion. Based on the findings, there are nine essential topics in the development of fertilizer technology. There is also a notable lack of technological collaboration across IPC technology sections.
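A minimal sketch of the kind of pipeline the abstract describes, fitting separate LDA models on patent titles and on patent abstracts so their topics can be compared. The toy token lists and the two-topic setting are illustrative assumptions (the paper reports nine topics on its real corpus), not the authors' implementation:

```python
# Fit gensim LDA separately on preprocessed title tokens and abstract tokens,
# then print the top words of each topic for side-by-side inspection.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

def fit_lda(token_lists, num_topics=2, seed=0):
    dictionary = Dictionary(token_lists)
    corpus = [dictionary.doc2bow(tokens) for tokens in token_lists]
    return LdaModel(corpus=corpus, id2word=dictionary,
                    num_topics=num_topics, passes=20, random_state=seed)

title_tokens = [  # toy stand-ins for tokenized patent titles
    ["organic", "fertilizer", "granule", "production"],
    ["npk", "compound", "fertilizer", "coating"],
    ["liquid", "organic", "fertilizer", "seaweed"],
]
abstract_tokens = [  # toy stand-ins for tokenized patent abstracts
    ["invention", "relates", "organic", "fertilizer", "granule", "process"],
    ["npk", "compound", "fertilizer", "coating", "slow", "release"],
    ["liquid", "organic", "fertilizer", "derived", "seaweed", "extract"],
]

title_lda = fit_lda(title_tokens)
abstract_lda = fit_lda(abstract_tokens)

for topic_id, terms in title_lda.print_topics(num_words=4):
    print("title topic", topic_id, terms)
for topic_id, terms in abstract_lda.print_topics(num_words=4):
    print("abstract topic", topic_id, terms)
```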


Author(s): Aylin Yılmaz, M. Atilla Arıcıoğlu, Nadiye Gülnar

Recently, countries and regions around the world have struggled to control the volume and location of production. Western countries, in particular, have sought to draw investments made in the East back to their own countries and regions. This desire led to the emergence of the Industry 4.0 phenomenon in the West, originating in Germany. With this phenomenon, industry aims to further digitize production and thereby improve speed, efficiency, and flexibility. With these changes, Industry 4.0 removed manual labor from the system, and the system was seen to work better and produce more cheaply than before. While the positive contributions of Industry 4.0 have resonated with all sectors, it has also started to have an impact on the agricultural sector. Problems such as global scarcity, the ineffective use of natural resources, and the limited use of technology in agriculture have driven the digitalization of the agricultural sector. "Agriculture 4.0" means making production smart through smart farming practices, using the concepts, information, and technologies described in the literature. In line with the possibilities and technological developments offered by Industry 4.0, the entry of the Internet of Things into agriculture puts sensors in all agricultural machines, from tractors to crop tools, and enables the machines to communicate throughout the entire production process. With Agriculture 4.0, the traditional agricultural paradigm is no longer sufficient; changed agricultural practices contribute to sustainability, productivity, preservation of the rural fabric, protection of environmental quality, and accessible food. This study discusses the problems experienced in the agricultural sector, the effects of Agriculture 4.0 on these problems, and the benefits it will bring. The use of technology has given the system its name, and agriculture has also taken its share of these developments. Accordingly, the study investigates what Agriculture 4.0 practices exist around the world and what they contribute.



2020, Vol 12 (16), pp. 6673
Author(s): Kiattipoom Kiatkawsin, Ian Sutherland, Jin-Young Kim

Airbnb has emerged as a platform where unique accommodation options can be found. Due to the uniqueness of each accommodation unit and host combination, each listing offers a one-of-a-kind experience. As consumers increasingly rely on the text reviews of other customers, managers are also increasingly gaining insight from customer reviews. Thus, the present study aimed to extract those insights from reviews using latent Dirichlet allocation, an unsupervised type of topic modeling that extracts latent discussion topics from text data. Findings from 185,695 Airbnb reviews for Hong Kong and 93,571 for Singapore, two long-term rival destinations, were compared. Hong Kong produced 12 topics in total, which can be categorized into four distinct groups, whereas Singapore's optimal number of topics was only five. Topics produced from both destinations covered the same range of attributes, but Hong Kong's 12 topics provide a greater degree of precision for formulating managerial recommendations. While many topics are similar to established hotel attributes, topics related to the host and listing management are unique to the Airbnb experience. The findings also revealed keywords used when evaluating the experience that provide more insight beyond typical numeric ratings.
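A hedged sketch of how an "optimal number of topics" is often chosen in practice: sweep candidate topic counts and score each LDA model with a coherence metric. The toy reviews, candidate range, and u_mass metric are illustrative assumptions, not the study's actual procedure:

```python
# Sweep candidate topic counts and keep the model with the best coherence.
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

review_tokens = [  # toy stand-ins for tokenized Airbnb reviews
    ["host", "friendly", "checkin", "easy", "helpful"],
    ["room", "clean", "location", "mtr", "station"],
    ["great", "value", "quiet", "neighborhood", "clean"],
    ["host", "responsive", "late", "checkin", "helpful"],
]
dictionary = Dictionary(review_tokens)
corpus = [dictionary.doc2bow(tokens) for tokens in review_tokens]

scores = {}
for k in range(2, 6):  # candidate numbers of topics
    lda = LdaModel(corpus=corpus, id2word=dictionary,
                   num_topics=k, passes=20, random_state=0)
    cm = CoherenceModel(model=lda, corpus=corpus,
                        dictionary=dictionary, coherence="u_mass")
    scores[k] = cm.get_coherence()

best_k = max(scores, key=scores.get)
print(scores, "best number of topics:", best_k)
```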



Author(s): Carlo Schwarz

In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
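The ldagibbs command itself is a Stata program; as a cross-language illustration of the two distributions described here (each document as a probability distribution over topics, each topic as a probability distribution over words), a minimal Python/gensim sketch follows. The toy corpus and two-topic setting are assumptions, not part of the original article:

```python
# Print per-document topic proportions and per-topic word probabilities.
from gensim.corpora import Dictionary
from gensim.models import LdaModel

docs = [
    ["economy", "inflation", "market", "growth"],
    ["match", "goal", "league", "season"],
    ["market", "growth", "trade", "economy"],
    ["season", "coach", "match", "league"],
]
dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(d) for d in docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=50, random_state=1)

# Document -> probability distribution over topics
for i, bow in enumerate(corpus):
    print("doc", i, lda.get_document_topics(bow, minimum_probability=0.0))

# Topic -> probability distribution over words (top 4 words per topic)
for k in range(2):
    print("topic", k, lda.show_topic(k, topn=4))
```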



Complexity, 2018, Vol 2018, pp. 1-10
Author(s): Lirong Qiu, Jia Yu

Against the background of big data, how to effectively extract useful information is the problem that big data now faces. The purpose of this study is to construct a more effective method of mining the interest preferences of users in a particular field in today's big data context. We mainly study a large volume of user text data from microblogs. LDA is an effective method of text mining, but applying LDA directly to the large number of short texts on microblogs does not work well. In current, more effective topic modeling practice, short texts need to be aggregated into long texts to avoid data sparsity. However, aggregated short texts are mixed with a lot of noise, reducing the accuracy of mining the user's interest preferences. In this paper, we propose Combining Latent Dirichlet Allocation (CLDA), a new topic model that can learn the potential topics of microblog short texts and long texts simultaneously. The data sparsity of short texts is avoided by aggregating them into long texts that assist in learning the short texts; the short texts are in turn used to filter the long texts, improving mining accuracy and effectively combining long and short texts. Experimental results on a real microblog data set show that CLDA outperforms many advanced models in mining user interest, and we also confirm that CLDA performs well in recommender systems.
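The CLDA model itself is the authors' contribution; the aggregation step it builds on, pooling a user's short posts into one longer pseudo-document before topic modeling, can be sketched as below. This is a generic workaround for short-text sparsity, not the CLDA algorithm, and the data and names are illustrative:

```python
# Author pooling: concatenate each user's short microblog posts into one
# pseudo-document so a standard LDA model sees longer, less sparse texts.
from collections import defaultdict
from gensim.corpora import Dictionary
from gensim.models import LdaModel

posts = [  # (user_id, tokenized short post) - toy data
    ("u1", ["camera", "lens", "photo"]),
    ("u1", ["photo", "editing", "app"]),
    ("u2", ["marathon", "training", "run"]),
    ("u2", ["run", "shoes", "race"]),
]

pooled = defaultdict(list)
for user, tokens in posts:
    pooled[user].extend(tokens)  # one long pseudo-document per user

long_docs = list(pooled.values())
dictionary = Dictionary(long_docs)
corpus = [dictionary.doc2bow(d) for d in long_docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=30, random_state=0)

# Short posts can then be scored against topics learned from the pooled texts.
for user, tokens in posts:
    print(user, lda.get_document_topics(dictionary.doc2bow(tokens)))
```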



2021, Vol 10 (1), pp. 23-30
Author(s): Muhammad Habibi, Adri Priadana, Muhammad Rifqi Ma’arif

The World Health Organization (WHO) reported that, as of June 1, 2020, the COVID-19 outbreak had resulted in more than six million confirmed cases and more than 371,000 deaths globally. The outbreak sparked a flood of scientific research to help society deal with the virus, both inside and outside the medical domain. Research on public health analysis and on public conversations about the spread of COVID-19 on social media is one of the highlights for researchers around the world. Information from social media can be analyzed as supporting data about public health. Analyzing public conversations helps the relevant authorities understand public opinion and the information gaps between them and the public, helping them develop appropriate emergency response strategies to address existing problems in the community during the pandemic and providing information on the population's emotions in different contexts. However, research related to the analysis of public health and public conversations has so far been conducted only through supervised analysis of textual data. In this study, we aim to analyze specifically the sentiment and topic modeling of Indonesian public conversations about COVID-19 on Twitter using NLP techniques. We applied several sentiment analysis methods to identify the best classification method. The topic modeling in this study was carried out unsupervised using Latent Dirichlet Allocation (LDA). The results reveal that the most frequently discussed topic related to the COVID-19 pandemic is economic issues.
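A minimal sketch of comparing several sentiment classifiers on labeled tweets, as the abstract describes at a high level; the classifiers, features, and toy labeled examples below are illustrative assumptions, not the study's actual setup:

```python
# Compare two common text classifiers with cross-validation to pick the
# better sentiment model. Toy labeled tweets stand in for the real data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

tweets = ["harga bahan pokok naik lagi", "semangat untuk tenaga medis",
          "phk di mana-mana karena pandemi", "bantuan sosial sudah diterima",
          "ekonomi makin sulit", "terima kasih relawan covid"]
labels = ["negative", "positive", "negative", "positive", "negative", "positive"]

for name, clf in [("NaiveBayes", MultinomialNB()), ("LinearSVC", LinearSVC())]:
    pipe = make_pipeline(TfidfVectorizer(), clf)
    scores = cross_val_score(pipe, tweets, labels, cv=3)
    print(name, "mean accuracy:", scores.mean())
```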



Author(s): Junaid Rashid, Syed Muhammad Adnan Shah, Aun Irtaza

Topic modeling is an effective text mining and information retrieval approach to organizing knowledge with various contents under a specific topic. Text documents in the form of news articles are increasing very fast on the web, and analysis of these documents is very important in the fields of text mining and information retrieval. Extracting meaningful information from these documents is a challenging task. One approach for discovering themes in text documents is topic modeling, but this approach still needs a new perspective to improve its performance. In topic modeling, documents have topics, and topics are collections of words. In this paper, we propose a new k-means topic modeling (KTM) approach using the k-means clustering algorithm. KTM discovers better semantic topics from a collection of documents. Experiments on two real-world datasets, Reuters-21578 and BBC News, show that KTM performs better than state-of-the-art topic models such as LDA (Latent Dirichlet Allocation) and LSA (Latent Semantic Analysis). KTM is also applicable to classification and clustering tasks in text mining and achieves higher performance in comparison with its competitors LDA and LSA.
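The exact KTM algorithm is the authors'; a generic sketch of the underlying idea, clustering documents with k-means and reading each cluster's highest-weight TF-IDF terms as a "topic", might look like this (toy data, not the paper's implementation):

```python
# Cluster TF-IDF document vectors with k-means and list the top terms of
# each cluster centroid as that cluster's "topic" words.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "stocks fell as markets reacted to inflation data",
    "the central bank raised interest rates again",
    "the team won the league after a late goal",
    "the striker scored twice in the final match",
]
vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(docs)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
terms = np.array(vec.get_feature_names_out())

for k, center in enumerate(km.cluster_centers_):
    top = terms[np.argsort(center)[::-1][:5]]  # top-5 terms per cluster
    print("topic", k, top)
```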



2021, Vol 40 (3)
Author(s): HyunSeung Koh, Mark Fienup

Library chat services are an increasingly important communication channel to connect patrons to library resources and services. Analysis of chat transcripts could provide librarians with insights into improving services. Unfortunately, chat transcripts consist of unstructured text data, making it impractical for librarians to go beyond simple quantitative analysis (e.g., chat duration, message count, word frequencies) with existing tools. As a stepping-stone toward a more sophisticated chat transcript analysis tool, this study investigated the application of different types of topic modeling techniques to one academic library’s chat reference data collected from April 10, 2015, to May 31, 2019, with the goal of extracting the most accurate and easily interpretable topics. In this study, topic accuracy and interpretability—the quality of topic outcomes—were quantitatively measured with topic coherence metrics. Additionally, qualitative accuracy and interpretability were assessed by the librarian author of this paper, based on a subjective judgment of whether the topics aligned with frequently asked questions or easily inferable themes in academic library contexts. This study found that, in the human qualitative evaluation, Probabilistic Latent Semantic Analysis (pLSA) produced more accurate and interpretable topics, which is not necessarily aligned with the findings of the quantitative evaluation using all three types of topic coherence metrics. Interestingly, the commonly used technique Latent Dirichlet Allocation (LDA) did not necessarily perform better than pLSA. Also, semi-supervised techniques with human-curated anchor words, Correlation Explanation (CorEx) and guided LDA (GuidedLDA), did not necessarily perform better than the unsupervised technique of Dirichlet Multinomial Mixture (DMM). Lastly, the study found that using the entire transcript, including both sides of the interaction between the library patron and the librarian, performed better across different techniques in increasing the quality of topic outcomes than using only the initial question asked by the library patron.
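In the quantitative part of such an evaluation, topic quality is typically scored with coherence metrics. A minimal gensim sketch computing three common coherence measures for one fitted model is shown below; the specific metrics and toy transcripts are assumptions, since the abstract does not name which three metrics the study used:

```python
# Score one fitted LDA model under three coherence measures so that models
# (or preprocessing choices) can be compared quantitatively.
from gensim.corpora import Dictionary
from gensim.models import LdaModel, CoherenceModel

chats = [  # toy stand-ins for tokenized chat transcripts
    ["renew", "book", "due", "date", "library"],
    ["find", "article", "database", "journal", "access"],
    ["renew", "loan", "due", "date", "fine"],
    ["database", "access", "journal", "article", "login"],
]
dictionary = Dictionary(chats)
corpus = [dictionary.doc2bow(c) for c in chats]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=50, random_state=0)

for metric in ("u_mass", "c_v", "c_npmi"):
    cm = CoherenceModel(model=lda, texts=chats, corpus=corpus,
                        dictionary=dictionary, coherence=metric)
    print(metric, cm.get_coherence())
```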



2020, Vol 9 (2), pp. 14-35
Author(s): Debabrata Sarddar, Raktim Kumar Dey, Rajesh Bose, Sandip Roy

As ubiquitous as it is, the Internet has spawned a slew of products that have forever changed the way one thinks of society and politics. This article proposes a model to predict a political party's chances of winning based on data collected from the Twitter microblogging website, the most popular microblogging platform in the world. Using unsupervised topic modeling and the NRC Emotion Lexicon, the authors demonstrate how it is possible to predict results by analyzing eight types of emotions expressed by users on Twitter. To validate the results through empirical analysis, the authors examine the Twitter messages posted during the 14th Gujarat Legislative Assembly election in 2017. Implementing two unsupervised clustering methods, K-means and Latent Dirichlet Allocation, this research shows how the proposed model is able to examine and summarize observations based on the underlying semantic structures of messages posted on Twitter. These two well-known unsupervised clustering methods provide a firm base for the proposed model, enabling decision-making processes to be streamlined objectively.
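A rough sketch of the lexicon-based emotion step: tallying, for each tweet, how many of its words the NRC Emotion Lexicon associates with each of the eight emotions. The file name and tab-separated format below reflect the publicly distributed word-level NRC lexicon but should be treated as assumptions, and this is not the authors' exact procedure:

```python
# Count NRC emotion associations per tweet. Assumes the word-level NRC
# Emotion Lexicon file with lines of the form: word<TAB>emotion<TAB>0|1.
from collections import Counter

EMOTIONS = {"anger", "anticipation", "disgust", "fear",
            "joy", "sadness", "surprise", "trust"}

def load_nrc(path="NRC-Emotion-Lexicon-Wordlevel-v0.92.txt"):  # assumed filename
    lexicon = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            word, emotion, flag = line.rstrip("\n").split("\t")
            if emotion in EMOTIONS and flag == "1":
                lexicon.setdefault(word, set()).add(emotion)
    return lexicon

def emotion_profile(tweet, lexicon):
    counts = Counter()
    for token in tweet.lower().split():
        counts.update(lexicon.get(token, ()))
    return counts

# Example (requires the lexicon file to be present locally):
# lex = load_nrc()
# print(emotion_profile("voters cheer the victory rally", lex))
```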



Author(s): Doo-San Kim, Byeong-Cheol Lee, Kwang-Hi Park

Despite the unique characteristics of urban forests, the motivating factors of urban forest visitors have not been clearly differentiated from those of visitors to other types of forest resources. This study aims to identify the motivating factors of urban forest visitors using latent Dirichlet allocation (LDA) topic modeling based on social big data. A total of 57,449 cases of social text data from blogs containing the keyword “urban forest” were collected from Naver and Daum, the major search engines in South Korea. Then, 17,229 cases were excluded through morpheme analysis and stop word elimination, and the remaining 40,110 cases were analyzed to identify the motivating factors of urban forest visitors through LDA topic modeling. Seven motivating factors—“Cafe-related Walk”, “Healing Trip”, “Daily Leisure”, “Family Trip”, “Wonderful View”, “Clean Space”, and “Exhibition and Photography”—were extracted, each containing five keywords. This study elucidates the role of urban forests as places for healing, leisure, and daily exercise. The results suggest that efforts should be made toward developing various programs around the basic functionality of urban forests as a natural resource and as a unique place that supports a diversity of leisure and cultural activities.
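The morpheme-analysis and stop-word step described above can be sketched with KoNLPy before fitting LDA. The tagger choice, stop-word list, and sample sentence are illustrative assumptions, not the study's exact pipeline:

```python
# Extract nouns from Korean blog text with KoNLPy's Okt tagger and drop
# stop words before building the LDA input. (Requires konlpy and a JVM.)
from konlpy.tag import Okt

okt = Okt()
stop_words = {"도시", "숲"}  # illustrative stop words ("city", "forest")

def preprocess(text):
    nouns = okt.nouns(text)  # morpheme-level noun extraction
    return [w for w in nouns if w not in stop_words and len(w) > 1]

sample = "도시숲에서 가족과 함께 산책하며 힐링했다"  # toy blog sentence
print(preprocess(sample))
```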



GIS Business, 2019, Vol 14 (6), pp. 597-606
Author(s): Dr. Maha Mustafa Omer Abdalaziz

The study addresses the technological developments taking place in the world, which have had an impact on all sectors and fields and have required business organizations and commercial companies to carry out their marketing and promotional activities within the electronic environment. The most prominent of these developments is the emergence of electronic advertising, which has opened a wide avenue for companies and businesspeople to advertise and promote their products and work easily through the Internet, which has become saturated with electronic advertising. In light of this, the study discusses the creative strategy used in electronic advertising.


