Modeling the public attitude towards organic foods: a big data and text mining approach

Abstract: This study aims to identify the topics that users post on Twitter about organic foods and to analyze the emotion-based sentiment of those tweets. The study addresses a call for the application of big data and text mining in different fields of research and proposes more objective research methods for studies on food consumption. Interest in understanding consumers' food choices is growing because of the food industry's predominant contribution to climate change. So far, customer attitudes towards organic food have been studied mostly with self-reported methods, such as questionnaires and interviews, which have many limitations. Therefore, in the present study, we used big data and text mining techniques as more objective methods to analyze the public attitude about organic foods. A total of 43,724 Twitter posts were extracted with the streaming Application Programming Interface (API). The Latent Dirichlet Allocation (LDA) algorithm was applied for topic modeling. A test of topic significance was performed to evaluate the quality of the topics. Public sentiment was analyzed based on the NRC emotion lexicon using the Syuzhet package. Topic modeling results showed that people discuss a variety of themes related to organic foods, such as plant-based diet, saving the planet, organic farming and standardization, authenticity, and food delivery. Sentiment analysis results suggest that people view organic foods positively, though there are also people who are skeptical about the claims that organic foods are natural and free from chemicals and pesticides. The study contributes to the field of consumer behavior by implementing research methods grounded in text mining and big data. The study also contributes to the advancement of research in the field of sustainable food consumption by providing a fresh perspective on public attitude toward organic foods, filling the gaps in existing literature and research.
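The lexicon-based emotion analysis mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the study's pipeline (which used the Syuzhet package in R with the NRC emotion lexicon); the tiny `TOY_LEXICON`, the sample tweets, and the function name `emotion_counts` are hypothetical and chosen only to show the counting idea.

```python
import re
from collections import Counter

# Hypothetical miniature lexicon for illustration only. The real NRC lexicon
# maps thousands of words to eight emotions plus positive/negative sentiment.
TOY_LEXICON = {
    "love": {"joy", "positive"},
    "healthy": {"trust", "positive"},
    "fresh": {"joy", "positive"},
    "fake": {"disgust", "negative"},
    "pesticides": {"fear", "negative"},
    "expensive": {"anger", "negative"},
}


def emotion_counts(texts, lexicon=TOY_LEXICON):
    """Count how many lexicon word occurrences fall under each label."""
    counts = Counter()
    for text in texts:
        # Lowercase and keep alphabetic tokens so "fake," matches "fake".
        for token in re.findall(r"[a-z']+", text.lower()):
            for label in lexicon.get(token, ()):
                counts[label] += 1
    return counts


# Made-up example tweets, not data from the study.
tweets = [
    "I love fresh organic vegetables",
    "Organic labels are fake, they still use pesticides",
    "Healthy but expensive",
]
scores = emotion_counts(tweets)
print(scores.most_common())
```

Summing the label counts over a corpus gives the kind of aggregate emotion profile that a lexicon approach reports; it ignores negation and context, which is one known limitation of this family of methods.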

A Study on Ways to Improve Mobile RPG Using Big Data Text Mining

10.20944/preprints202105.0601.v1 ◽ 2021

Author(s): DongHyun Youm ◽ JungYoon Kim

Keyword(s): Big Data ◽ Text Mining ◽ Mass Production ◽ Topic Modeling ◽ Latent Dirichlet Allocation ◽ Meaningful Information ◽ Mining Technique ◽ Production Type ◽ Google Play ◽ Play Experience

Because RPGs generate high sales and profits, many developers have released RPGs to the market, but the genre has shifted toward mass-produced titles with sensational advertising, low quality, excessive charging, and similar content, which affects the game market and users' play experience. The authors studied ways to improve mobile RPGs by crawling and analyzing user reviews on the Google Play Store. They applied topic modeling with LDA (Latent Dirichlet Allocation), a text mining technique, to extract meaningful information from the collected big data and visualized it. Drawing inferences from users' reviews, understanding their opinions objectively, and seeking ways to improve games can help improve mobile RPGs so that they can be played continuously.

Analysis and Visualization Latent Topic on COVID-19 Vaccine Tweet use two-stage topic modeling (Preprint)

10.2196/preprints.30290 ◽ 2021

Author(s): Faizah Faizah ◽ Bor-Shen Lin

Keyword(s): Topic Modeling ◽ Public Perception ◽ Latent Dirichlet Allocation ◽ World Health ◽ Two Stage ◽ The Public ◽ Global Pandemic ◽ Difficult Time ◽ Latent Topic ◽ Latent Topics

BACKGROUND: The World Health Organization (WHO) declared COVID-19 as a global pandemic on January 30, 2020. However, the pandemic is not over yet. Furthermore, in the first quarter of 2021, some countries faced a third wave of the pandemic. During this difficult time, the development of vaccines for COVID-19 has accelerated rapidly. Understanding the public perception of the COVID-19 vaccine from data collected on social media can widen the perspective on the state of the global pandemic. OBJECTIVE: This study explores and analyzes the latent topics in COVID-19 vaccine tweets posted by individuals from various countries by using two-stage topic modeling. METHODS: A two-stage analysis in topic modeling was proposed to investigate people's reactions in five countries. The first stage is Latent Dirichlet Allocation, which produces the latent topics with the corresponding term distributions and helps investigators understand the main issues or opinions. The second stage then performs agglomerative clustering on the latent topics based on Hellinger distance, which merges close topics hierarchically into topic clusters so that those topics can be visualized in either tree or graph views. RESULTS: In general, the topic discussion regarding the COVID-19 vaccine in the five countries is similar. Topic themes such as "first vaccine" and "vaccine effect" dominate the public discussion. Notably, people in some countries have distinctive topic themes, such as "politician opinion" and "stay home" in Canada, "emergency" in India, and "blood clots" in the United Kingdom. The analysis also shows the most popular COVID-19 vaccine, which is gaining more public interest. CONCLUSIONS: With LDA and hierarchical clustering, two-stage topic modeling is powerful for visualizing the latent topics and understanding the public perception regarding the COVID-19 vaccine.
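The second stage described in this abstract, agglomerative clustering of topics by Hellinger distance, can be sketched as follows. This is a hedged illustration: average linkage, the stopping `threshold`, and the toy topic-term distributions are assumptions, since the abstract does not state which linkage or cut-off was used.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    ) / math.sqrt(2)


def agglomerate(topics, threshold):
    """Average-linkage agglomerative clustering of topic distributions.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`. Returns lists of topic indices.
    """
    clusters = [[i] for i in range(len(topics))]

    def linkage(a, b):
        # Mean pairwise Hellinger distance between members of two clusters.
        total = sum(hellinger(topics[i], topics[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        dist, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical topic-term distributions over a four-word vocabulary.
topics = [
    [0.70, 0.20, 0.05, 0.05],
    [0.60, 0.30, 0.05, 0.05],
    [0.05, 0.05, 0.50, 0.40],
    [0.05, 0.05, 0.40, 0.50],
]
clusters = agglomerate(topics, threshold=0.3)
print(clusters)
```

Hellinger distance is bounded between 0 and 1, which makes a fixed merge threshold easier to interpret than with unbounded divergences; recording the merge order instead of stopping at a threshold would give the tree view the abstract mentions.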

SUPPORTING AIR TRANSPORT POLICIES USING BIG DATA ANALYTICS: A DESCRIPTIVE APPROACH BASED EMERGING TREND ANALYSIS

Journal of Air Transport Studies ◽ 10.38008/jats.v8i1.40 ◽ 2017 ◽ Vol 8 (1) ◽ pp. 51-72

Author(s): Jin-seo Park

Keyword(s): Qualitative Research ◽ Big Data ◽ Text Mining ◽ Research Methods ◽ Qualitative Research Methods ◽ Research Papers ◽ Emerging Trends ◽ The Future ◽ Descriptive Approach ◽ Core Issues

Qualitative research methods based on literature review or expert judgement have been used to find core issues, analyze emerging trends and discover promising areas for the future. Deriving results from large amounts of information under this approach is both costly and time-consuming. Besides, there is a risk that the results may be influenced by the subjective opinion of experts. To make up for such weaknesses, the analysis paradigm for choosing future emerging trends is undergoing a shift toward implementing qualitative research methods along with quantitative research methods like text mining in a mutually complementary manner. This change is being witnessed in recent studies in various areas such as the steel industry, the information and communications technology industry, the construction industry in architectural engineering and so on. This study focused on retrieving aviation-related core issues and promising areas for the future from research papers pertaining to overall aviation areas through text mining, which is one of the big data analysis techniques. This study has limitations in that its analysis for retrieving the aviation-related core issues and promising fields was restricted to research papers containing the keyword "aviation." However, it has significance in that it prepared a quantitative analysis model for continuously monitoring the derived core issues and emerging trends regarding the promising areas for the future in the aviation industry through the application of a big data-based descriptive approach.

Understanding Consumers' Perceptions of the Fresh-Food Delivery Platform Service Based on Big Data: Using Text Mining and Semantic Network Analysis

Korean Journal of Hospitality & Tourism ◽ 10.24992/kjht.2021.2.30.02.37 ◽ 2021 ◽ Vol 30 (2) ◽ pp. 37-52

Author(s): Jee-Won Kang ◽ Young Namkung

Keyword(s): Big Data ◽ Network Analysis ◽ Text Mining ◽ Semantic Network ◽ Food Delivery ◽ Semantic Network Analysis ◽ Fresh Food ◽ Delivery Platform

Research trends on Big Data in Marketing: A text mining and topic modeling based literature analysis

European Research on Management and Business Economics ◽ 10.1016/j.iedeen.2017.06.002 ◽ 2018 ◽ Vol 24 (1) ◽ pp. 1-7 ◽ Cited By ~ 69

Author(s): Alexandra Amado ◽ Paulo Cortez ◽ Paulo Rita ◽ Sérgio Moro

Keyword(s): Big Data ◽ Text Mining ◽ Topic Modeling ◽ Research Trends ◽ Literature Analysis

CLDA: An Effective Topic Model for Mining User Interest Preference under Big Data Background

Complexity ◽ 10.1155/2018/2503816 ◽ 2018 ◽ Vol 2018 ◽ pp. 1-10 ◽ Cited By ~ 5

Author(s): Lirong Qiu ◽ Jia Yu

Keyword(s): Big Data ◽ Topic Modeling ◽ Latent Dirichlet Allocation ◽ Topic Model ◽ User Interest ◽ Text Data ◽ Data Set ◽ Data Sparsity ◽ Short Text ◽ Text Filtering

In the current big data setting, how to mine useful information effectively is a pressing problem. The purpose of this study is to construct a more effective method of mining the interest preferences of users in a particular field. We mainly study a large body of user text data from microblogs. LDA is an effective text mining method, but it does not perform well when applied directly to the large number of short texts in microblogs. In current, more effective topic modeling practice, short texts need to be aggregated into long texts to avoid data sparsity. However, aggregated short texts are mixed with a lot of noise, reducing the accuracy of mining the user's interest preferences. In this paper, we propose Combining Latent Dirichlet Allocation (CLDA), a new topic model that can learn the potential topics of microblog short texts and long texts simultaneously. The data sparsity of short texts is avoided by aggregating long texts to assist in learning short texts. Short texts are in turn used to filter the long texts to improve mining accuracy, so that long texts and short texts are effectively combined. Experimental results on a real microblog data set show that CLDA outperforms many advanced models in mining user interest, and we also confirm that CLDA performs well in recommender systems.
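The aggregation step this abstract refers to, pooling short texts into longer pseudo-documents before topic modeling, can be sketched as below. This shows only the generic pooling idea, not the CLDA model itself; grouping by user, the function name `aggregate_by_user`, and the sample posts are assumptions for illustration.

```python
from collections import defaultdict


def aggregate_by_user(posts):
    """Pool each user's short posts into one long token list.

    `posts` is an iterable of (user_id, text) pairs. Longer pseudo-documents
    give a topic model more word co-occurrences per document than single posts.
    """
    grouped = defaultdict(list)
    for user, text in posts:
        grouped[user].extend(text.lower().split())
    return dict(grouped)


# Made-up microblog posts, not data from the study.
posts = [
    ("u1", "New phone camera review"),
    ("u2", "Marathon training today"),
    ("u1", "Phone battery test"),
    ("u2", "Running shoes review"),
]
long_docs = aggregate_by_user(posts)
print(long_docs)
```

As the abstract notes, pooling also mixes unrelated posts together, so the longer documents trade sparsity for noise.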

Sentiment Analysis and Topic Modeling of Indonesian Public Conversation about COVID-19 Epidemics on Twitter

IJID (International Journal on Informatics for Development) ◽ 10.14421/ijid.2021.2400 ◽ 2021 ◽ Vol 10 (1) ◽ pp. 23-30

Author(s): Muhammad Habibi ◽ Adri Priadana ◽ Muhammad Rifqi Ma’arif

Keyword(s): Public Health ◽ Social Media ◽ Topic Modeling ◽ Latent Dirichlet Allocation ◽ World Health ◽ The Public ◽ Response Strategies ◽ Existing Problems ◽ The World ◽ Health Organization

The World Health Organization (WHO) reported that the COVID-19 outbreak had resulted in more than six million confirmed cases and more than 371,000 deaths globally as of June 1, 2020. The outbreak sparked a flood of scientific research to help society deal with the virus, both inside and outside the medical domain. Research on public health analysis and public conversations about the spread of COVID-19 on social media is one of the focal points for researchers worldwide. Information from social media can be analyzed as supporting data about public health. Analyzing public conversations helps the relevant authorities understand public opinion and the information gaps between them and the public, so that they can develop appropriate emergency response strategies to address existing problems in the community during the pandemic, and it provides information on the population's emotions in different contexts. However, research on public health and public conversations has so far been conducted only through supervised analysis of textual data. In this study, we aim to analyze the sentiment and topics of Indonesian public conversations about COVID-19 on Twitter using NLP techniques. We applied several methods to analyze the sentiment in order to obtain the best classification method. The topic modeling was carried out in an unsupervised manner using Latent Dirichlet Allocation (LDA). The results of this study reveal that the most frequently discussed topic related to the COVID-19 pandemic is economic issues.

Iranian COVID-19 Publications in LitCovid: Text Mining and Topic Modeling

Scientific Programming ◽ 10.1155/2021/3315695 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-12

Author(s): Meisam Dastani ◽ Farshid Danesh

Keyword(s): Case Report ◽ Text Mining ◽ Topic Modeling ◽ Latent Dirichlet Allocation ◽ Subject Area ◽ Scientific Publications ◽ Statistical Population ◽ Strategic Issues ◽ Number Of Publications ◽ And Control

COVID-19 is a threat to the lives of people all over the world. As a result of the new and unknown nature of COVID-19, much research has been conducted recently. In order to increase and enhance the growth rate of Iranian publications on COVID-19, this article aims to analyze these publications in LitCovid to identify the topical and content structure and topic modeling of scientific publications in the mentioned subject area. The present article is applied research performed by using an analytical approach as well as text mining techniques. The statistical population is all the publications of Iranian researchers in LitCovid. Latent Dirichlet Allocation (LDA) and Python were used to analyze the data and implement text mining and topic modeling algorithms. Data analysis shows that the percentage of Iranian publications in the eight topical groups in LitCovid is as follows: prevention (39.57%), treatment (18.99%), diagnosis (18.99%), forecasting (7.83%), case report (6.52%), mechanism (3.91%), transmission (3.62%), and general (0.58%). The results indicate that patient, pandemic, outbreak, case, Iranian, model, care, health, coronavirus, and disease are the most important words in the publications of Iranian researchers in LitCovid. Six topics for prevention; four topics for treatment and case report and forecasting; three topics for diagnosis, mechanism, and transmission in general have been obtained by implementing the topic modeling algorithm. Most of the Iranian publications in LitCovid are related to the topic “pandemic status,” with 22.47% in the prevention category, and the lowest number of publications is related to the topic “environment,” with 11.11% in the transmission category. The present study indicates a better understanding of essential and strategic issues of Iranian publications in LitCovid. The results reveal that many Iranian studies on COVID-19 were primarily on the issues related to prevention, management, and control. These findings provided a structured and research-based viewpoint of COVID-19 in Iran to guide researchers and policymakers.

An Efficient Topic Modeling Approach for Text Mining and Information Retrieval through K-means Clustering

Mehran University Research Journal of Engineering and Technology ◽ 10.22581/muet1982.2001.20 ◽ 2020 ◽ Vol 39 (1) ◽ pp. 213-222

Author(s): Junaid Rashid ◽ Syed Muhammad Adnan Shah ◽ Aun Irtaza

Keyword(s): Information Retrieval ◽ Text Mining ◽ Topic Modeling ◽ Clustering Algorithm ◽ Latent Dirichlet Allocation ◽ Semantic Analysis ◽ State Of The Art ◽ Text Documents ◽ New Perspective ◽ Better Than

Topic modeling is an effective text mining and information retrieval approach to organizing knowledge with various contents under a specific topic. Text documents in the form of news articles are increasing very fast on the web. Analysis of these documents is very important in the fields of text mining and information retrieval. Extracting meaningful information from these documents is a challenging task. One approach for discovering the themes of text documents is topic modeling, but this approach still needs a new perspective to improve its performance. In topic modeling, documents have topics, and topics are collections of words. In this paper, we propose a new k-means topic modeling (KTM) approach that uses the k-means clustering algorithm. KTM discovers better semantic topics from a collection of documents. Experiments on two real-world datasets, Reuters-21578 and BBC News, show that KTM performs better than state-of-the-art topic models such as LDA (Latent Dirichlet Allocation) and LSA (Latent Semantic Analysis). KTM is also applicable to classification and clustering tasks in text mining and achieves higher performance than its competitors LDA and LSA.
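The general idea of deriving topics from k-means clusters can be sketched as follows. The abstract does not describe KTM's actual procedure, so this is only a generic stand-in: documents are represented as length-normalized term-frequency vectors, clustered with plain k-means, and each centroid's heaviest terms are read as a topic. The function names, the fixed initial centroids, and the toy documents are all hypothetical.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Turn tokenized, non-empty documents into unit-length term-frequency vectors."""
    vocab = sorted({w for doc in docs for w in doc})
    vecs = []
    for doc in docs:
        counts = Counter(doc)
        norm = math.sqrt(sum(c * c for c in counts.values()))
        vecs.append([counts[w] / norm for w in vocab])
    return vocab, vecs


def kmeans(vecs, init_indices, iters=20):
    """Plain k-means with squared Euclidean distance.

    Centroids start at the vectors named by `init_indices` so the sketch is
    deterministic; real use would typically pick a randomized initialization.
    """
    centroids = [vecs[i][:] for i in init_indices]
    labels = [0] * len(vecs)
    for _ in range(iters):
        # Assign every document to its nearest centroid.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(v, centroids[k])),
            )
            for v in vecs
        ]
        # Move each centroid to the mean of its assigned documents.
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vecs, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def centroid_terms(centroid, vocab, n=2):
    """Read a cluster's topic as the n heaviest terms of its centroid."""
    order = sorted(range(len(vocab)), key=lambda i: -centroid[i])
    return [vocab[i] for i in order[:n]]


# Made-up tokenized news snippets, not the Reuters or BBC data.
docs = [
    ["oil", "price", "market", "oil"],
    ["market", "price", "stock"],
    ["match", "goal", "team", "goal"],
    ["team", "match", "coach"],
]
vocab, vecs = tf_vectors(docs)
labels, centroids = kmeans(vecs, init_indices=[0, 2])
print(labels, [centroid_terms(c, vocab) for c in centroids])
```

Unlike LDA, this assigns each document to exactly one cluster, so a document cannot mix several topics; whether KTM shares that property is not stated in the abstract.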

What’s yours is mine: exploring customer voice on Airbnb using text-mining approaches

Journal of Consumer Marketing ◽ 10.1108/jcm-02-2018-2581 ◽ 2019 ◽ Vol 36 (5) ◽ pp. 655-665 ◽ Cited By ~ 10

Author(s): Jurui Zhang

Keyword(s): Content Analysis ◽ Text Mining ◽ Topic Modeling ◽ Latent Dirichlet Allocation ◽ Local Community ◽ Negative Emotion ◽ Online Reviews ◽ Sharing Economy ◽ Consumer Reviews ◽ Content Type

Purpose: This paper aims to investigate customers’ experiences with Airbnb by text-mining customer reviews posted on the platform and comparing the extracted topics from online reviews between Airbnb and the traditional hotel industry using topic modeling. Design/methodology/approach: This research uses text-mining approaches, including content analysis and topic modeling (latent Dirichlet allocation method), to examine 1,026,988 Airbnb guest reviews of 50,933 listings in seven cities in the USA. Findings: The content analysis shows that negative reviews are more authentic and credible than positive reviews on Airbnb and that the occurrence of social words is positively related to positive emotion in reviews, but negatively related to negative emotion in reviews. A comparison of reviews on Airbnb and hotel reviews shows unique topics on Airbnb, namely, “late check-in”, “patio and deck view”, “food in kitchen”, “help from host”, “door lock/key”, “sleep/bed condition” and “host response”. Research limitations/implications: The topic modeling result suggests that Airbnb guests want to get to know and connect with the local community; thus, help from hosts on ways they can authentically experience the local community would be beneficial. In addition, the results suggest that customers emphasize their interaction with hosts; thus, to improve customer satisfaction, Airbnb hosts should interact with guests and respond to guests’ inquiries quickly. Practical implications: Hotel managers should design marketing programs that fulfill customers’ desire for authentic and local experiences. The results also suggest that peer-to-peer accommodation platforms should improve online review systems to facilitate authentic reviews and help guests have a smooth check-in process. Originality/value: This study is one of the first to examine consumer reviews in detail in the sharing economy and compare topics from consumer reviews between Airbnb and hotels.
