What is the Conversation About?

2012 ◽  
Vol 8 (1) ◽  
pp. 10-25 ◽  
Author(s):  
Stefan Sommer ◽  
Andreas Schieber ◽  
Kai Heinrich ◽  
Andreas Hilbert

In social commerce, customers have become an important information source for companies. Customers use Web 2.0 communication platforms such as Twitter to express their sentiments about products or to discuss their experiences with them. These sentiments can be very important for the development of products or the enhancement of marketing strategies. The research goal is to analyze customer sentiments on Twitter. The first step in the research is the detection of topics in Twitter entries that contain patterns of interest. For topic detection, the authors use Latent Dirichlet Allocation. The authors found event-based topics in the exemplary context of Sony’s 3D TV sets. In future work, the authors will implement sentiment analysis algorithms in order to determine sentiments in the entries corresponding to the detected topics.
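
As a rough illustration of what topic detection with Latent Dirichlet Allocation involves, the sketch below fits LDA by collapsed Gibbs sampling using only the Python standard library. It is not the authors' implementation: the function names, the hyperparameters (`alpha`, `beta`, `iterations`), and the toy tweets are all assumptions made for this example.

```python
import random


def lda_gibbs(docs, num_topics, iterations=50, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling.

    docs is a list of token lists. Returns (vocab, topic_word, doc_topic),
    where each row of topic_word and doc_topic is a probability distribution.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic_counts = [[0] * num_topics for _ in docs]
    topic_word_counts = [[0] * vocab_size for _ in range(num_topics)]
    topic_totals = [0] * num_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for w in doc:
            z = rng.randrange(num_topics)
            doc_assignments.append(z)
            doc_topic_counts[d][z] += 1
            topic_word_counts[z][word_id[w]] += 1
            topic_totals[z] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic_counts[d][z] -= 1
                topic_word_counts[z][wid] -= 1
                topic_totals[z] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic_counts[d][k] + alpha)
                    * (topic_word_counts[k][wid] + beta)
                    / (topic_totals[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic_counts[d][z] += 1
                topic_word_counts[z][wid] += 1
                topic_totals[z] += 1

    # Turn the final counts into smoothed probability estimates.
    topic_word = [
        [(c + beta) / (topic_totals[k] + vocab_size * beta) for c in row]
        for k, row in enumerate(topic_word_counts)
    ]
    doc_topic = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for doc, row in zip(docs, doc_topic_counts)
    ]
    return vocab, topic_word, doc_topic


def top_words(vocab, topic_word, n=3):
    """Return the n most probable words of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in topic_word
    ]


# Made-up, pre-tokenized tweets used only to exercise the code.
tweets = [
    ["sony", "3d", "tv", "launch", "event"],
    ["3d", "tv", "price", "drop"],
    ["launch", "event", "sony", "stage"],
    ["price", "drop", "tv", "deal"],
]
vocab, topic_word, doc_topic = lda_gibbs(tweets, num_topics=2)
topics = top_words(vocab, topic_word)
```

On a real corpus, tokenization, stop-word removal, and the choice of the number of topics would matter far more than in this toy setting, and a maintained library implementation would normally be preferred over a hand-written sampler.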

AI ◽  
2021 ◽  
Vol 2 (4) ◽  
pp. 578-599
Author(s):  
Fuad Alattar ◽  
Khaled Shaalan

Comparing two sets of documents to identify new topics is useful in many applications, like discovering trending topics from sets of scientific papers, emerging topic detection in microblogs, and interpreting sentiment variations in Twitter. In this paper, the main topic-modeling-based approaches to address this task are examined to identify limitations and necessary enhancements. To overcome these limitations, we introduce two separate frameworks to discover emerging topics through a filtered latent Dirichlet allocation (filtered-LDA) model. The model acts as a filter that identifies old topics from a timestamped set of documents, removes all documents that focus on old topics, and keeps documents that discuss new topics. Filtered-LDA also genuinely reduces the chance of using keywords from old topics to represent emerging topics. The final stage of the filter uses multiple topic visualization formats to improve human interpretability of the filtered topics, and it presents the most-representative document for each topic.
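
The document-removal step described in this abstract can be pictured with a small sketch. It is a simplification and not the authors' filtered-LDA frameworks: it assumes document-topic distributions have already been inferred by some topic model, and the function name `filter_new_documents`, the `threshold` parameter, and the sample numbers are hypothetical.

```python
def filter_new_documents(doc_topic, old_topics, threshold=0.5):
    """Return indices of documents that do not focus on old topics.

    doc_topic is a list of per-document topic distributions, old_topics is
    a set of topic indices judged to be old, and a document is dropped when
    its total probability mass on old topics reaches the threshold.
    """
    kept = []
    for index, distribution in enumerate(doc_topic):
        old_mass = sum(distribution[k] for k in old_topics)
        if old_mass < threshold:
            kept.append(index)
    return kept


# Made-up distributions over three topics; topics 0 and 1 are treated as old.
doc_topic = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
]
kept = filter_new_documents(doc_topic, old_topics={0, 1})
```

Only the third document survives the filter here, because most of its probability mass lies on the topic that is not marked as old.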


2021 ◽  
Author(s):  
Jin-Ah Sim ◽  
Soowon Park

BACKGROUND Online inquiry platforms, where a person can anonymously ask questions, have become an important information source for those who are concerned about the social stigma and discrimination that follow mental disorders. Therefore, examining what people inquire about regarding mental disorders would be useful when designing educational programs for communities. OBJECTIVE The present study aimed to examine the contents of the queries regarding mental disorders that were posted on online inquiry platforms. METHODS A total of 4,714 relevant queries from the two major online inquiry platforms were collected. We computed word frequencies and centralities and applied latent Dirichlet allocation (LDA) topic modeling. RESULTS Words such as "symptom", "hospital", and "treatment" ranked as the most frequently used, and the word "my" had the highest centrality. LDA identified four topics: (1) understanding general symptoms, (2) disability grading system and welfare entitlement, (3) stressful life events, and (4) social adaptation with mental disorders. CONCLUSIONS People are interested in practical information concerning mental disorders, such as social benefits, social adaptation, and more general information about the symptoms and the treatments. Our findings suggest that instructions encompassing different scopes of information are needed when developing educational programs.


2021 ◽  
Vol 9 ◽  
Author(s):  
Soowon Park ◽  
Yaeji Kim-Knauss ◽  
Jin-ah Sim

Online inquiry platforms, where a person can anonymously ask questions, have become an important information source for those who are concerned about the social stigma and discrimination that follow mental disorders. Therefore, examining what people inquire about regarding mental disorders would be useful when designing educational programs for communities. The present study aimed to examine the contents of the queries regarding mental disorders that were posted on online inquiry platforms. A total of 4,714 relevant queries from the two major online inquiry platforms were collected. We computed word frequencies and centralities and applied latent Dirichlet allocation (LDA) topic modeling. Words such as "symptom", "hospital", and "treatment" ranked as the most frequently used, and the word "my" had the highest centrality. LDA identified four latent topics: (1) the understanding of general symptoms, (2) a disability grading system and welfare entitlement, (3) stressful life events, and (4) social adaptation with mental disorders. People are interested in practical information concerning mental disorders, such as social benefits, social adaptation, and more general information about the symptoms and the treatments. Our findings suggest that instructions encompassing different scopes of information are needed when developing educational programs.
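
Word frequency and centrality, as mentioned in this abstract, can be sketched as follows. The abstract does not say which centrality measure was used, so this example assumes degree centrality on a word co-occurrence network in which two words are linked when they appear in the same query. The function names and the sample queries are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def word_frequencies(docs):
    """Count how often each word occurs across all tokenized documents."""
    return Counter(word for doc in docs for word in doc)


def degree_centrality(docs):
    """Degree centrality of each word in a document-level co-occurrence network.

    A word's centrality is the number of distinct words it co-occurs with,
    divided by the number of other words in the network. Words that never
    co-occur with another word do not appear in the result.
    """
    neighbors = {}
    for doc in docs:
        for a, b in combinations(sorted(set(doc)), 2):
            neighbors.setdefault(a, set()).add(b)
            neighbors.setdefault(b, set()).add(a)
    node_count = len(neighbors)
    if node_count < 2:
        return {word: 0.0 for word in neighbors}
    return {word: len(linked) / (node_count - 1) for word, linked in neighbors.items()}


# Made-up, pre-tokenized queries used only to exercise the code.
queries = [
    ["my", "symptom", "hospital"],
    ["my", "treatment", "hospital"],
    ["my", "symptom", "welfare"],
]
frequencies = word_frequencies(queries)
centrality = degree_centrality(queries)
```

In this toy data "my" appears in every query and is linked to all other words, which mirrors, in a very small way, how a frequent function-like word can end up with the highest centrality.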


Information ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 401
Author(s):  
Girma Neshir ◽  
Andreas Rauber ◽  
Solomon Atnafu

Topic modeling is a statistical process that derives the latent themes from extensive collections of text. Three approaches to topic modeling exist, namely, unsupervised, semi-supervised, and supervised. In this work, we develop a supervised topic model for an Amharic corpus. We also investigate the effect of stemming on topic detection using Term Frequency Inverse Document Frequency (TF-IDF) features, Latent Dirichlet Allocation (LDA) features, and a combination of these two feature sets with four supervised machine learning tools: Support Vector Machine (SVM), Naive Bayesian (NB), Logistic Regression (LR), and Neural Nets (NN). We evaluate our approach using an Amharic corpus of 14,751 documents in ten topic categories. Both qualitative and quantitative analyses of the results show that the proposed supervised topic detection performs best with SVM, reaching an accuracy of 88% using state-of-the-art TF-IDF word features with the Synthetic Minority Over-sampling Technique (SMOTE) and no stemming. The results show that text features with stemming slightly improve the performance of the topic classifier over features with no stemming.
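
For readers unfamiliar with TF-IDF features, the sketch below computes them in their plain, unsmoothed form: term frequency within a document multiplied by the logarithm of the inverse document frequency. This is an assumption about the weighting variant, since the abstract does not state one and library implementations often apply smoothing. The function name and the tiny corpus are illustrative only.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {word: weight} mapping per tokenized document.

    weight = (count of word in document / document length)
             * log(number of documents / number of documents containing word)
    """
    doc_count = len(docs)
    document_frequency = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            word: (count / len(doc)) * math.log(doc_count / document_frequency[word])
            for word, count in counts.items()
        })
    return vectors


# Placeholder tokens stand in for real (for example, Amharic) words.
corpus = [
    ["a", "b"],
    ["a", "c"],
]
vectors = tfidf(corpus)
```

A word that occurs in every document, such as "a" here, receives a weight of zero, which is why TF-IDF tends to emphasize words that distinguish one topic category from another. Feeding such vectors to a classifier such as an SVM is a separate step that this sketch does not cover.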


2021 ◽  
pp. 1-16
Author(s):  
Ibtissem Gasmi ◽  
Mohamed Walid Azizi ◽  
Hassina Seridi-Bouchelaghem ◽  
Nabiha Azizi ◽  
Samir Brahim Belhaouari

A Context-Aware Recommender System (CARS) suggests more relevant services by adapting them to the user’s specific context. Nevertheless, using many contextual factors can increase data sparsity, while using too few fails to capture contextual effects in recommendations. Moreover, several CARSs are based on similarity algorithms, such as cosine and Pearson correlation coefficients, which are not very effective on sparse datasets. This paper presents a context-aware model that integrates contextual factors into the prediction process when there are insufficient co-rated items. The proposed algorithm uses Latent Dirichlet Allocation (LDA) to learn the latent interests of users from the textual descriptions of items. Then, it integrates both the explicit contextual factors and their degree of importance into the prediction process by introducing a weighting function. The PSO algorithm is employed to learn and optimize the weights of these features. The results on the MovieLens 1M dataset show that the proposed model can achieve an F-measure of 45.51% with a precision of 68.64%. Furthermore, the improvements in MAE and RMSE reach 41.63% and 39.69%, respectively, compared with state-of-the-art techniques.


2021 ◽  
Vol 16 (4) ◽  
pp. 1042-1065
Author(s):  
Anne Gottfried ◽  
Caroline Hartmann ◽  
Donald Yates

The business intelligence (BI) market has grown at a tremendous rate in the past decade due to technological advancements, big data, and the availability of open source content. Despite this growth, the use of open government data (OGD) as a source of information is very limited in the private sector due to a lack of knowledge of its benefits. Scant evidence on the use of OGD by private organizations suggests that it can lead to the creation of innovative ideas as well as assist in making better-informed decisions. Given the benefits but lack of use of OGD to generate business intelligence, we extend research in this area by exploring how OGD can be used to generate business intelligence for the identification of market opportunities and strategy formulation, an area of research that is still in its infancy. Using a two-industry case study approach (footwear and lumber), we use latent Dirichlet allocation (LDA) topic modeling to extract emerging topics in these two industries from OGD, and a data visualization tool (pyLDAvis) to visualize the topics in order to interpret and transform the data into business intelligence. Additionally, we perform environmental scanning for the two industries to validate the usability of the information obtained. The results provide evidence that OGD can be a valuable source of information for generating business intelligence and demonstrate how topic modeling and visualization tools can assist organizations in extracting and analyzing information for the identification of market opportunities.


2021 ◽  
Vol 13 (5) ◽  
pp. 2876
Author(s):  
Anne Parlina ◽  
Kalamullah Ramli ◽  
Hendri Murfi

The literature discussing the concepts, technologies, and ICT-based urban innovation approaches of smart cities has been growing, along with initiatives from cities all over the world that are competing to improve their services and become smart and sustainable. However, studies that provide a comprehensive understanding of smart and sustainable city research and reveal its trends and characteristics are still lacking. Meanwhile, policymakers and practitioners alike need to pursue progressive development. In response to this shortcoming, this research offers content analysis studies based on topic modeling approaches to capture the evolution and characteristics of topics in the scientific literature on smart and sustainable city research. More importantly, a novel topic-detecting algorithm based on deep learning and clustering techniques, namely deep autoencoders-based fuzzy C-means (DFCM), is introduced for analyzing research topic trends. The topics generated by the proposed algorithm have relatively higher coherence values than those generated by previously used topic detection methods, namely non-negative matrix factorization (NMF), latent Dirichlet allocation (LDA), and eigenspace-based fuzzy C-means (EFCM). The 30 main topics that appeared in topic modeling with the DFCM algorithm were classified into six groups (technology, energy, environment, transportation, e-governance, and human capital and welfare) that characterize the six dimensions of smart, sustainable city research.


2021 ◽  
Author(s):  
Faizah Faizah ◽  
Bor-Shen Lin

BACKGROUND The World Health Organization (WHO) declared COVID-19 a global pandemic on January 30, 2020. However, the pandemic is not over yet. Furthermore, in the first quarter of 2021, some countries faced a third wave of the pandemic. During this difficult time, the development of vaccines for COVID-19 has accelerated rapidly. Understanding the public perception of the COVID-19 vaccine from data collected on social media can widen the perspective on the state of the global pandemic. OBJECTIVE This study explores and analyzes the latent topics in COVID-19 vaccine tweets posted by individuals from various countries by using two-stage topic modeling. METHODS A two-stage analysis in topic modeling was proposed to investigate people’s reactions in five countries. The first stage is Latent Dirichlet Allocation, which produces the latent topics with the corresponding term distributions that help investigators understand the main issues or opinions. The second stage then performs agglomerative clustering on the latent topics based on Hellinger distance, which merges close topics hierarchically into topic clusters so that those topics can be visualized in either tree or graph views. RESULTS In general, the topic discussion regarding the COVID-19 vaccine is similar across the five countries. Topic themes such as "first vaccine" and "vaccine effect" dominate the public discussion. Notably, some countries show their own topic themes, such as "politician opinion" and "stay home" in Canada, "emergency" in India, and "blood clots" in the United Kingdom. The analysis also shows the most popular COVID-19 vaccine, which is gaining more public interest. CONCLUSIONS With LDA and hierarchical clustering, two-stage topic modeling is powerful for visualizing the latent topics and understanding the public perception regarding the COVID-19 vaccine.
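
The second stage described in this abstract, agglomerative clustering of topics by Hellinger distance, can be sketched as below. The Hellinger distance formula is standard; everything else is an assumption for illustration, in particular the single-linkage merge rule (the abstract does not name a linkage criterion), the function names, and the made-up topic distributions.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    return math.sqrt(
        0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    )


def agglomerate(topics):
    """Merge topics bottom-up and return the merge history.

    Each history entry is (cluster_a, cluster_b, distance), where clusters
    are tuples of topic indices. Single linkage is assumed: the distance
    between two clusters is the smallest distance between their members.
    """
    clusters = [(i,) for i in range(len(topics))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                distance = min(
                    hellinger(topics[a], topics[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or distance < best[0]:
                    best = (distance, i, j)
        distance, i, j = best
        merges.append((clusters[i], clusters[j], distance))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Made-up term distributions for three topics over a three-word vocabulary.
topic_term = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
]
merges = agglomerate(topic_term)
```

The merge history is the information a tree (dendrogram) view would be drawn from: the two similar topics are merged first, and the dissimilar third topic joins last at a larger distance.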


2021 ◽  
Author(s):  
Shimon Ohtani

The importance of biodiversity conservation is gradually being recognized worldwide, and 2020 was the final year of the Aichi Biodiversity Targets formulated at the 10th Conference of the Parties to the Convention on Biological Diversity (COP10) in 2010. Unfortunately, the majority of the targets were assessed as unachievable. While it is essential to measure public awareness of biodiversity when setting the post-2020 targets, it is also a difficult task to propose a method to do so. This study provides a diachronic exploration of the discourse on “biodiversity” from 2010 to 2020, using Twitter posts, in combination with sentiment analysis and topic modeling, which are commonly used in data science. Through the aggregation and comparison of n-grams, the visualization of eight types of emotional tendencies using the NRC emotion lexicon, the construction of topic models using latent Dirichlet allocation (LDA), and the qualitative analysis of tweet texts based on these models, I was able to classify and analyze unstructured tweets in a meaningful way. The results revealed the evolution of words used with “biodiversity” on Twitter over the past decade, the emotional tendencies behind the contexts in which “biodiversity” has been used, and the approximate content of tweet texts that have constituted topics with distinctive characteristics. While the search for people's awareness through SNS analysis still has many limitations, it is undeniable that important suggestions can be obtained. In order to further refine the research method, it will be essential to improve the skills of analysts and accumulate research examples as well as to advance data science.
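
The aggregation and comparison of n-grams mentioned in this abstract might look roughly like the following sketch, which counts bigrams in two sets of tweets and ranks them by how much their counts changed. The function names and the two tiny tweet sets are hypothetical, and the study's actual preprocessing and comparison method are not described in the abstract.

```python
from collections import Counter


def ngram_counts(tokenized_texts, n=2):
    """Count every n-gram (as a tuple of tokens) across the given texts."""
    counts = Counter()
    for tokens in tokenized_texts:
        counts.update(
            tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
        )
    return counts


def count_changes(earlier, later):
    """Return (change, ngram) pairs, largest increase first."""
    ngrams = set(earlier) | set(later)
    return sorted(((later[g] - earlier[g], g) for g in ngrams), reverse=True)


# Made-up, pre-tokenized tweets standing in for two different years.
tweets_2010 = [["protect", "biodiversity", "now"]]
tweets_2020 = [
    ["biodiversity", "loss", "crisis"],
    ["biodiversity", "loss", "report"],
]
changes = count_changes(ngram_counts(tweets_2010), ngram_counts(tweets_2020))
```

Here the bigram ("biodiversity", "loss") shows the largest increase, which is the kind of shift in co-occurring words that a diachronic comparison is meant to surface.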

