topic discovery Latest Research Papers

Big Data creates many challenges for data mining experts, particularly in extracting meaning from text data. Text mining benefits from building a bridge between the word embedding process and the capacity of graphs to connect the dots and represent complex correlations between entities. In this study we examine the process of building a semantic graph model to determine word associations and discover document topics. We introduce a novel Word2Vec2Graph model built on top of the Word2Vec word embedding model. We demonstrate how this model can be used to analyze long documents, find unexpected word associations, and uncover document topics. To validate the topic discovery method, we transform words into vectors and vectors into images, and then apply CNN deep learning image classification.

Detect Text Topics by Semantics Graphs

10.5121/csit.2021.110806 ◽

2021 ◽

Author(s):

Alex Romanova

Keyword(s):

Deep Learning ◽

Image Classification ◽

Graph Model ◽

Word Embedding ◽

Topic Analysis ◽

Topic Discovery ◽

Model Finding ◽

As Graph ◽

Discovery Method ◽

Graph Capacity

Document topic analysis benefits from building a bridge between the word embedding process and the capacity of graphs to connect the dots and represent complex correlations between entities. In this study we examine the process of building a semantic graph model, finding document topics, and validating topic discovery. We introduce a novel Word2Vec2Graph model built on top of the Word2Vec word embedding model. We demonstrate how this model can be used to analyze long documents and uncover document topics as graph clusters. To validate the topic discovery method, we transform words into vectors and vectors into images, and then apply deep learning image classification.
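
The idea of topics as graph clusters can be illustrated with a minimal sketch. This is not the author's Word2Vec2Graph implementation: the abstract does not say how edges are built or how clusters are found. The sketch assumes that words are linked when the cosine similarity of their vectors reaches a threshold, and that each connected component is read as one topic. The vectors in `toy_vectors`, the function names, and the 0.9 threshold are all hypothetical; real Word2Vec embeddings would take the place of the toy vectors.

```python
import math
from itertools import combinations

# Hypothetical 3-dimensional vectors standing in for Word2Vec embeddings.
toy_vectors = {
    "graph": [0.9, 0.1, 0.0],
    "node": [0.8, 0.2, 0.1],
    "edge": [0.85, 0.15, 0.05],
    "virus": [0.0, 0.9, 0.2],
    "vaccine": [0.1, 0.8, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def build_graph(vectors, threshold):
    """Link every pair of words whose similarity reaches the threshold (assumed rule)."""
    graph = {word: set() for word in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the connected components of an undirected graph as a list of sets."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            word = stack.pop()
            if word in seen:
                continue
            seen.add(word)
            component.add(word)
            stack.extend(graph[word] - seen)
        components.append(component)
    return components


# Each component is treated as one candidate topic.
topics = connected_components(build_graph(toy_vectors, threshold=0.9))
for topic in topics:
    print(sorted(topic))
```

With these toy vectors the graph-related words and the health-related words fall into separate components. Connected components are the simplest possible clustering choice; a community detection algorithm could replace them without changing the rest of the sketch.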

Public opinion hot topic discovery based on improved bacterial foraging algorithm

2021 IEEE International Conference on Artificial Intelligence and Industrial Design (AIID) ◽

10.1109/aiid51893.2021.9456485 ◽

2021 ◽

Author(s):

Zhang Yipeng ◽

Gao Xiang ◽

Paul- Mengvi Gatpandan ◽

Maryli F. Rosas ◽

Daniel Dasig

Keyword(s):

Public Opinion ◽

Topic Discovery ◽

Bacterial Foraging ◽

Bacterial Foraging Algorithm

Topic Discovery on Farsi, English, French, and Arabic Tweets Related to COVID-19 Using Text Mining Techniques

Navigating Healthcare Through Challenging Times - Studies in Health Technology and Informatics ◽

10.3233/shti210084 ◽

2021 ◽

Author(s):

Hamoon Jafarian ◽

Mahin Mohammadi ◽

Alireza Javaheri ◽

Makram Sukarieh ◽

Mohsen Yoosefi Nejad ◽

...

Keyword(s):

Public Health ◽

Social Networks ◽

Middle East ◽

Text Mining ◽

North Africa ◽

Common Theme ◽

Good Source ◽

Topic Discovery ◽

Global Threat

Background: Social networks are a good source for monitoring public health during the COVID-19 outbreak, and they play an important role in identifying useful information. Objectives: This study compares the public’s reaction on Twitter across the countries of West Asia (a.k.a. the Middle East) and North Africa in order to understand their responses to the same global threat. Methods: 766,630 tweets in four languages (Arabic, English, French, and Farsi), posted in March 2020, were investigated. Results: The results indicate that the only common theme among all languages is “government responsibilities (political)”, which indicates the importance of this subject for all nations. Conclusion: Although nations react similarly in some respects, they respond differently in others; therefore, policy localization is a vital step in confronting problems such as the COVID-19 pandemic.

SenU-PTM: a novel phrase-based topic model for short-text topic discovery by exploiting word embeddings

Data Technologies and Applications ◽

10.1108/dta-02-2021-0039 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Heng-Yang Lu ◽

Yi Zhang ◽

Yuntao Du

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Second Phase ◽

Word Embeddings ◽

Two Phase ◽

Content Type ◽

Short Text ◽

Topic Discovery ◽

Two Phases ◽

Sense Unit

Purpose: Topic models have been widely applied to discover important information from vast amounts of unstructured data. Traditional long-text topic models such as Latent Dirichlet Allocation may suffer from the sparsity problem when dealing with short texts, which mostly come from the Web. These models also suffer from a readability problem when displaying the discovered topics. The purpose of this paper is to propose a novel model called the Sense Unit based Phrase Topic Model (SenU-PTM) that addresses both the sparsity and readability problems. Design/methodology/approach: SenU-PTM is a novel phrase-based short-text topic model under a two-phase framework. The first phase introduces a phrase-generation algorithm that exploits word embeddings to generate phrases from the original corpus. The second phase introduces a new concept, the sense unit, which consists of a set of semantically similar tokens and is used to model topics with the token vectors generated in the first phase. Finally, SenU-PTM infers topics based on these two phases. Findings: Experimental results on two real-world, publicly available datasets show the effectiveness of SenU-PTM from the perspectives of topical quality and document characterization. They reveal that modeling topics on sense units can address the sparsity of short texts and improve the readability of topics at the same time. Originality/value: The originality of SenU-PTM lies in the new procedure of modeling topics on the proposed sense units with word embeddings for short-text topic discovery.

Classification aware neural topic model for COVID-19 disinformation categorisation

PLoS ONE ◽

10.1371/journal.pone.0247086 ◽

2021 ◽

Vol 16 (2) ◽

pp. e0247086

Author(s):

Xingyi Song ◽

Johann Petrak ◽

Ye Jiang ◽

Iknoor Singh ◽

Diana Maynard ◽

...

Keyword(s):

Public Health ◽

Topic Model ◽

Medical Science ◽

Policy Makers ◽

Media Type ◽

Topic Discovery ◽

Extensive Analysis ◽

Public Health Messages ◽

Fact Checking ◽

Effective Public Health

The explosion of disinformation accompanying the COVID-19 pandemic has overloaded fact-checkers and media worldwide, and brought a new major challenge to government responses worldwide. Not only is disinformation creating confusion about medical science amongst citizens, but it is also amplifying distrust in policy makers and governments. To help tackle this, we developed computational methods to categorise COVID-19 disinformation. The COVID-19 disinformation categories could be used for a) focusing fact-checking efforts on the most damaging kinds of COVID-19 disinformation; b) guiding policy makers who are trying to deliver effective public health messages and effectively counter COVID-19 disinformation. This paper presents: 1) a corpus containing what is currently the largest available set of manually annotated COVID-19 disinformation categories; 2) a classification-aware neural topic model (CANTM) designed for COVID-19 disinformation category classification and topic discovery; 3) an extensive analysis of COVID-19 disinformation categories with respect to time, volume, false type, media type and origin source.

Indonesia Infrastructure Development Topic Discovery on Online News with Latent Dirichlet Allocation

IOP Conference Series Materials Science and Engineering ◽

10.1088/1757-899x/1077/1/012012 ◽

2021 ◽

Vol 1077 (1) ◽

pp. 012012

Author(s):

Ahmad Fathan Hidayatullah ◽

Muhammad Rifqi Ma’arif ◽

Muhammad Habibie ◽

Siti Khomsah

Keyword(s):

Latent Dirichlet Allocation ◽

Online News ◽

Infrastructure Development ◽

Topic Discovery ◽

Dirichlet Allocation

Improving literature exploration in the clinical decision-making process: Interactive classification and topic discovery on diabetes-related biomedical literature (Preprint)

Journal of Medical Internet Research ◽

10.2196/27434 ◽

2021 ◽

Author(s):

Adrian Ahne ◽

Guy Fagherazzi ◽

Xavier Tannier ◽

Thomas Czernichow ◽

Francisco Orchard

Keyword(s):

Decision Making ◽

Clinical Decision Making ◽

Clinical Decision ◽

Biomedical Literature ◽

Decision Making Process ◽

Topic Discovery

topic discovery
Recently Published Documents

Hot Topic Discovery across Social Networks Based on Improved LDA Model

Unsupervised Topic Discovery in User Comments

Semantics Graph Mining for Topic Discovery and Word Associations

Detect Text Topics by Semantics Graphs

Public opinion hot topic discovery based on improved bacterial foraging algorithm

Topic Discovery on Farsi, English, French, and Arabic Tweets Related to COVID-19 Using Text Mining Techniques

SenU-PTM: a novel phrase-based topic model for short-text topic discovery by exploiting word embeddings

Classification aware neural topic model for COVID-19 disinformation categorisation

Indonesia Infrastructure Development Topic Discovery on Online News with Latent Dirichlet Allocation

Improving literature exploration in the clinical decision-making process: Interactive classification and topic discovery on diabetes-related biomedical literature (Preprint)
