Use of a Latent Topic Model for Characteristic Extraction from Health Checkup Questionnaire Data

Summary Objectives: When patients complete questionnaires during health checkups, many of their responses are subjective, making topic extraction difficult. Therefore, the purpose of this study was to develop a model capable of extracting appropriate topics from subjective data in questionnaires conducted during health checkups. Methods: We employed a latent topic model to group the lifestyle habits of the study participants and represented their responses to items on health checkup questionnaires as a probability model. For the probability model, we used latent Dirichlet allocation to extract 30 topics from the questionnaires. According to the model parameters, a total of 4381 study participants were then divided into groups based on these topics. Results from laboratory tests, including blood glucose level, triglycerides, and estimated glomerular filtration rate, were compared between each group, and these results were then compared with those obtained by hierarchical clustering. Results: If a significant (p < 0.05) difference was observed in any of the laboratory measurements between groups, it was considered to indicate a questionnaire response pattern corresponding to the value of the test result. A comparison between the latent topic model and hierarchical clustering grouping revealed that, in the latent topic model method, a small group of participants who reported having subjective signs of urinary disorder were allocated to a single group. Conclusions: The latent topic model is useful for extracting characteristics from a small number of groups from questionnaires with a large number of items. These results show that, in addition to chief complaints and history of past illness, questionnaire data obtained during medical checkups can serve as useful judgment criteria for assessing the conditions of patients.
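
The grouping step described in this abstract can be illustrated with a minimal sketch. The abstract does not say exactly how the model parameters were turned into groups, so the rule used here (assign each participant to the topic with the highest probability) is an assumption, and the function name `group_by_dominant_topic` and the sample proportions are hypothetical.

```python
from collections import defaultdict


def group_by_dominant_topic(doc_topic):
    """Map each topic index to the participants for whom it is the most probable topic."""
    groups = defaultdict(list)
    for participant_id, probs in doc_topic.items():
        # Index of the largest topic proportion (assumed hard-assignment rule).
        dominant = max(range(len(probs)), key=probs.__getitem__)
        groups[dominant].append(participant_id)
    return dict(groups)


# Hypothetical topic proportions for three participants and three topics.
theta = {
    "p1": [0.7, 0.2, 0.1],
    "p2": [0.1, 0.1, 0.8],
    "p3": [0.6, 0.3, 0.1],
}
groups = group_by_dominant_topic(theta)
```

With groups formed this way, laboratory values could then be compared between groups, as the abstract describes.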

Latent Topic Estimation Based on Events in a Document

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2012.p0603 ◽

2012 ◽

Vol 16 (5) ◽

pp. 603-610

Author(s):

Risa Kitajima ◽

Ichiro Kobayashi

Keyword(s):

Text Analysis ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Latent Semantic Indexing ◽

Document Retrieval ◽

Retrieval Task ◽

Latent Topic ◽

Latent Topics ◽

Definition Of ◽

The Relationship

Several latent topic model-based methods such as Latent Semantic Indexing (LSI), Probabilistic LSI (pLSI), and Latent Dirichlet Allocation (LDA) have been widely used for text analysis. These methods basically assign topics to words, however, and the relationship between words in a document is therefore not considered. Considering this, we propose a latent topic extraction method that assigns topics to events that represent the relation between words in a document. There are several ways to express events, and the accuracy of estimating latent topics differs depending on the definition of an event. We therefore propose five event types and examine which event type works well in estimating latent topics in a document with a common document retrieval task. As an application of our proposed method, we also show multidocument summarization based on latent topics. Through these experiments, we have confirmed that our proposed method results in higher accuracy than the conventional method.

Incorporating Biterm Correlation Knowledge into Topic Modeling for Short Texts

The Computer Journal ◽

10.1093/comjnl/bxaa079 ◽

2020 ◽

Author(s):

Kai Zhang ◽

Yuan Zhou ◽

Zheng Chen ◽

Yufei Liu ◽

Zhuo Tang ◽

...

Keyword(s):

Topic Modeling ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Semantic Knowledge ◽

Superior Performance ◽

Knowledge Based ◽

Modeling Process ◽

Proposed Model ◽

Benchmark Datasets ◽

Latent Topic

Abstract The prevalence of short texts on the Web has made mining the latent topic structures of short texts a critical and fundamental task for many applications. However, due to the lack of word co-occurrence information induced by the content sparsity of short texts, it is challenging for traditional topic models like latent Dirichlet allocation (LDA) to extract coherent topic structures on short texts. Incorporating external semantic knowledge into the topic modeling process is an effective strategy to improve the coherence of inferred topics. In this paper, we develop a novel topic model—called biterm correlation knowledge-based topic model (BCK-TM)—to infer latent topics from short texts. Specifically, the proposed model mines biterm correlation knowledge automatically based on recent progress in word embedding, which can represent semantic information of words in a continuous vector space. To incorporate external knowledge, a knowledge incorporation mechanism is designed over the latent topic layer to regularize the topic assignment of each biterm during the topic sampling process. Experimental results on three public benchmark datasets illustrate the superior performance of the proposed approach over several state-of-the-art baseline models.

Discriminative Action Recognition Using Supervised Latent Topic Model

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.190-191.1125 ◽

2012 ◽

Vol 190-191 ◽

pp. 1125-1128

Author(s):

Huan Xin Zou ◽

Hao Sun ◽

Ke Feng Ji

Keyword(s):

Action Recognition ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Human Action Recognition ◽

Human Action ◽

Video Sequences ◽

Discriminative Learning ◽

Action Categorization ◽

Latent Topic ◽

Topic Structure

We present a discriminative learning method for human action recognition from video sequences. Our model combines a bag-of-words component with supervised latent topic models. The supervised latent Dirichlet allocation (sLDA) topic model, which employs discriminative learning using labeled data under a generative framework, is introduced to discover the latent topic structure which is most relevant to action categorization. We test our algorithm on two challenging datasets. Experimental results demonstrate the effectiveness of our algorithm.

Discovering research topics from library electronic references using latent Dirichlet allocation

Library Hi Tech ◽

10.1108/lht-06-2017-0132 ◽

2018 ◽

Vol 36 (3) ◽

pp. 400-410 ◽

Cited By ~ 7

Author(s):

Debin Fang ◽

Haixia Yang ◽

Baojun Gao ◽

Xiaojun Li

Keyword(s):

Latent Dirichlet Allocation ◽

Lower Cost ◽

Topic Model ◽

Research Topics ◽

Content Type ◽

Significant Research ◽

Latent Topic ◽

Automatic Text Analysis ◽

Automatic Text ◽

Dirichlet Allocation

Purpose Discovering the research topics and trends from a large quantity of library electronic references is essential for scientific research. Current research of this kind mainly depends on human justification. The purpose of this paper is to demonstrate how to identify research topics and evolution in trends from library electronic references efficiently and effectively by employing automatic text analysis algorithms. Design/methodology/approach The authors used the latent Dirichlet allocation (LDA), a probabilistic generative topic model to extract the latent topic from the large quantity of research abstracts. Then, the authors conducted a regression analysis on the document-topic distributions generated by LDA to identify hot and cold topics. Findings First, this paper discovers 32 significant research topics from the abstracts of 3,737 articles published in the six top accounting journals during the period of 1992-2014. Second, based on the document-topic distributions generated by LDA, the authors identified seven hot topics and six cold topics from the 32 topics. Originality/value The topics discovered by LDA are highly consistent with the topics identified by human experts, indicating the validity and effectiveness of the methodology. Therefore, this paper provides novel knowledge to the accounting literature and demonstrates a methodology and process for topic discovery with lower cost and higher efficiency than the current methods.
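
The hot and cold topic step can be sketched as a trend regression. This is a simplified illustration, not the authors' procedure: it assumes each topic is summarised by its mean share per year, fits an ordinary least-squares line, and labels the topic by the sign of the slope. The tolerance `tol`, the function names, and the sample data are hypothetical, and the significance test a real analysis would need is omitted.

```python
def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def label_trend(years, shares, tol=1e-3):
    """Label a topic 'hot', 'cold', or 'stable' from the slope of its yearly share."""
    slope = ols_slope(years, shares)
    if slope > tol:
        return "hot"
    if slope < -tol:
        return "cold"
    return "stable"


# Hypothetical yearly mean topic shares.
years = [2010, 2011, 2012, 2013]
rising = [0.02, 0.04, 0.06, 0.08]
falling = [0.10, 0.08, 0.05, 0.03]
flat = [0.05, 0.05, 0.05, 0.05]
```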

Understanding Twitter Hashtags from Latent Themes Using Biterm Topic Model

Recent Patents on Engineering ◽

10.2174/1872212113666190328183517 ◽

2019 ◽

Vol 13 ◽

Author(s):

Muzafar Rasool Bhat ◽

Burhan Bashir ◽

Majid A. Kundroo ◽

Naffi A. Ahanger

Keyword(s):

Social Media ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Social Issues ◽

Text Messages ◽

Short Message ◽

Short Text ◽

Contemporary Issue ◽

Significant Length ◽

Latent Topic

Social media in general, and Twitter in particular, provide a space for discourses and contemporary narratives as well as discussion of a few specific social issues. People respond to these events by writing short text messages. Background: The hashtag “#”, a specific way to respond to a given discourse, narrative or contemporary issue, is common on social media. Netizens write short messages expressing their opinion about an issue represented by a given hashtag. These short messages generally tend to carry a latent topic (theme) reflecting the writer’s opinion. Objective: This research aims to extract, represent and understand those hidden themes. Method: The Biterm Topic Model (BTM) was used in this study because of its ability to deal with short messages, unlike Latent Dirichlet Allocation, which expects a document to have a significant length. Results: The Twitter hashtag #MeToo was used in this research, with forty thousand (40,000) comments. The data were modelled with ten (10) topics after verifying a suitable number of topics using four metrics: Griffiths, CaoJuan, Arun and Deveaud. Conclusion: The experimental results show that the proposed approach to understanding Twitter hashtags from latent themes using the biterm topic modelling method is very effective compared with other methods.
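
To show why a biterm model suits short messages, the sketch below builds the biterms of a single short text. In BTM a biterm is an unordered pair of words that co-occur in the same short context; here the whole message is assumed to be one context. The function name `extract_biterms` and the sample tokens are hypothetical, and preprocessing such as stop-word removal is left out.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered word pair (biterm) from one short text."""
    # Sorting each pair makes ("a", "b") and ("b", "a") the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical tokenised short message.
biterms = extract_biterms(["metoo", "support", "survivors"])
```

Because topics are inferred over the pooled biterms of the whole corpus rather than per document, the model does not depend on each message being long.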

Intelligent radar software defect classification approach based on the latent Dirichlet allocation topic model

EURASIP Journal on Advances in Signal Processing ◽

10.1186/s13634-021-00761-3 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Xi Liu ◽

Yongfeng Yin ◽

Haifeng Li ◽

Jiabin Chen ◽

Chang Liu ◽

...

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Recall Rate ◽

Defect Classification ◽

Software Defects ◽

Classification Approach ◽

Software Defect ◽

Model Combining ◽

Dirichlet Allocation

Abstract Existing software intelligent defect classification approaches do not consider radar characters and prior statistics information. Thus, when these approaches are applied to radar software testing and validation, the precision rate and recall rate of defect classification are poor, which reduces the reuse effectiveness of software defects. To solve this problem, a new intelligent defect classification approach based on the latent Dirichlet allocation (LDA) topic model is proposed for radar software in this paper. The proposed approach includes the defect text segmentation algorithm based on the dictionary of radar domain, the modified LDA model combining radar software requirement, and the top acquisition and classification approach of radar software defect based on the modified LDA model. The proposed approach is applied to typical radar software defects to validate its effectiveness and applicability. The application results illustrate that the prediction precision rate and recall rate of the proposed approach are improved by up to 15 ~ 20% compared with the other defect classification approaches. Thus, the proposed approach can be applied effectively in the segmentation and classification of radar software defects to improve the adequacy of defect identification in radar software.

Research progress and trend of leader member exchange based on social complex network and latent dirichlet allocation topic model

2020 2nd International Conference on Economic Management and Model Engineering (ICEMME) ◽

10.1109/icemme51517.2020.00090 ◽

2020 ◽

Author(s):

Zhang Chunyang ◽

Ding Kun ◽

Zhang Chunbo ◽

Zhang Li

Keyword(s):

Complex Network ◽

Latent Dirichlet Allocation ◽

Topic Model ◽

Research Progress ◽

Leader Member Exchange ◽

Member Exchange ◽

Dirichlet Allocation

A review of program evaluations in an Australian independent school: Participants’ perspectives

Australian Journal of Education ◽

10.1177/0004944114542983 ◽

2014 ◽

Vol 58 (3) ◽

pp. 262-277

Author(s):

Jeanne Maree Allen ◽

Julie Rimes

Keyword(s):

Program Evaluation ◽

Academic Integrity ◽

Best Practice ◽

Independent School ◽

Evaluation Methods ◽

Questionnaire Data ◽

School Program ◽

Program Evaluations ◽

Documentary Analysis ◽

Study Participants

This article reports on ways in which one Australian independent school seeks to develop and sustain best practice and academic integrity in its programs through a system of ongoing program evaluation, involving a systematic, cyclical appraisal of the school’s suite of six faculties. A number of different evaluation methods have been and continue to be used, each developed to best suit the particular program under evaluation. In order to gain an understanding of the effectiveness of this process, we conducted a study into participants’ perceptions of the strengths and weaknesses of the four program evaluations undertaken between 2009 and 2011. Drawing on documentary analysis of the evaluation reports and analysis of questionnaire data from the study participants, a number of findings were generated. These findings are provided and discussed, together with suggestions about ways in which the conceptualisation and conduct of school program evaluations might be enhanced.

Technology Hotspot Tracking: Topic Discovery and Evolution of China’s Blockchain Patents Based on a Dynamic LDA Model

Symmetry ◽

10.3390/sym13030415 ◽

2021 ◽

Vol 13 (3) ◽

pp. 415

Author(s):

Jinli Wang ◽

Yong Fan ◽

Hui Zhang ◽

Libo Feng

Keyword(s):

Latent Dirichlet Allocation ◽

Topic Model ◽

Graph Model ◽

Representation Learning ◽

Research Direction ◽

Calculation Model ◽

Topic Evolution ◽

Blockchain Technology ◽

The Status ◽

Research Hotspots

Tracking scientific and technological (S&T) research hotspots can help scholars to grasp the status of current research and develop regular patterns in the field over time. It contributes to the generation of new ideas and plays an important role in promoting the writing of scientific research projects and scientific papers. Patents are important S&T resources, which can reflect the development status of the field. In this paper, we use topic modeling, topic intensity, and evolutionary computing models to discover research hotspots and development trends in the field of blockchain patents. First, we propose a time-based dynamic latent Dirichlet allocation (TDLDA) modeling method based on a probabilistic graph model and knowledge representation learning for patent text mining. Second, we present a computational model, topic intensity (TI), that expresses the topic strength and evolution. Finally, the point-wise mutual information (PMI) value is used to evaluate topic quality. We obtain 20 hot topics through TDLDA experiments and rank them according to the strength calculation model. The topic evolution model is used to analyze the topic evolution trend from the perspectives of rising, falling, and stable. From the experiments we found that 8 topics showed an upward trend, 6 topics showed a downward trend, and 6 topics became stable or fluctuated. Compared with the baseline method, TDLDA can have the best effect when K is 40 or less. TDLDA is an effective topic model that can extract hot topics and evolution trends of blockchain patent texts, which helps researchers to more accurately grasp the research direction and improves the quality of project application and paper writing in the blockchain technology domain.
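
The PMI score mentioned above can be illustrated with a small sketch. It uses the standard definition PMI(a, b) = log(p(a, b) / (p(a) p(b))), with probabilities estimated from document-level co-occurrence. How the paper aggregates PMI over a topic's top words is not stated in the abstract, so this shows only the pairwise score; the function name `pmi` and the toy documents are hypothetical.

```python
import math


def pmi(word_a, word_b, documents):
    """Point-wise mutual information of two words from document co-occurrence counts."""
    n = len(documents)
    count_a = sum(1 for doc in documents if word_a in doc)
    count_b = sum(1 for doc in documents if word_b in doc)
    count_ab = sum(1 for doc in documents if word_a in doc and word_b in doc)
    if count_ab == 0:
        # The words never co-occur, so the log of zero is treated as -infinity.
        return float("-inf")
    return math.log((count_ab / n) / ((count_a / n) * (count_b / n)))


# Hypothetical patent texts, each reduced to a set of words.
docs = [
    {"block", "chain"},
    {"block", "chain", "ledger"},
    {"ledger", "patent"},
    {"patent"},
]
score = pmi("block", "chain", docs)
```

A higher average pairwise PMI among a topic's top words is commonly read as a sign of a more coherent topic.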

Latent Topic Model for Indexing Arabic Documents

International Journal of Information Retrieval Research ◽

10.4018/ijirr.2014010102 ◽

2014 ◽

Vol 4 (1) ◽

pp. 29-45 ◽

Cited By ~ 3

Author(s):

Rami Ayadi ◽

Mohsen Maraoui ◽

Mounir Zrigui

Keyword(s):

Topic Model ◽

Inflectional Morphology ◽

Arabic Text ◽

Text Representation ◽

Text Documents ◽

Latent Topic ◽

Latent Topics ◽

F Measure

In this paper, the authors present a latent topic model to index and represent Arabic text documents in a way that reflects more semantics. Text representation in a language with high inflectional morphology such as Arabic is not a trivial task and requires some special treatment. The authors describe their approach for analyzing and preprocessing Arabic text and then describe the stemming process. Finally, the latent model (LDA) is adapted to extract Arabic latent topics: the authors extract significant topics from all texts, each theme is described by a particular distribution of descriptors, and each text is then represented on the vectors of these topics. The classification experiment is conducted on an in-house corpus; latent topics are learned with LDA for different topic numbers K (25, 50, 75, and 100), and the authors compare this result with classification in the full word space. The results show that classification in the reduced topic space outperforms, in terms of precision, recall and F-measure, both classification in the full word space and classification using LSI reduction.
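
The evaluation measures named in this abstract can be sketched for a single class as follows. This is a generic illustration of precision, recall and F-measure (the balanced F1 variant is assumed), not the authors' evaluation code; the function name and the sample labels are hypothetical.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Precision, recall and F1 for one class, computed from paired label lists."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical gold and predicted categories for four documents.
gold = ["sport", "sport", "politics", "sport"]
pred = ["sport", "politics", "politics", "sport"]
precision, recall, f1 = precision_recall_f1(gold, pred, "sport")
```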
