An Influence Prediction Model for Microblog Entries on Public Health Emergencies

2019 ◽  
Vol 3 (2) ◽  
pp. 102-115 ◽  
Author(s):  
Lu An ◽  
Xingyue Yi ◽  
Yuxin Han ◽  
Gang Li

Abstract This study constructs a microblog influence prediction model and examines how the user, time, and content features of microblog entries about public health emergencies affect their influence. Microblog entries about the Ebola outbreak serve as the data set. Topics are extracted from the entries with the BM25 latent Dirichlet allocation model (LDA-BM25), and the influence prediction model is built with the random forest method. Results show that the proposed model can predict the influence of microblog entries about public health emergencies with a precision reaching 88.8%. The individual features that play a role in the influence of microblog entries, as well as their influence tendencies, are also analyzed. The proposed model combines user, time, and content features, which addresses the tendency of other microblog influence prediction models to ignore content features. The roles of the three feature groups in the influence of microblog entries are also discussed.
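
The abstract does not say how BM25 is combined with LDA, so the following is only a minimal sketch of the BM25 term-weighting step on its own, using a common variant of the formula. The function name `bm25_weights`, the parameter defaults `k1=1.5` and `b=0.75`, and the toy documents are assumptions for illustration, not taken from the paper.

```python
import math
from collections import Counter

def bm25_weights(docs, k1=1.5, b=0.75):
    """Return one {term: BM25 weight} dict per tokenized document.

    Illustrative only: k1 and b are conventional defaults, not values
    reported in the abstract above.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    weights = []
    for d in docs:
        tf = Counter(d)
        # Longer-than-average documents are penalized through this factor.
        length_norm = 1 - b + b * len(d) / avg_len
        w = {}
        for term, f in tf.items():
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            w[term] = idf * f * (k1 + 1) / (f + k1 * length_norm)
        weights.append(w)
    return weights

# Hypothetical toy corpus of tokenized microblog entries.
docs = [
    ["ebola", "outbreak", "who", "report"],
    ["ebola", "vaccine", "trial"],
    ["ebola", "outbreak", "travel", "ban"],
]
weights = bm25_weights(docs)
```

In this toy corpus the term "ebola" appears in every entry and therefore receives a lower weight than a rarer term such as "vaccine", which is the effect a weighting step of this kind is meant to have before topic extraction.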

2018 ◽  
Vol 42 (6) ◽  
pp. 821-846
Author(s):  
Lu An ◽  
Chuanming Yu ◽  
Xia Lin ◽  
Tingyao Du ◽  
Liqin Zhou ◽  
...  

Purpose: The purpose of this paper is to identify salient topic categories and outline their evolution patterns and temporal trends in microblogs on a public health emergency across different stages. The patterns and trends were also compared across microblog platforms of different languages and from different nations to reveal their similarities and differences. Design/methodology/approach: A total of 459,266 microblog entries about the Ebola outbreak in West Africa in 2014 were collected from Twitter and Weibo over the nine months following the inception of the outbreak. Topics were detected by the latent Dirichlet allocation model and classified into several categories. The daily tweets were analyzed with the self-organizing map technique and labeled with the most salient topics. The investigated time span was divided into three stages, and the most salient topic categories were identified for each stage. Findings: In total, 14 salient topic categories were identified in microblogs about the Ebola outbreak and were summarized as increasing, decreasing, fluctuating or ephemeral types. The topical evolution patterns of microblogs and the temporal trends of topic categories vary across microblog platforms. Twitter users were keen on the dynamics of the Ebola outbreak, such as status description, secondary events and so forth, while Weibo users focused on background knowledge of Ebola and precautions. Originality/value: This study revealed evolution patterns and temporal trends of microblog topics on a public health emergency. The findings can help administrators of public health emergencies and microblog communities work together to better satisfy the information needs and physical demands of the public while public health emergencies are in progress.


2014 ◽  
Vol 2014 ◽  
pp. 1-10 ◽  
Author(s):  
WenBo Xie ◽  
Qiang Dong ◽  
Hui Gao

The past decade has witnessed the increasing popularity of recommendation systems, which help users acquire relevant knowledge, commodities, and services from an overwhelming ocean of information on the Internet. Latent Dirichlet Allocation (LDA), originally presented as a graphical model for text topic discovery, has since found applications in many other disciplines. In this paper, we propose an LDA-inspired probabilistic recommendation method that treats user-item collecting behavior as a two-step process: every user first becomes a member of one latent user-group with a certain probability, and each user-group then collects various items with different probabilities. Gibbs sampling is employed to approximate all the probabilities in the two-step process. Experimental results on three real-world data sets (MovieLens, Netflix, and Last.fm) show that our method performs competitively on precision, coverage, and diversity in comparison with four other typical recommendation methods. Moreover, we present an approximation strategy that reduces the computational complexity of our method with only a slight degradation in performance.
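
The two-step process described above can be read as a mixture: the score of an item for a user is the sum over latent groups g of P(g | user) · P(item | g). The sketch below shows only that scoring step, under the assumption that both probability tables are already available. The abstract says they are estimated with Gibbs sampling, which is not reproduced here. The function name, group labels, item names, and probabilities are all hypothetical.

```python
def item_scores(p_group_given_user, p_item_given_group):
    """Score items as the sum over groups g of P(g | user) * P(item | g)."""
    scores = {}
    for g, p_g in p_group_given_user.items():
        for item, p_i in p_item_given_group[g].items():
            scores[item] = scores.get(item, 0.0) + p_g * p_i
    return scores

# Hypothetical probabilities for one user and two latent user-groups.
p_group_given_user = {"g1": 0.8, "g2": 0.2}
p_item_given_group = {
    "g1": {"film_a": 0.6, "film_b": 0.4},
    "g2": {"film_b": 0.5, "film_c": 0.5},
}

scores = item_scores(p_group_given_user, p_item_given_group)
# Recommend items in descending order of score.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Because each table is a proper probability distribution, the resulting item scores also sum to one, so they can be ranked or thresholded directly.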


2021 ◽  
Author(s):  
Jorge Arturo Lopez

Extraction of topics from large text corpuses helps improve Software Engineering (SE) processes. Latent Dirichlet Allocation (LDA) represents one of the algorithmic tools to understand, search, exploit, and summarize a large corpus of data (documents), and it is often used to perform such analysis. However, calibration of the models is computationally expensive, especially if iterating over a large number of topics. Our goal is to create a simple formula allowing analysts to estimate the number of topics, so that the top X topics include the desired proportion of documents under study. We derived the formula from the empirical analysis of three SE-related text corpuses. We believe that practitioners can use our formula to expedite LDA analysis. The formula is also of interest to theoreticians, as it suggests that different SE text corpuses have similar underlying properties.
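
The abstract does not state the formula itself, so it is not reproduced here. As a point of comparison, the sketch below shows the brute-force quantity such a formula would approximate: given the dominant topic of each document from an already fitted model, find the smallest X for which the X largest topics cover a target share of the documents. The function name `topics_needed` and the sample assignments are hypothetical.

```python
from collections import Counter

def topics_needed(dominant_topics, target):
    """Smallest X such that the X largest topics hold at least
    `target` (a fraction between 0 and 1) of all documents."""
    counts = Counter(dominant_topics)
    total = len(dominant_topics)
    covered = 0
    # Walk through topics from most to least populated.
    for x, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return x
    return len(counts)

# Hypothetical dominant-topic assignment for ten documents.
dominant = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]
x_for_60 = topics_needed(dominant, 0.6)
x_for_95 = topics_needed(dominant, 0.95)
```

Computing this directly requires fitting the model first, which is the expensive step the abstract aims to avoid with a closed-form estimate.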


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5262
Author(s):  
Meizhu Li ◽  
Shaoguang Huang ◽  
Jasper De Bock ◽  
Gert de Cooman ◽  
Aleksandra Pižurica

Supervised hyperspectral image (HSI) classification relies on accurate label information. However, it is not always possible to collect perfectly accurate labels for training samples. This motivates the development of classifiers that are sufficiently robust to a reasonable amount of error in data labels. Despite its growing importance, this aspect has not yet been sufficiently studied in the literature. In this paper, we analyze the effect of erroneous sample labels on the probability distributions of the principal components of HSIs, and in this way provide a statistical analysis of the resulting uncertainty in classifiers. Building on the theory of imprecise probabilities, we develop a novel robust dynamic classifier selection (R-DCS) model for data classification with erroneous labels. In particular, spectral and spatial features are extracted from HSIs to construct two individual classifiers for the dynamic selection. The proposed R-DCS model is based on the robustness of the classifiers’ predictions: the extent to which a classifier can be altered without changing its prediction. We provide three possible selection strategies for the proposed model with different computational complexities and apply them to three benchmark data sets. Experimental results demonstrate that the proposed model outperforms the individual classifiers it selects from and is more robust to label errors than widely adopted approaches.


2014 ◽  
Vol 2014 ◽  
pp. 1-12 ◽  
Author(s):  
Zhou Su ◽  
Hua Wei ◽  
Sha Wei

Over the past decade, crowd control and management has received wide attention in the area of intelligent video surveillance. Among the tasks of automatic video-based crowd management, crowd motion modeling is recognized as one of the most critical components, since it lays a crucial foundation for numerous subsequent analyses. However, it still faces many unsolved challenges, such as occlusions among pedestrians and complicated motion patterns in crowded scenarios. To address these issues, we propose a novel spatiotemporal Weber field, which integrates both appearance characteristics and the stimulus of crowd motion patterns, to recognize large-scale crowd events. On the one hand, crowd motion is treated as variation in a spatiotemporal signal, and we measure this variation based on Weber's law. The result is referred to as the spatiotemporal Weber variation feature. On the other hand, motivated by findings in crowd dynamics that crowd motion is closely related to interaction force, we propose a spatiotemporal Weber force feature to exploit the stimulus of crowd behaviors. Finally, we use the latent Dirichlet allocation model to establish the relationship between crowd events and crowd motion patterns. Experiments on the PETS2009 and UMN databases demonstrate that the proposed method outperforms previous methods for large-scale crowd behavior perception.
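
For reference, Weber's law in its classical form says that the just-noticeable change in a stimulus is proportional to the background intensity of that stimulus. The symbols below are the conventional ones and are an assumption here, since the abstract does not give the paper's own notation or its exact feature definition.

```latex
% Weber's law: \Delta I is the just-noticeable change, I the background
% intensity, and k a constant (the Weber fraction).
\[
  \frac{\Delta I}{I} = k
\]
```

Read this way, a Weber-style variation feature would measure change relative to the local signal level rather than in absolute terms.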


Author(s):  
Grace Burleson ◽  
Jesse Austin-Breneman

Abstract Over the past 50 years, researchers have repeatedly proposed the establishment of a new interdisciplinary engineering field in Engineering for Global Development (EGD), whose analytical tools and design processes result in positive social impacts and poverty alleviation in a global development context. Within each discipline and research area, a growing body of work has sought to systematically create scientific knowledge in this area. However, a recent network analysis of Human-Centered Design plus Development research indicates that sub-communities are not collaborating at a high level and therefore the overall research agenda may lack cohesion. This paper presents a descriptive analysis of EGD research within mechanical engineering along four dimensions through a systematic literature review and secondary data analysis. Results from the review and a Latent Dirichlet Allocation model indicate EGD work in mechanical engineering draws upon research methodologies from a number of other fields and has low levels of consensus on technical terminology. These results suggest consensus in the broader interdisciplinary EGD field should be examined.

