dataset partitioning
Recently Published Documents


Complexity, 2020, Vol 2020, pp. 1-12
Author(s): Jie Zhang, Pingping Sun, Feng Zhao, Qianru Guo, Yue Zou

The unchecked spread of pseudohealth information online has caused great harm to people’s health, lives, and property, so detecting and identifying it is important. This paper defines the concepts of pseudohealth information, data block, and data block integration; designs an architecture that combines the latent Dirichlet allocation (LDA) algorithm with data block update integration; and proposes a combination algorithm model. Crawler technology is used to collect pseudohealth information posted on the Sina Weibo platform during the epidemic period from February to March 2020, and this collection serves as the experimental case dataset for the simulation test. The results show that (1) the LDA model can mine the semantic information of network pseudohealth information in depth, obtain document-topic distribution features, and supply these topic features as input variables for classification training; (2) the dataset partitioning method can effectively divide the data into blocks according to the text attributes and class labels of network pseudohealth information, and the data block reintegration method can accurately classify and integrate the resulting blocks; and (3) because the combination model has certain limitations in detecting network pseudohealth information, the support vector machine (SVM) model can extract the granular content of data blocks in pseudohealth information in real time, which greatly improves the recognition performance of the combination model.
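The abstract does not give the partitioning or reintegration procedure, so the following is only a minimal sketch of the general idea: group labelled records by class label, cut each group into blocks, then merge the blocks back. The record shape, the function names `partition_by_label` and `reintegrate`, the `block_size` parameter, and the toy records are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record shape: (text, class_label). Not taken from the paper.
Record = Tuple[str, str]


def partition_by_label(
    records: Iterable[Record], block_size: int
) -> Dict[str, List[List[Record]]]:
    """Group records by class label, then cut each group into blocks
    of at most block_size records (the last block may be shorter)."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    by_label: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        by_label[record[1]].append(record)
    return {
        label: [group[i:i + block_size] for i in range(0, len(group), block_size)]
        for label, group in by_label.items()
    }


def reintegrate(blocks: Dict[str, List[List[Record]]]) -> List[Record]:
    """Flatten all blocks back into one list, label by label."""
    return [
        record
        for label_blocks in blocks.values()
        for block in label_blocks
        for record in block
    ]


# Toy placeholder records, invented purely for this illustration.
posts: List[Record] = [
    ("example post 1", "pseudo"),
    ("example post 2", "genuine"),
    ("example post 3", "pseudo"),
    ("example post 4", "genuine"),
    ("example post 5", "pseudo"),
]

blocks = partition_by_label(posts, block_size=2)
merged = reintegrate(blocks)
```

With these toy records, each label ends up with its own list of blocks, and `reintegrate` returns every record exactly once, ordered by label rather than by original position.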


Author(s): Angello Hoyos, Ubaldo Ruiz, Stephane Marchand-Maillet, Edgar Chávez