Study of a New Parallel K_NN Network Public Opinion Classification Algorithm Based on Hadoop Environment

2014 ◽  
Vol 635-637 ◽  
pp. 1624-1627
Author(s):  
Jian Xu ◽  
Bin Ma

This paper studies a new network public opinion classification method based on the K-nearest neighbor (K_NN) classification algorithm in a Hadoop environment. In light of the distributed storage and parallel processing characteristics of the Hadoop platform, a parallel K_NN classification algorithm is designed within the MapReduce framework. The classification ability and execution efficiency of the proposed scheme are verified, and the results show that the parallel K_NN algorithm improves both the precision and the execution efficiency of network public opinion classification.
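The abstract does not give the algorithm's details, so the following is only a minimal sketch of how a K_NN classifier is commonly split into map and reduce steps. It is not the authors' implementation and uses no Hadoop API: plain Python functions stand in for the mapper and reducer, and the names `map_phase`, `reduce_phase`, `classify`, and the toy data are all hypothetical.

```python
import heapq
import math
from collections import Counter


def map_phase(split, query, k):
    """Mapper stand-in: return the k nearest (distance, label) pairs in one data split."""
    scored = [(math.dist(vector, query), label) for vector, label in split]
    return heapq.nsmallest(k, scored)


def reduce_phase(partials, k):
    """Reducer stand-in: merge per-split candidates, keep the global k nearest, then vote."""
    merged = heapq.nsmallest(k, (pair for part in partials for pair in part))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


def classify(splits, query, k=3):
    """Run the map step over every split (sequentially here) and reduce the results."""
    partials = [map_phase(split, query, k) for split in splits]
    return reduce_phase(partials, k)


# Hypothetical toy data: two splits of (feature vector, label) pairs.
splits = [
    [((0.0, 0.0), "neutral"), ((0.1, 0.2), "neutral")],
    [((1.0, 1.0), "negative"), ((0.9, 1.1), "negative"), ((1.2, 0.8), "negative")],
]

print(classify(splits, (1.0, 0.9), k=3))
```

Each split only has to return its own k best candidates, so the reducer merges at most k pairs per split rather than every distance. On a real cluster the map calls would run in parallel on separate nodes.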

2014 ◽  
Vol 644-650 ◽  
pp. 2018-2021
Author(s):  
Bin Ma

Given the distributed, massive, and heterogeneous nature of network public opinion, a new network public opinion classification method based on the K-nearest neighbor (K_NN) classification algorithm on the Hadoop platform is studied. The proposed scheme is applied to a network public opinion document classification test, and its classification ability and execution efficiency are verified. The results show that the parallel K_NN algorithm can achieve rapid and accurate classification of network public opinion.


2014 ◽  
Vol 519-520 ◽  
pp. 58-61
Author(s):  
Jian Xu ◽  
Bin Ma

In light of the excellent distributed storage and parallel processing features of a Hadoop cluster, a new network public opinion classification method based on the Naive Bayes algorithm in a Hadoop environment is studied. The collected public opinion documents are stored locally according to the HDFS architecture, and their feature words are extracted in parallel in the MapReduce process. The Naive Bayes classification algorithm is thus encapsulated for parallel execution on a cloud computing platform. The performance of the MapReduce-packaged Naive Bayes classification algorithm is verified, and the results show that its execution speed is significantly improved compared to a single server. Its public opinion classification accuracy exceeds 85%, which effectively improves both the performance and the efficiency of network public opinion classification.
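As an illustration of why Naive Bayes fits MapReduce, the sketch below counts words per class in a map step, sums the counts in a reduce step, and classifies with Laplace smoothing. It is an assumption-laden sketch rather than the authors' code: it runs in plain Python with no Hadoop or HDFS calls, splits on whitespace instead of using the paper's feature-word extraction, and the names `map_counts`, `reduce_counts`, `classify_text`, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def map_counts(docs):
    """Mapper stand-in: count (label, word) pairs and documents per label in one split."""
    word_counts, doc_counts = Counter(), Counter()
    for label, text in docs:
        doc_counts[label] += 1
        for word in text.lower().split():
            word_counts[(label, word)] += 1
    return word_counts, doc_counts


def reduce_counts(partials):
    """Reducer stand-in: sum the per-split counters into global counts."""
    word_counts, doc_counts = Counter(), Counter()
    for split_words, split_docs in partials:
        word_counts.update(split_words)
        doc_counts.update(split_docs)
    return word_counts, doc_counts


def classify_text(text, word_counts, doc_counts):
    """Pick the label with the highest log posterior, using Laplace (add-one) smoothing."""
    vocabulary = {word for _, word in word_counts}
    total_docs = sum(doc_counts.values())
    words_per_label = Counter()
    for (label, _), count in word_counts.items():
        words_per_label[label] += count
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)
        denominator = words_per_label[label] + len(vocabulary)
        for word in text.lower().split():
            score += math.log((word_counts[(label, word)] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy corpus, already divided into two splits.
doc_splits = [
    [("complaint", "food poisoning at the restaurant"),
     ("praise", "great service and tasty food")],
    [("complaint", "expired milk caused poisoning"),
     ("praise", "tasty fresh milk great price")],
]

word_counts, doc_counts = reduce_counts(map_counts(s) for s in doc_splits)
print(classify_text("poisoning from expired food", word_counts, doc_counts))
```

Because training reduces to summing counts, each split can be processed independently and the reducer only adds integers. That additive structure is what makes the algorithm straightforward to parallelize.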


Author(s):  
Yong Li ◽  
Xiaojun Yang ◽  
Min Zuo ◽  
Qingyu Jin ◽  
Haisheng Li ◽  
...  

The real-time and dissemination characteristics of network information make net-mediated public opinion an increasingly important resource for food safety early warning, but data growth at petabyte (PB) scale also brings great difficulty to the study and assessment of network public opinion, especially in extracting event roles from these data and analyzing the sentiment tendency of public opinion comments. First, this article takes food safety network public opinion as its research subject and proposes a BLSTM-CRF model for automatically labeling event roles by combining BLSTM with a conditional random field. Second, an attention mechanism based on food safety domain vocabulary is introduced, distance-related sequence semantic features are extracted by BLSTM, and sentiment classification of those features is performed by a CNN, yielding an Att-BLSTM-CNN model for analyzing public opinion sentiment tendency in the food safety field. Finally, based on time series, this article combines event role extraction and sentiment tendency analysis to construct a net-mediated public opinion early warning model for the food safety field, according to the heat of the event and the public's emotional intensity toward food safety public opinion events.

