An Improved Multi-classification Algorithm for Imbalanced Online Public Opinion Data

Author(s):  
Xige Dang ◽  
Xu Wu ◽  
Xiaqing Xie ◽  
Tianle Zhang
2014 ◽  
Vol 519-520 ◽  
pp. 58-61
Author(s):  
Jian Xu ◽  
Bin Ma

Exploiting the distributed storage and parallel processing capabilities of a Hadoop cluster, this paper studies a new network public opinion classification method based on the Naive Bayes algorithm in a Hadoop environment. The collected public opinion documents are stored locally according to the HDFS architecture, and their feature words are extracted in parallel in the MapReduce process. The Naive Bayes classification algorithm is thus encapsulated for parallel execution on a cloud computing platform. The performance of the MapReduce-packaged Naive Bayes classification algorithm is verified, and the results show that its execution speed is significantly improved compared with a single server. Its public opinion classification accuracy exceeds 85%, which effectively improves both the performance and the efficiency of network public opinion classification.
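
The abstract gives no implementation details, so the following is only a minimal single-process sketch of how Naive Bayes training can be phrased as a map step (emit word counts per class) and a reduce step (sum them). The function names, the toy `DOCS` data, whitespace tokenization, and Laplace smoothing are all assumptions for illustration, not the authors' method, and no Hadoop API is used.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus of (label, text) pairs; not from the paper.
DOCS = [
    ("sports", "team wins match"),
    ("sports", "player scores goal"),
    ("finance", "stock market falls"),
    ("finance", "bank raises rates"),
]

def map_doc(label, text):
    """Map step: emit one document count and one count per word occurrence."""
    yield (("doc", label), 1)
    for word in text.lower().split():
        yield (("word", label, word), 1)

def reduce_counts(pairs):
    """Reduce step: sum the values that share a key."""
    totals = Counter()
    for key, value in pairs:
        totals[key] += value
    return totals

def train(docs):
    """Run the map step over every document, then reduce the emitted pairs."""
    emitted = (pair for label, text in docs for pair in map_doc(label, text))
    doc_counts, word_counts = Counter(), defaultdict(Counter)
    for key, n in reduce_counts(emitted).items():
        if key[0] == "doc":
            doc_counts[key[1]] = n
        else:
            word_counts[key[1]][key[2]] = n
    return doc_counts, word_counts

def classify(text, doc_counts, word_counts):
    """Pick the class with the highest log posterior (Laplace smoothing assumed)."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in doc_counts.items():
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(n_docs / total_docs)
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

In a real cluster the map calls would run on separate document splits and the reduce step would aggregate their output; here both run in one process only to show the data flow.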


2014 ◽  
Vol 635-637 ◽  
pp. 1624-1627
Author(s):  
Jian Xu ◽  
Bin Ma

This paper studies a new network public opinion classification method based on the K-nearest neighbor (K-NN) classification algorithm in a Hadoop environment. In light of the distributed storage and parallel processing characteristics of the Hadoop platform, a parallel K-NN classification algorithm is designed within the MapReduce framework. The classification ability and execution efficiency of the proposed scheme are verified, and the results show that the parallel K-NN algorithm improves both the precision and the execution efficiency of network public opinion classification.
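
One common way to parallelize K-NN in a MapReduce style is sketched below, under the assumption that the training data is split into partitions: each map call returns the k nearest candidates from its own partition, and the reduce step merges them and takes a majority vote. The abstract does not say that the paper's design works this way; the function names, Euclidean distance, and the sample data in the usage note are illustrative assumptions.

```python
import heapq
import math
from collections import Counter

def map_partition(partition, query, k):
    """Map step: the k nearest (distance, label) pairs within one partition."""
    scored = ((math.dist(point, query), label) for point, label in partition)
    return heapq.nsmallest(k, scored)

def reduce_neighbors(local_results, k):
    """Reduce step: merge per-partition candidates, vote among the global k nearest."""
    merged = heapq.nsmallest(k, (pair for result in local_results for pair in result))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]

def knn_classify(partitions, query, k=3):
    """Classify one query point against training data split into partitions."""
    return reduce_neighbors([map_partition(p, query, k) for p in partitions], k)
```

For example, with `partitions = [[((0.0, 0.0), "a"), ((0.1, 0.2), "a")], [((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((0.2, 0.1), "a")]]`, calling `knn_classify(partitions, (5.1, 5.0))` votes among two "b" neighbors and one "a" neighbor. The merge is exact because the global k nearest neighbors are always contained in the union of the per-partition k nearest.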


2011 ◽  
Vol 268-270 ◽  
pp. 1115-1120
Author(s):  
De Qian Xue

A semi-supervised Support Vector Data Description multi-classification algorithm (S3VDD-MC) is presented to address the scarcity of labeled data, the implementation difficulties, and the poor results of semi-supervised multi-classification. It makes full use of the distribution information of non-target samples. The S3VDD-MC algorithm defines a degree of membership for non-target samples in order to obtain their accepted or refused labels. On this basis, several hyperspheres are constructed, and a k-class classification problem is transformed into k SVDD problems. Finally, simulation results verify the effectiveness of the algorithm.
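
The decomposition of a k-class problem into k per-class hyperspheres can be illustrated with a deliberately simplified sketch. It does not solve the SVDD optimization and omits the semi-supervised membership step entirely: as a stand-in, each sphere uses the class centroid as its center and the largest member distance as its radius. All names and the decision rule (smallest distance relative to radius) are assumptions for illustration.

```python
import math

def fit_sphere(points):
    """Stand-in for SVDD training: centroid as center, largest member distance as radius."""
    dim = len(points[0])
    center = tuple(sum(p[i] for p in points) / len(points) for i in range(dim))
    radius = max(math.dist(p, center) for p in points)
    return center, radius

def fit_spheres(labeled):
    """Build one sphere per class from a dict mapping label -> list of points."""
    return {label: fit_sphere(points) for label, points in labeled.items()}

def classify(spheres, x):
    """Assign x to the class with the smallest distance-to-center over radius."""
    def relative_distance(item):
        center, radius = item[1]
        return math.dist(x, center) / radius if radius > 0 else math.inf
    return min(spheres.items(), key=relative_distance)[0]
```

A point that falls inside several spheres, or inside none, is still assigned to the sphere it is relatively closest to; a real SVDD-based classifier may handle those cases differently.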


2011 ◽  
Vol 55-57 ◽  
pp. 1803-1806
Author(s):  
Bao Ling Liu

The paper presents an improved "one-to-many" classification algorithm, based on an analysis of the shortcomings of the two traditional multi-classification algorithms, and establishes an SVM-based multi-fault classifier to classify typical turbine faults. The results show that the classifier achieves satisfactory performance.

