Study on the Method of Feature Selection Based on Hybrid Model for Text Classification

2012 ◽  
Vol 433-440 ◽  
pp. 2881-2886 ◽  
Author(s):  
Run Zhi Li ◽  
Yang Sen Zhang

In this paper, we study how to combine feature selection models in text classification and present a method that builds a hybrid model for feature selection. The hybrid model combines the advantages of four feature selection models (DF, MI, IG, CHI). We then use the Naive Bayes model as the classifier to verify the effect of the hybrid feature selection model. Experiments show that the hybrid model is correct and effective and achieves good performance in text classification.
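The abstract does not say how the four selectors are combined, so the following is only a minimal sketch under an assumed rule: each selector ranks the vocabulary, and terms are kept by average rank. Only DF and CHI are shown; MI and IG could be added to `rank_tables` in the same way. The function names, the toy corpus, and the rank-averaging rule are all illustrative assumptions, not the authors' method.

```python
from collections import Counter

def doc_frequency(docs):
    """DF: number of documents that contain each term."""
    df = Counter()
    for tokens, _ in docs:
        df.update(set(tokens))
    return dict(df)

def chi_square(docs):
    """CHI: for each term, the largest 2x2 chi-square statistic over all classes."""
    n = len(docs)
    classes = {label for _, label in docs}
    vocab = {t for tokens, _ in docs for t in tokens}
    scores = {}
    for term in vocab:
        best = 0.0
        for cls in classes:
            a = sum(1 for toks, y in docs if term in toks and y == cls)
            b = sum(1 for toks, y in docs if term in toks and y != cls)
            c = sum(1 for toks, y in docs if term not in toks and y == cls)
            d = n - a - b - c
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores

def ranks(scores):
    """Map each term to its rank (0 = best); ties are broken alphabetically."""
    ordered = sorted(scores, key=lambda t: (-scores[t], t))
    return {t: i for i, t in enumerate(ordered)}

def hybrid_select(docs, k):
    """Assumed combination rule: average the per-selector ranks and keep the top k."""
    rank_tables = [ranks(doc_frequency(docs)), ranks(chi_square(docs))]
    avg = {t: sum(r[t] for r in rank_tables) / len(rank_tables)
           for t in rank_tables[0]}
    return sorted(avg, key=lambda t: (avg[t], t))[:k]

# Toy corpus (hypothetical): "win" is frequent but appears in both classes.
docs = [
    (["goal", "match", "team"], "sport"),
    (["team", "goal", "win"], "sport"),
    (["stock", "market", "win"], "finance"),
    (["market", "stock", "bank"], "finance"),
]
selected = hybrid_select(docs, 4)
```

On this toy corpus, "win" scores well on DF but has a chi-square of zero, so the combined ranking drops it in favor of terms that are both frequent and class-specific.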

2012 ◽  
Vol 6-7 ◽  
pp. 576-582
Author(s):  
Ping Li ◽  
Ming Liang Cui ◽  
Zhen Shan Hou ◽  
Liu Liu Wei ◽  
Wen Hao Ying ◽  
...  

Session segmentation not only contributes to deeper analysis of users' search behavior but also serves as the foundation for other retrieval research based on users' complicated search behaviors. This paper proposes a session boundary discrimination model that uses time interval and query likelihood on the basis of the Naive Bayes model. Compared with previous studies, the proposed model shows a prominent improvement in experiments on three measures: recall, precision, and F-value. Owing to its advantage in session boundary discrimination, the model can serve as a tool in fields such as personalized information retrieval, query suggestion, search activity analysis, and other fields related to improving search results.
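As a rough illustration of the idea, the sketch below trains a categorical Naive Bayes classifier on two binary features per query pair. It is not the paper's model: the 1800-second threshold is an assumption, and a simple shared-word test stands in for the query likelihood feature, whose exact definition the abstract does not give. All names and the toy query log are hypothetical.

```python
import math
from collections import Counter, defaultdict

def featurize(gap_seconds, prev_query, query, gap_threshold=1800):
    """Hypothetical binning: (is the time gap long?, do the queries share a word?)."""
    long_gap = gap_seconds > gap_threshold
    overlap = bool(set(prev_query.split()) & set(query.split()))
    return (long_gap, overlap)

def train(examples):
    """examples: list of (features, is_boundary). Collects counts for categorical NB."""
    class_counts = Counter()
    feat_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for feats, label in examples:
        class_counts[label] += 1
        for i, value in enumerate(feats):
            feat_counts[(i, label)][value] += 1
    return class_counts, feat_counts

def predict(model, feats):
    """Return the class with the highest log posterior under the NB assumption."""
    class_counts, feat_counts = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for label, n in class_counts.items():
        lp = math.log(n / total)
        for i, value in enumerate(feats):
            # Laplace smoothing over the two possible values of each binary feature.
            lp += math.log((feat_counts[(i, label)][value] + 1) / (n + 2))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

# Toy labeled query pairs (hypothetical): True marks a session boundary.
log = [
    (3600, "cheap flights", "python tutorial", True),
    (7200, "weather", "pizza recipe", True),
    (30, "python tutorial", "python list tutorial", False),
    (60, "pizza recipe", "pizza dough recipe", False),
    (2400, "hotel paris", "hotel paris cheap", False),
]
model = train([(featurize(g, p, q), y) for g, p, q, y in log])

new_topic = predict(model, featurize(5000, "news", "laptop deals"))
refinement = predict(model, featurize(20, "laptop deals", "laptop deals cheap"))
```

Here a long gap with no shared words is classified as a boundary, while a quick refinement of the same query is not.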


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 217917-217927
Author(s):  
Dashe Li ◽  
Jiajun Sun ◽  
Huanhai Yang ◽  
Xueying Wang

2020 ◽  
Vol 541 ◽  
pp. 316-331
Author(s):  
Si-Yuan Liu ◽  
Jing Xiao ◽  
Xiao-Ke Xu

2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Mengmeng Wang ◽  
Wanli Zuo ◽  
Ying Wang

Today, microblogging has increasingly become a means of information diffusion through users' retweeting behavior. Because retweeting content, as context information of a microblog, reflects a user's understanding of that microblog, analysis of users' retweeting sentiment tendency has gradually become a hot research topic. Targeting online microblogging, a dynamic social network, we investigate how to exploit dynamic retweeting sentiment features in retweeting sentiment tendency analysis. On the basis of time series of users' network structure information and published text information, we first model dynamic retweeting sentiment features. Then we build Naïve Bayes models from profile-, relationship-, and emotion-based dimensions, respectively. Finally, we build a multilayer Naïve Bayes model based on the multidimensional Naïve Bayes models to analyze a user's retweeting sentiment tendency towards a microblog. Experiments on a real-world dataset demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of dynamic retweeting sentiment features and temporal information in retweeting sentiment tendency analysis. We also offer a new line of thinking for retweeting sentiment tendency analysis in dynamic social networks.


2012 ◽  
Vol 19B (3) ◽  
pp. 195-200
Author(s):  
Jae-Hoon Kim ◽  
Kil-Ho Jeon

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 57868-57880 ◽  
Author(s):  
Longjie Li ◽  
Shijin Xu ◽  
Mingwei Leng ◽  
Shiyu Fang ◽  
Xiaoyun Chen

Author(s):  
Neeraj Saxena ◽  
Ruiyang Wang ◽  
Vinayak V. Dixit ◽  
S. Travis Waller

Driving in congested traffic is a nuisance that not only results in longer travel times but also triggers frustration and impatience among drivers. A few studies have modeled the effects of congested traffic on the resulting route choice behavior of car drivers. These studies used frequentist models such as discrete choice models to analyze large samples. However, they did not compare the inferences obtained from the frequentist and Bayesian approaches, particularly for datasets that are not sufficiently large. Researchers have shown that Bayesian models perform well, especially when the sample size is small. Thus, this paper develops and compares a multinomial logit (frequentist) and a Naïve Bayes (Bayesian) model on a mid-sized dataset of around 100 participants, obtained from a driving simulator experiment, to understand drivers' route choice under stop-and-go traffic. The results show that the prediction power of the Naïve Bayes model is much higher than that of the multinomial logit (MNL) model. The Naïve Bayes model is also found to perform better than machine learning algorithms like the decision tree model. The findings from this study will be useful to researchers and practitioners, who should test both approaches and select the appropriate model, particularly in the case of seemingly large datasets.


Author(s):  
Arun Solanki ◽  
Rajat Saxena

With the advent of neural networks and their subfields, such as deep neural networks and convolutional neural networks, it is possible to make text classification predictions with high accuracy. Among the many subtypes of naive Bayes, multinomial naive Bayes is used for text classification. Many attempts have been made to develop an algorithm that keeps the simplicity of multinomial naive Bayes and at the same time incorporates feature dependency. One such effort is structure extended multinomial naive Bayes, which uses one-dependence estimators to incorporate dependencies. Basically, one-dependence estimators take one of the attributes as features and all other attributes as its children. This chapter proposes self structure extended multinomial naïve Bayes, a hybrid model that combines multinomial naive Bayes and structure extended multinomial naive Bayes. Basically, it tries to classify the instances that were misclassified by structure extended multinomial naive Bayes because there was no direct dependency between attributes.
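For context, the baseline that the chapter builds on, multinomial naive Bayes, can be sketched in a few lines. This shows only the standard model with Laplace smoothing, not the structure extended or self structure extended variants; the function names and the toy reviews are illustrative assumptions.

```python
import math
from collections import Counter

def train_mnb(docs):
    """docs: list of (tokens, label). Collects per-class document and word counts."""
    class_docs = Counter()
    word_counts = {}
    vocab = set()
    for tokens, label in docs:
        class_docs[label] += 1
        word_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def predict_mnb(model, tokens):
    """Pick the class maximizing log P(c) + sum over words of log P(w | c)."""
    class_docs, word_counts, vocab = model
    n_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        lp = math.log(class_docs[label] / n_docs)
        for t in tokens:
            if t in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothing over the vocabulary.
                lp += math.log((word_counts[label][t] + 1)
                               / (total_words + len(vocab)))
        scores[label] = lp
    return max(scores, key=scores.get)

# Toy training set (hypothetical).
reviews = [
    (["great", "movie", "great", "acting"], "pos"),
    (["boring", "movie"], "neg"),
    (["great", "fun"], "pos"),
    (["boring", "plot", "bad"], "neg"),
]
mnb = train_mnb(reviews)
label_a = predict_mnb(mnb, ["great", "plot"])
label_b = predict_mnb(mnb, ["boring", "movie"])
```

Each word contributes independently to the class score here, which is exactly the independence assumption that the structure extended variants try to relax.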

