RANDOM WALK TERM WEIGHTING FOR IMPROVED TEXT CLASSIFICATION

2007 ◽  
Vol 01 (04) ◽  
pp. 421-439 ◽  
Author(s):  
SAMER HASSAN ◽  
RADA MIHALCEA ◽  
CARMEN BANEA

This paper describes a new approach for estimating term weights in a document and shows how the new weighting scheme can be used to improve the accuracy of a text classifier. The method uses term co-occurrence as a measure of dependency between word features. A random walk model is applied to a graph encoding words and their co-occurrence dependencies, producing scores that quantify how much a particular word feature contributes to a given context. Experiments on three standard classification datasets show that the new random-walk-based approach outperforms the traditional term frequency approach to feature weighting.
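The idea in this abstract can be illustrated with a minimal sketch: build an undirected graph whose nodes are words and whose edges link words that co-occur within a small window, then run a PageRank-style random walk over it. This is not the authors' implementation. The function name `random_walk_term_weights`, the window size, the damping factor of 0.85, and the fixed iteration count are all assumptions made for illustration; the abstract does not state them.

```python
from collections import defaultdict


def random_walk_term_weights(tokens, window=2, damping=0.85, iterations=50):
    """Score terms with a PageRank-style walk over a co-occurrence graph.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    # Link each term to the distinct terms that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            other = tokens[j]
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)

    vocab = set(tokens)
    scores = {term: 1.0 for term in vocab}

    # Repeatedly redistribute each term's score evenly among its neighbors.
    for _ in range(iterations):
        updated = {}
        for term in vocab:
            incoming = sum(scores[n] / len(neighbors[n]) for n in neighbors[term])
            updated[term] = (1 - damping) + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    doc = ["term", "weight", "graph", "weight", "walk", "weight"]
    for term, score in sorted(random_walk_term_weights(doc, window=1).items()):
        print(f"{term}: {score:.3f}")
```

In this toy document, "weight" co-occurs with every other word, so it ends up with the highest score. The resulting scores could stand in for raw term frequency when building a feature vector, which is the substitution the abstract describes.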

2010 ◽  
Vol 121-122 ◽  
pp. 996-1001
Author(s):  
Shou Hui Pan ◽  
Li Wang ◽  
Ying Cheng Xu ◽  
Guo Ping Xia

Web text classification is one of the fundamental techniques of web mining and plays an important role in web mining systems. This paper proposes an improved term weighting method that considers the location of a term, in addition to its frequency, when calculating the term's weight. Web pages are divided into four text blocks, each with its own location weight. Experimental results show that the improved term weighting method achieves higher precision than the traditional term weighting method.
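A location-aware term weight of the kind this abstract describes might look like the sketch below, where each occurrence of a term contributes the weight of the block it appears in rather than a flat count of 1. The abstract does not name the four blocks or give their weights, so the block names (`title`, `headings`, `anchor`, `body`), the numeric weights, and the function name `location_weighted_tf` are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical block names and location weights; the abstract only says
# that pages are divided into four weighted text blocks.
BLOCK_WEIGHTS = {"title": 4.0, "headings": 3.0, "anchor": 2.0, "body": 1.0}


def location_weighted_tf(blocks, block_weights=BLOCK_WEIGHTS):
    """Sum, for each term, the location weight of every block occurrence.

    `blocks` maps a block name to the text found in that block.
    """
    weights = Counter()
    for name, text in blocks.items():
        for term in text.lower().split():
            weights[term] += block_weights[name]
    return dict(weights)


if __name__ == "__main__":
    page = {"title": "web mining", "body": "web text classification"}
    print(location_weighted_tf(page))
```

With uniform block weights of 1.0 this reduces to plain term frequency, which makes the traditional scheme a special case of the location-aware one.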


2010 ◽  
Vol 33 (8) ◽  
pp. 1418-1426 ◽  
Author(s):  
Wei ZHENG ◽  
Chao-Kun WANG ◽  
Zhang LIU ◽  
Jian-Min WANG
