A Comparative Study on Feature Selection in Chinese Text Classification Problem
2013 ◽ Vol 380-384 ◽ pp. 2854-2857
The information explosion poses many challenges for text classification. The curse of dimensionality sharply increases computational complexity and lowers classification accuracy, so it is critical to apply feature selection before the actual classification. Automatic classification of English text has been studied for many years, but Chinese text has received little attention. In this paper, several classic feature selection methods, namely term frequency (TF), information gain (IG) and the chi-square statistic (CHI), are compared on Chinese text classification. We also take imbalanced data into consideration. Experimental results show that CHI performs better than IG and TF when the dataset is imbalanced, but that there is no obvious difference among the methods on balanced data.
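To make the comparison concrete, the following is a minimal sketch of how TF and CHI rank terms differently. It is not the paper's implementation: the toy corpus, the class labels, and the helper names `chi_square` and `rank_terms` are hypothetical, the documents are assumed to be already segmented into words, and the CHI score uses the standard 2x2 term/class contingency formula.

```python
from collections import Counter


def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 term/class contingency table.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        # A term present in every document (or in none) carries no signal.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_terms(docs, labels, target):
    """Rank terms by raw term frequency and by CHI for one target class."""
    tf = Counter()      # total occurrences of each term
    df_in = Counter()   # document frequency inside the target class
    df_out = Counter()  # document frequency outside the target class
    n_in = sum(1 for y in labels if y == target)
    n_out = len(labels) - n_in
    for tokens, y in zip(docs, labels):
        tf.update(tokens)
        for term in set(tokens):
            if y == target:
                df_in[term] += 1
            else:
                df_out[term] += 1
    chi = {
        term: chi_square(df_in[term], df_out[term],
                         n_in - df_in[term], n_out - df_out[term])
        for term in tf
    }
    # Ties are broken alphabetically so the ranking is deterministic.
    tf_rank = sorted(tf, key=lambda t: (-tf[t], t))
    chi_rank = sorted(chi, key=lambda t: (-chi[t], t))
    return tf_rank, chi_rank, chi


# Hypothetical toy corpus, assumed to be pre-segmented into words.
docs = [
    ["股票", "市场", "的"],
    ["股票", "上涨", "的"],
    ["球队", "比赛", "的"],
    ["球队", "夺冠", "的"],
]
labels = ["finance", "finance", "sports", "sports"]

tf_rank, chi_rank, chi = rank_terms(docs, labels, target="finance")
print("Top term by TF: ", tf_rank[0])
print("Top terms by CHI:", chi_rank[:2])
```

In this toy corpus the function word "的" is the most frequent term, so TF ranks it first, while CHI gives it a score of zero because it appears equally in both classes. This only illustrates the general behavior of the two scores and says nothing about the paper's experimental setup or results.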