Comparison and Improvements of Feature Extraction Methods for Text Categorization
2014 ◽ Vol 599-601 ◽ pp. 1824-1828
Feature extraction is a key step in text categorization[1]: the quality of the extracted features directly affects classification accuracy. This paper introduces and compares four commonly used text feature extraction methods, IG (information gain), MI (mutual information), CHI (the chi-square statistic), and DF (document frequency), and proposes an improved method based on CHI. Experimental results show that the proposed method improves the accuracy of text categorization.
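As an illustration of the four baseline criteria named in the abstract, the following Python sketch computes them from the usual term/class contingency counts. It uses the common textbook definitions and is not the paper's improved CHI variant, which the abstract does not specify. The function names, the toy corpus, and the binary class-versus-rest form of IG are assumptions made for this example.

```python
import math


def contingency(docs, term, cls):
    """Count documents by term presence and class membership.

    Returns (a, b, c, d):
      a = in cls, contains term       b = not in cls, contains term
      c = in cls, lacks term          d = not in cls, lacks term
    """
    a = b = c = d = 0
    for tokens, label in docs:
        present = term in tokens
        in_cls = label == cls
        if present and in_cls:
            a += 1
        elif present:
            b += 1
        elif in_cls:
            c += 1
        else:
            d += 1
    return a, b, c, d


def document_frequency(a, b, c, d):
    """DF: number of documents that contain the term."""
    return a + b


def chi_square(a, b, c, d):
    """CHI: chi-square statistic of term/class dependence."""
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0


def mutual_information(a, b, c, d):
    """MI: pointwise mutual information between term and class (natural log)."""
    n = a + b + c + d
    if a == 0:
        return float("-inf")  # term never co-occurs with the class
    return math.log(a * n / ((a + c) * (a + b)))


def _entropy(counts):
    """Shannon entropy in bits of a list of counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(k / total * math.log2(k / total) for k in counts if k)


def information_gain(a, b, c, d):
    """IG: class entropy minus class entropy conditioned on term presence.

    Assumption: the classes are reduced to 'cls' versus 'everything else'.
    """
    n = a + b + c + d
    h_class = _entropy([a + c, b + d])
    h_given_term = (a + b) / n * _entropy([a, b]) + (c + d) / n * _entropy([c, d])
    return h_class - h_given_term


# Hypothetical toy corpus: (set of tokens, class label).
docs = [
    ({"ball", "goal"}, "sport"),
    ({"ball", "team"}, "sport"),
    ({"vote", "team"}, "politics"),
    ({"vote", "law"}, "politics"),
]

counts = contingency(docs, "ball", "sport")   # (2, 0, 0, 2)
print(document_frequency(*counts))            # 2
print(chi_square(*counts))                    # 4.0
print(information_gain(*counts))              # 1.0
```

In this toy corpus "ball" separates the two classes perfectly, so CHI and IG reach their maximum for four documents, while "team" appears equally in both classes and scores zero on CHI, MI, and IG. DF ignores the class labels entirely, which is why it is the cheapest of the four criteria and also the least discriminative.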