Text Classification Combining an Improved CHI Method and the Category Relevance Factor
Text classification is the task of assigning natural language documents to predefined categories based on their context. The main concern of this paper is to improve the accuracy of a text classification system by combining an improved CHI method with the category relevance factor (CRF). First, an improved CHI method is used to select features from the raw feature set in order to reduce the feature dimensionality. Second, feature weights are calculated with the TF-CRF method, which takes into account that features are distributed differently across categories. Finally, we carry out a series of experiments comparing our method with other methods in terms of the F1-measure. Experimental results show that the proposed method yields a marked improvement in the F1-measure in all categories.
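To make the feature selection step concrete, the following is a minimal sketch of the standard (unimproved) CHI statistic, which an improved CHI method would take as its starting point. It is not the improved variant proposed in this paper, whose details are not given here. The function names `chi_square` and `rank_terms`, the toy corpus, and the use of document-level counts are assumptions made purely for illustration.

```python
from collections import Counter


def chi_square(a, b, c, d):
    """Standard chi-square statistic for one term and one category.

    a: documents in the category that contain the term
    b: documents outside the category that contain the term
    c: documents in the category that do not contain the term
    d: documents outside the category that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # The term or the category is degenerate (e.g. the term occurs everywhere).
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def rank_terms(docs, labels, category):
    """Rank all terms by their chi-square score with respect to one category.

    docs is a list of token lists and labels the matching list of category
    names (both hypothetical input formats chosen for this sketch).
    """
    n_in = sum(1 for y in labels if y == category)
    n_out = len(labels) - n_in
    in_df, out_df = Counter(), Counter()
    for tokens, y in zip(docs, labels):
        # Count each term once per document (document frequency).
        target = in_df if y == category else out_df
        target.update(set(tokens))
    scores = {}
    for term in set(in_df) | set(out_df):
        a, b = in_df[term], out_df[term]
        scores[term] = chi_square(a, b, n_in - a, n_out - b)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpus, invented for illustration only.
docs = [["goal", "match"], ["goal", "team"], ["stock", "market"], ["stock", "team"]]
labels = ["sport", "sport", "finance", "finance"]
ranked = rank_terms(docs, labels, "sport")
```

In this toy corpus, "goal" and "stock" receive the same score for the category "sport", even though "stock" only appears outside it. This happens because the standard statistic does not distinguish positive from negative correlation. Keeping the top-ranked terms then gives the reduced feature set that a weighting scheme such as TF-CRF would operate on.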