Text Classification Based on Enriched Vector Space Model

Author(s):  
Tsvetanka Georgieva-Trifonova
IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 166578-166592
Author(s):  
Surender Singh Samant ◽  
N. L. Bhanu Murthy ◽  
Aruna Malapati

2013 ◽  
Vol 347-350 ◽  
pp. 2856-2859
Author(s):  
Jun Hui Pan ◽  
Hui Li

This paper proposes a text classification method based on a fuzzy vector space model and neural networks, motivated by the problem that a text can belong to several categories at once. The method uses fuzzy theory to treat the position at which a feature item occurs in the text as a degree of importance (membership) reflecting the text's subject, and takes this position information fully into account when features are extracted. The fuzzy feature vectors constructed in this way bring the classification closer to manual classification. The network consists of an input layer, a hidden layer, and an output layer: the input layer receives the classification samples, the hidden layer extracts the implicit pattern features of the input samples, and the output layer outputs the classification results. Finally, experiments on documents from Wan Fang data demonstrate the effectiveness of the method.
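The position-based membership idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the position names and membership values in `POSITION_MEMBERSHIP`, the function `fuzzy_feature_vector`, and the normalization by the largest weight are all assumptions, since the abstract does not give the actual membership function.

```python
from collections import defaultdict

# Hypothetical membership degrees per position; the abstract gives no values.
POSITION_MEMBERSHIP = {"title": 1.0, "abstract": 0.8, "body": 0.5}


def fuzzy_feature_vector(sections, vocabulary):
    """Build a position-weighted feature vector for one document.

    sections: dict mapping a position name to the list of tokens found there.
    vocabulary: ordered list of feature terms defining the vector dimensions.
    """
    weights = defaultdict(float)
    for position, tokens in sections.items():
        membership = POSITION_MEMBERSHIP.get(position, 0.0)
        for token in tokens:
            # Each occurrence contributes the membership of its position.
            weights[token] += membership

    # Scale into [0, 1] so the largest feature weight equals 1 (an assumption).
    peak = max((weights.get(term, 0.0) for term in vocabulary), default=0.0)
    if peak == 0.0:
        return [0.0 for _ in vocabulary]
    return [weights.get(term, 0.0) / peak for term in vocabulary]


document = {
    "title": ["fuzzy", "classification"],
    "body": ["classification", "network", "network"],
}
vocabulary = ["fuzzy", "classification", "network", "svm"]
vector = fuzzy_feature_vector(document, vocabulary)
```

A vector built this way would be the input to the three-layer network described in the abstract; a term appearing once in the title counts as much as the same term appearing twice in the body under these assumed values.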


Author(s):  
Jinguo Sang ◽  
Shanchen Pang ◽  
Yang Zha ◽  
Fan Yang

The amount of information in the Internet of Things is growing explosively as more and more data are collected by large numbers of sensors. This growth makes it difficult to access information efficiently, and text classification is an effective way to reduce the amount of information that must be transferred over the network. This paper proposes a new text classification algorithm based on the vector space model. The algorithm improves the feature selection and weighting methods of traditional text classification algorithms by introducing synonym replacement. The experimental results show that the proposed algorithm considerably improves the precision and recall of classification.
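A rough sketch of synonym replacement ahead of vector construction is shown below. The synonym table `SYNONYMS` and the helper names are hypothetical; the abstract does not say which thesaurus is used or how replacement interacts with feature selection and weighting.

```python
from collections import Counter

# Hypothetical synonym table mapping each variant to one canonical term.
SYNONYMS = {"automobile": "car", "vehicle": "car", "picture": "image"}


def normalize(tokens, synonyms=SYNONYMS):
    """Replace every token that has a known synonym with its canonical term."""
    return [synonyms.get(token, token) for token in tokens]


def term_frequencies(tokens):
    """Count terms after synonym replacement, merging synonymous features."""
    return Counter(normalize(tokens))


tf = term_frequencies(["car", "automobile", "picture", "sensor"])
```

Merging synonyms into a single dimension shrinks the feature space and concentrates counts that would otherwise be split across several sparse features.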


A term weighting scheme (TWS) is a key component of the matching mechanism when using the vector space model in the context of information retrieval (IR) from text documents. This paper describes a new approach to term weighting that improves classification performance. We propose an effective term weighting scheme that gives the highest accuracy among the text classification methods compared. We compared the performance of KNN and Naïve Bayes classification under different weighting methods, weighting by information gain, SVM, and the proposed method. We implemented several term-weighting methods (TWM) on Amazon data collections in combination with information gain and the SVM, KNN, and Naïve Bayes algorithms.
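For orientation, the snippet below sketches standard TF-IDF, a common baseline term weighting scheme for the vector space model. It is not the scheme proposed in the paper, which the abstract does not specify; the function name `tf_idf` and the toy documents are illustrative assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document using plain TF-IDF.

    documents: list of token lists.
    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weighted = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weighted.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


docs = [["good", "phone"], ["bad", "phone"]]
weights = tf_idf(docs)
```

A term that occurs in every document, such as "phone" here, receives weight zero, which is why schemes of this kind favour terms that discriminate between documents.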


2014 ◽  
Vol 644-650 ◽  
pp. 2206-2210
Author(s):  
Kun Zhou ◽  
Ya Ping Dai ◽  
Feng Gao ◽  
Ji Hong Zou

Using the word-segmentation technology of the TRIP database, with every word that appears in the database counted in detail, a self-constructed category dictionary (SCC-dictionary) for Chinese text classification is proposed. To address the high dimensionality and sparseness of the vector space model, a four-dimensional feature vector space model (FFVSM) is presented. The text classifier is designed with the Support Vector Machine (SVM) algorithm. Experimental results show two achievements: first, the SCC-dictionary can replace a manually written dictionary with the same effect; second, the FFVSM not only reduces the computing load compared with a high-dimensional feature vector space model, but also maintains a classification precision of 86.87%, a recall of 95.12%, and an F1 value of 90.81%.
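The reported figures can be cross-checked with the usual definition of F1 as the harmonic mean of precision and recall. The helper `f1_score` below is an illustrative name, not something taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
f1 = f1_score(0.8687, 0.9512)
print(f"{f1:.4f}")  # prints 0.9081
```

The result agrees with the F1 value of 90.81% stated in the abstract.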

