Boosting KNN text classification accuracy by using supervised term weighting schemes

Author(s):  
Iyad Batal ◽  
Milos Hauskrecht
IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 166578-166592
Author(s):  
Surender Singh Samant ◽  
N. L. Bhanu Murthy ◽  
Aruna Malapati

2017 ◽  
Vol 58 ◽  
pp. 193-206 ◽  
Author(s):  
Thabit Sabbah ◽  
Ali Selamat ◽  
Md Hafiz Selamat ◽  
Fawaz S. Al-Anzi ◽  
Enrique Herrera Viedma ◽  
...  

2021 ◽  
Author(s):  
Chuanxiao Li ◽  
Wenqiang Li ◽  
Zhong Tang ◽  
Song Li ◽  
Hai Xiang

Abstract As a vital step of the text classification (TC) task, the assignment of term weights has a great influence on TC performance. Many term weighting schemes are currently available, such as term frequency-inverse document frequency (TF-IDF) and term frequency-relevance frequency (TF-RF), and each consists of a local part (TF) and a global part (e.g., IDF, RF). However, most of these schemes apply logarithmic processing to their global parts, and it is natural to ask whether logarithmic processing suits all of them. In practice, because logarithmic processing changes the ratio of local weight to global weight, a given term weighting scheme usually yields differing text classification results on different text sets, which indicates poor robustness. To explore how logarithmic processing of the global weight influences the classification results of term weighting schemes, TF-RF is selected as the representative because it achieves comparatively good performance among the schemes that adopt logarithmic processing. Then, two propositions, along with corresponding methods, about the relation between the TF part and the RF part are proposed based on TF-RF. In addition, two groups of experiments are conducted on the two methods. The first group of experiments shows that one method (denoted as TF-ERF) improves performance more than the other (denoted as ETF-RF). The second group of experiments shows that TF-ERF not only outperforms TF-RF but also obtains better performance than other existing term weighting schemes.
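To make the local/global split mentioned in the abstract concrete, the following is a minimal Python sketch of the two baseline schemes it names. It assumes the commonly cited definitions, IDF as log(N / df) and RF as log2(2 + a / max(1, c)), where a and c are the numbers of positive-category and negative-category documents containing the term; these formulas are not stated in the abstract itself, and the function and parameter names are illustrative. The abstract does not define TF-ERF or ETF-RF, so they are not sketched here.

```python
import math


def idf(n_docs: int, doc_freq: int) -> float:
    """Global part of TF-IDF: log(N / df). Assumed textbook form."""
    return math.log(n_docs / doc_freq)


def rf(pos_with_term: int, neg_with_term: int) -> float:
    """Global part of TF-RF: log2(2 + a / max(1, c)). Assumed standard form.

    pos_with_term (a): positive-category documents containing the term.
    neg_with_term (c): negative-category documents containing the term.
    """
    return math.log2(2 + pos_with_term / max(1, neg_with_term))


def tf_idf(tf: float, n_docs: int, doc_freq: int) -> float:
    """Local part (raw term frequency) times the IDF global part."""
    return tf * idf(n_docs, doc_freq)


def tf_rf(tf: float, pos_with_term: int, neg_with_term: int) -> float:
    """Local part (raw term frequency) times the RF global part."""
    return tf * rf(pos_with_term, neg_with_term)


if __name__ == "__main__":
    # A term seen twice in a document, present in 6 positive and 1 negative documents.
    print(tf_rf(2, 6, 1))
    # The same term under TF-IDF in a 100-document collection where 7 documents contain it.
    print(tf_idf(2, 100, 7))
```

Because both global parts are logarithmic, a large change in the underlying ratio produces only a small change in the final weight, which is the local-to-global balance the abstract sets out to examine.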

