A Comprehensive Study for the Hindi Language to Implement Supervised Text Classification Techniques

Author(s): Vijay Kumar Soni, Smita Selot
2014, Vol 30 (8), pp. 1120-1128
Author(s): Ha X. Dang, Christopher B. Lawrence

2019, Vol 3 (4), pp. 53
Author(s): Ahmad Hawalah

Text classification is the process of assigning textual content to a set of predefined classes or categories. As enormous numbers of documents and contextual content appear on the Internet every day, text classification techniques have become essential for purposes such as improving search retrieval and recommendation systems. Much work has examined different aspects of English text classification. However, little attention has been devoted to Arabic text classification because of the difficulty of processing the Arabic language. In this paper, we therefore propose an enhanced Arabic topic-discovery architecture (EATA) that uses an ontology to provide an effective Arabic topic classification mechanism. We introduce a semantic enhancement model that improves Arabic text classification and topic discovery by exploiting the rich semantic information in an Arabic ontology. To classify new Arabic textual documents, we rely on the vector space model with term frequency-inverse document frequency (TF-IDF) weighting together with the cosine similarity measure.
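
The TF-IDF and cosine similarity step mentioned in the abstract can be illustrated with a minimal sketch. This is not the EATA architecture itself: the ontology-based semantic enhancement is omitted, the whitespace tokenizer, the smoothed IDF formula, and the nearest-document decision rule are assumptions, and the tiny English training set (`TRAINING`) is hypothetical and used only for readability. Real Arabic text would need proper normalization and stemming before this step.

```python
import math
from collections import Counter

# Hypothetical labeled documents, for illustration only.
TRAINING = [
    ("the team won the football match", "sports"),
    ("the player scored a goal in the match", "sports"),
    ("the bank raised interest rates", "economy"),
    ("the market and the bank reported growth", "economy"),
]

def tokenize(text):
    # Naive whitespace tokenizer (an assumption, not the paper's preprocessing).
    return text.lower().split()

def build_idf(docs_tokens):
    # Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in training are ignored.
    tf = Counter(t for t in tokens if t in idf)
    total = sum(tf.values())
    if total == 0:
        return {}
    return {term: (count / total) * idf[term] for term, count in tf.items()}

def cosine(u, v):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def classify(text, labeled_docs):
    # Assign the label of the most similar training document.
    docs_tokens = [tokenize(doc) for doc, _ in labeled_docs]
    idf = build_idf(docs_tokens)
    query = tfidf_vector(tokenize(text), idf)
    scored = [
        (cosine(query, tfidf_vector(tokens, idf)), label)
        for tokens, (_, label) in zip(docs_tokens, labeled_docs)
    ]
    return max(scored, key=lambda pair: pair[0])[1]

if __name__ == "__main__":
    print(classify("the goal decided the match", TRAINING))
    print(classify("interest rates at the bank", TRAINING))
```

Comparing against class centroids rather than individual documents would be an equally plausible decision rule; the abstract does not say which one the authors use.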

