Correction to: Benchmarking performance of machine and deep learning-based methodologies for Urdu text document classification

Author(s):  
Muhammad Nabeel Asim ◽  
Muhammad Usman Ghani ◽  
Muhammad Ali Ibrahim ◽  
Waqar Mahmood ◽  
Andreas Dengel ◽  
...  

Algorithms ◽  
2021 ◽  
Vol 14 (7) ◽  
pp. 216
Author(s):  
Abdullah Y. Muaad ◽  
Hanumanthappa Jayappa ◽  
Mugahed A. Al-antari ◽  
Sungyoung Lee

Arabic text classification assigns Arabic text content to the appropriate category based on its context. In this paper, a novel deep learning Arabic text computer-aided recognition (ArCAR) system is proposed to represent and recognize Arabic text at the character level. Each Arabic character in the input text is quantized as a 1D vector, so that the full text forms a 2D array that serves as input to the ArCAR system. The ArCAR system is validated with 5-fold cross-validation tests for two applications: Arabic text document classification and Arabic sentiment analysis. For document classification, the ArCAR system achieves its best performance on the Alarabiya-balance dataset, with overall accuracy, recall, precision, and F1-score of 97.76%, 94.08%, 94.16%, and 94.09%, respectively. For Arabic sentiment analysis, ArCAR achieves its best performance on the balanced hotel Arabic reviews dataset (HARD), with overall accuracy and F1-score of 93.58% and 93.23%, respectively. The proposed ArCAR appears to provide a practical solution for accurate Arabic text representation, understanding, and classification.
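As a rough illustration of the character-level quantization described in this abstract, the sketch below maps each character to a 1D vector and stacks those vectors into a 2D array. It assumes a one-hot encoding, a small alphabet of basic Arabic letters plus the space character, and a fixed length of 16 characters. None of these details is given in the abstract, and the names `ALPHABET`, `CHAR_INDEX`, and `quantize` are hypothetical.

```python
# Hypothetical alphabet: the basic Arabic letters plus a space.
# The alphabet and sequence length actually used by ArCAR are not stated
# in the abstract, so both are assumptions made for this sketch.
ALPHABET = "ابتثجحخدذرزسشصضطظعغفقكلمنهوي "
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def quantize(text, max_len=16):
    """Return a max_len x len(ALPHABET) matrix with one one-hot row per character.

    Characters outside ALPHABET become all-zero rows, and texts shorter
    than max_len are padded with all-zero rows.
    """
    rows = []
    for ch in text[:max_len]:
        row = [0] * len(ALPHABET)
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    while len(rows) < max_len:
        rows.append([0] * len(ALPHABET))
    return rows


text = "كتاب جديد"
matrix = quantize(text)
print(len(matrix), len(matrix[0]))  # rows (characters) x columns (alphabet size)
```

A 2D array of this shape is the kind of input a character-level convolutional network would consume. Whether ArCAR uses exactly this encoding is an assumption.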


Author(s):  
Emmanuel Buabin

The objective is to design the classification unit of an intelligent recommender system using hybrid neural techniques. In particular, a neuroscience-based hybrid neural classifier by Buabin (2011a) is introduced, explained, and examined for its potential in real-world text document classification on the ModApte version of the Reuters news text corpus. The model (termed Hy-RNC) is fully integrated with a novel boosting algorithm to improve text document classification. Hy-RNC outperforms existing works and opens up an entirely new research field in the area of machine learning. The main contribution of this book chapter is a step-by-step approach to modeling the hybrid system using underlying concepts such as boosting algorithms, recurrent neural networks, and hybrid neural systems. The experimental results show impressive performance by the hybrid neural classifier, even with a minimal number of neurons in its constituent structures.


2018 ◽  
Vol 5 (4) ◽  
pp. 1-31
Author(s):  
Shalini Puri ◽  
Satya Prakash Singh

In recent years, many information retrieval, character recognition, and feature extraction methodologies have been proposed for Devanagari, and especially for Hindi, across different domains. Given the enormous volume of scanned data now available, and to advance existing Hindi automated systems beyond optical character recognition, this article introduces the idea of a classification system for printed and handwritten Hindi documents that uses a support vector machine and fuzzy logic. The system first pre-processes textual document images and then classifies them into predefined categories. The article presents a feasibility study of such systems and their relevance to Hindi, a survey of statistical measurements of Hindi keywords obtained from different sources, and the inherent challenges found in printed and handwritten documents. Technical reviews are provided and represented graphically to compare many parameters and to assess the contents, forms, and classifiers used in various existing techniques.


2014 ◽  
Vol 905 ◽  
pp. 528-532
Author(s):  
Hoan Manh Dau ◽  
Ning Xu

Text document classification is the task of analyzing the content of a text document and then deciding (or predicting) which of a given set of groups the document belongs to. There are many classification techniques, such as Naive Bayes, decision trees, k-nearest neighbors (KNN), neural networks, and the Support Vector Machine (SVM). Among these techniques, SVM is considered a popular and powerful one, and it is especially suitable for classifying large, high-dimensional data. Text documents have a very large number of dimensions, and selecting features before classification affects the classification results. The Support Vector Machine is a very effective method in this field. This article studies the Support Vector Machine and applies it to the problem of text document classification. The study shows that the Support Vector Machine with features chosen by singular value decomposition (SVD) performs better than the other methods compared, including the decision tree.
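To make the SVM part of this abstract concrete, here is a minimal sketch of a linear SVM trained by sub-gradient descent on the regularized hinge loss, using bag-of-words features. It is an illustration under stated assumptions and not the authors' implementation: the SVD feature-selection step is omitted, the four training documents are invented toy data, and the helper names and hyperparameters (`lr`, `lam`, `epochs`) are hypothetical.

```python
def bag_of_words(docs, vocab=None):
    """Turn whitespace-tokenized documents into word-count vectors."""
    if vocab is None:
        words = sorted({w for d in docs for w in d.split()})
        vocab = {w: i for i, w in enumerate(words)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.split():
            if w in vocab:  # words unseen during training are ignored
                v[vocab[w]] += 1.0
        vectors.append(v)
    return vectors, vocab


def train_linear_svm(X, y, lr=0.1, lam=0.01, epochs=20):
    """Sub-gradient descent on the L2-regularized hinge loss; labels in {-1, +1}."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # shrink from the L2 penalty
            if margin < 1.0:  # example is misclassified or inside the margin
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy data (invented for this sketch): +1 = sports, -1 = finance.
train_docs = [
    "team wins football match",
    "bank raises interest rates",
    "football team signs player",
    "bank cuts interest rates",
]
train_labels = [1, -1, 1, -1]

X, vocab = bag_of_words(train_docs)
w, b = train_linear_svm(X, train_labels)

X_new, _ = bag_of_words(["football match tonight", "interest rates rise"], vocab)
predictions = [predict(w, b, x) for x in X_new]
print(predictions)
```

In a setup closer to the one the abstract describes, the word-count vectors would first be projected onto a small number of singular vectors of the term-document matrix, and the SVM would be trained on those reduced features.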

