Arabic Text Mining: A Systematic Review of the Published Literature 2002-2014

Author(s):  
Hind Al-Mahmoud ◽  
Muna Al-Razgan
Author(s):  
Nourah F. Bin Hathlian ◽  
Alaaeldin M. Hafez

Designing Arabic text mining systems for social media posts is becoming a significant and attractive research area, since such systems serve and enhance the knowledge needed in various domains. The main focus of this paper is to propose a novel framework that combines sentiment analysis with subjective analysis of Arabic social media posts to determine whether people are interested or not interested in a defined subject. For this purpose, text classification methods, including preprocessing and machine learning mechanisms, are applied. The performance of the framework is tested using Twitter as a data source, where possible volunteers on a certain subject are identified based on their posted tweets along with their subject-related information. Twitter is chosen because of its popularity and its rich content among online microblogging services. The results obtained are promising, with an accuracy of 89%, and encourage further research.
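The abstract above names preprocessing as one stage of the framework but does not say which steps are used. As an illustration only, the following sketch shows one common kind of Arabic tweet preprocessing: stripping diacritics and tatweel, unifying alef variants, and keeping only runs of Arabic letters. The function names and the choice of steps are assumptions for this example, not the authors' method.

```python
import re

# Assumed normalization steps (not taken from the paper):
# remove short-vowel marks (tashkeel, U+064B..U+0652) and tatweel (U+0640).
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Map alef with madda / hamza above / hamza below to bare alef (U+0627).
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")
# A token is a run of characters in the basic Arabic letter range.
_ARABIC_RUN = re.compile(r"[\u0621-\u064A]+")


def normalize_arabic(text: str) -> str:
    """Strip diacritics and tatweel, then unify alef variants."""
    text = _DIACRITICS.sub("", text)
    return _ALEF_VARIANTS.sub("\u0627", text)


def tokenize(text: str) -> list[str]:
    """Return normalized Arabic tokens; mentions, URLs and Latin text are dropped."""
    return _ARABIC_RUN.findall(normalize_arabic(text))


if __name__ == "__main__":
    # Hypothetical tweet: a vocalized Arabic word followed by a mention and a URL.
    tweet = "\u0623\u064E\u062D\u0652\u0645\u064E\u062F @user http://example.com"
    print(tokenize(tweet))
```

The resulting tokens could then be fed to any text classifier; which classifier the framework uses is not stated in the abstract.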


2018 ◽  
Vol 22 (7) ◽  
pp. 1471-1488 ◽  
Author(s):  
Antonio Usai ◽  
Marco Pironti ◽  
Monika Mital ◽  
Chiraz Aouina Mejri

Purpose: The aim of this work is to increase awareness of the potential of text mining to discover knowledge, and to further promote research collaboration between the knowledge management and information technology communities. Since its emergence, text mining has involved multidisciplinary studies, focused primarily on database technology, Web-based collaborative writing, text analysis, machine learning and knowledge discovery. However, owing to the large amount of research in this field, it is becoming increasingly difficult to identify existing studies and therefore to suggest new topics.

Design/methodology/approach: This article offers a systematic review of 85 academic outputs (articles and books) focused on knowledge discovery derived from the text mining technique. The systematic review is conducted by applying “text mining at the term level, in which knowledge discovery takes place on a more focused collection of words and phrases that are extracted from and label each document” (Feldman et al., 1998, p. 1).

Findings: The results reveal that the keywords associated with the main labels, i.e., knowledge discovery and text mining, can be grouped into two periods. From 1998 to 2009, the terms knowledge and text were always used. From 2010 to 2017, in addition to these terms, sentiment analysis, review manipulation, microblogging data and knowledgeable users were frequently used. The terms from the first period are technical and engineering-oriented in nature, whereas a diverse range of fields such as business, marketing and finance emerged from 2010 to 2017, owing to a greater interest in the online environment.

Originality/value: This is a first comprehensive systematic review of knowledge discovery and text mining conducted using a text mining technique at the term level. It helps to reduce redundant research and to avoid the possibility of missing relevant publications.


Author(s):  
Ibtissam El Hassani ◽  
Abdelaziz Kriouile ◽  
Youssef BenGhabrit

2015 ◽  
Vol 4 (1) ◽  
Author(s):  
Alison O’Mara-Eves ◽  
James Thomas ◽  
John McNaught ◽  
Makoto Miwa ◽  
Sophia Ananiadou

2014 ◽  
Vol 41 (16) ◽  
pp. 7653-7670 ◽  
Author(s):  
Arman Khadjeh Nassirtoussi ◽  
Saeed Aghabozorgi ◽  
Teh Ying Wah ◽  
David Chek Ling Ngo

2012 ◽  
Vol 11 (01) ◽  
pp. 1250006 ◽  
Author(s):  
Fadi Thabtah ◽  
Omar Gharaibeh ◽  
Rashid Al-Zubaidy

A well-known problem in the domain of text mining is text classification, which concerns mapping textual documents into one or more predefined categories based on their content. Text classification has recently attracted many researchers because of the massive amounts of online documents and text archives that hold essential information for decision-making processes. Most research in this field focuses on classifying English documents, while only limited studies have been conducted on other languages such as Arabic. In this respect, the paper investigates the problem of Arabic text classification comprehensively. More specifically, the study measures the performance of different rule-based classification approaches, adopted from machine learning and data mining, on the problem of Arabic text classification. In particular, four rule-based classification approaches, Decision Trees (C4.5), Rule Induction (RIPPER), Hybrid (PART) and Simple Rule (One Rule), are evaluated against the published Corpus of Contemporary Arabic text collection. The experiments are carried out using a modified version of the WEKA business intelligence tool. By analysing the results of the experiments, we determine the most suitable classification algorithms for classifying Arabic texts.
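To make the simplest of the four approaches concrete, here is a minimal sketch of the One Rule idea: for each feature, predict the majority class for each of its values, then keep the single feature whose rule makes the fewest training errors. This is a from-scratch illustration on hypothetical toy data (binary term-presence features), not the modified WEKA implementation used in the study.

```python
from collections import Counter, defaultdict


def train_one_rule(rows, labels):
    """Return (feature_index, {feature_value: label}, default_label)."""
    # Fallback label for feature values never seen during training.
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for f in range(len(rows[0])):
        # Count class frequencies for every value of feature f.
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            counts[row[f]][label] += 1
        # The rule predicts the majority class for each value.
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        # Training errors: instances not in their value's majority class.
        errors = sum(sum(c.values()) - max(c.values()) for c in counts.values())
        if best is None or errors < best[0]:
            best = (errors, f, rule)
    _, feature, rule = best
    return feature, rule, default


def predict_one_rule(model, row):
    """Apply the single-feature rule to one instance."""
    feature, rule, default = model
    return rule.get(row[feature], default)


if __name__ == "__main__":
    # Hypothetical data: each column marks whether an (unnamed) term occurs.
    rows = [(1, 0), (1, 1), (0, 1), (0, 0)]
    labels = ["sports", "sports", "politics", "politics"]
    model = train_one_rule(rows, labels)
    print(model[0], predict_one_rule(model, (1, 1)))
```

On this toy data the first feature separates the two classes perfectly, so it is the one selected. In a real text classification setting the features would come from a document-term representation with many more columns.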

