Cloud computing architecture for Tagging Arabic Text Using Hybrid Model

2021
pp. 141-151
Author(s):
Wasin Alkishri
Mohammed Almutoory

With the increasing role of technology in transferring information in our daily lives, Arabic has become the fourth most used language on the Internet. Therefore, to develop information systems in Arabic, we need to determine the syntax and semantics of a text efficiently and accurately. Part-of-speech (POS) tagging is one of the primary methods employed to develop any language corpus. Each language has its own set of tags, which are applied in different applications such as natural language processing (NLP), speech synthesis, and information extraction. One of the main benefits of adopting cloud computing services is that they store company data at lower cost and in less time than traditional methods. This paper presents and deploys a cloud computing architecture for tagging Arabic text using a hybrid model, which helps reduce effort and cost. The results show an excellent accuracy rate in tagging Arabic text and a quick response time. Previous studies, which achieved accuracy, precision, and recall rates of more than 95%, are compared based on relevant rating factors. The cloud computing tagger attained an accuracy of 99.2%.
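As a rough illustration of the metrics named in this abstract, the sketch below computes token-level accuracy and per-tag precision and recall for a POS tagger. The function name `tagging_scores`, the tag labels, and the toy data are assumptions made for this example; they are not taken from the paper, and the paper's own evaluation procedure may differ.

```python
from collections import Counter


def tagging_scores(gold, predicted):
    """Return overall accuracy and per-tag (precision, recall).

    `gold` and `predicted` are equal-length lists of tag strings,
    one entry per token.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    correct = Counter()                 # correctly predicted tokens per tag
    gold_count = Counter(gold)          # how often each tag occurs in the reference
    pred_count = Counter(predicted)     # how often the tagger emitted each tag
    for g, p in zip(gold, predicted):
        if g == p:
            correct[g] += 1

    accuracy = sum(correct.values()) / len(gold) if gold else 0.0
    per_tag = {}
    for tag in set(gold) | set(predicted):
        precision = correct[tag] / pred_count[tag] if pred_count[tag] else 0.0
        recall = correct[tag] / gold_count[tag] if gold_count[tag] else 0.0
        per_tag[tag] = (precision, recall)
    return accuracy, per_tag


# Toy data (hypothetical): five tokens, one tagging error.
gold = ["NOUN", "VERB", "NOUN", "PART", "NOUN"]
predicted = ["NOUN", "VERB", "VERB", "PART", "NOUN"]
accuracy, per_tag = tagging_scores(gold, predicted)
```

On this toy input, four of five tokens match, so accuracy is 0.8. The single NOUN mislabelled as VERB lowers NOUN recall and VERB precision while leaving the other scores at 1.0.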

Author(s):  
Ahed M. F. Al-Sbou

A huge amount of Arabic text is available online and needs to be organized. As a result, many natural language processing (NLP) applications are concerned with text organization. One of them is text classification (TC). TC makes it easier to deal with unorganized text by assigning it to suitable classes or labels. This paper is a survey of Arabic text classification. It also presents a comparison among different methods for classifying Arabic texts, where Arabic text is considered complex because of its vocabulary. Arabic is one of the richest languages in the world and has many linguistic bases. Research in Arabic language processing is scarce compared to English. As a result, these problems pose challenges for the classification and organization of Arabic text. TC helps users access documents or information that have already been classified into one or more classes or categories. In addition, document classification helps search engines reduce the number of documents to consider, which makes searching and matching against queries easier.


2019
Vol 5 (5)
pp. 212-215
Author(s):  
Abeer AlArfaj

Semantic relation extraction is an important component of ontologies that can support many applications, e.g., text mining, question answering, and information extraction. However, extracting semantic relations between concepts is not trivial and is one of the main challenges in the field of natural language processing (NLP). The Arabic language has complex morphological, grammatical, and semantic aspects since it is a highly inflectional and derivational language, which makes the task even more challenging. In this paper, we present a review of the state of the art for relation extraction from texts, addressing the progress and difficulties in this field. We discuss several aspects related to this task, considering both taxonomic and non-taxonomic relation extraction methods. The majority of relation extraction approaches combine statistical and linguistic techniques to extract semantic relations from text. We also give special attention to the state of the work on relation extraction from Arabic texts, which needs further progress.


Author(s):  
Tarek Kanan
Bilal Hawashin
Shadi Alzubi
Eyad Almaita
Ahmad Alkhatib
...  

Introduction: Stemming is an important preprocessing step in text classification and can contribute to increasing text classification accuracy. Although many works have proposed stemmers for the English language, few stemmers have been proposed for Arabic text. Arabic has gained increasing attention in recent decades, and there is a vital need to further improve Arabic text classification. Method: This work combined the recently proposed P-Stemmer with various classifiers to find the optimal classifier for the P-Stemmer in terms of Arabic text classification. As part of this work, a synthesized dataset was collected. Result: The experiments show that the use of the P-Stemmer has a positive effect on classification. The degree of improvement was classifier-dependent, which is reasonable as classifiers vary in their methodologies. Moreover, the experiments show that the best classifier with the P-Stemmer was NB. This is an interesting result, as this classifier is well known for its fast learning and classification time. Discussion: First, continuous improvement of the P-Stemmer through further optimization steps is necessary to improve Arabic text categorization. This can be done by combining more classifiers with the stemmer, by optimizing the other natural language processing steps, and by improving the set of stemming rules. Second, the lack of sufficient Arabic datasets, especially large ones, is still an issue. Conclusion: In this work, an improved P-Stemmer was proposed by combining its use with various classifiers. In order to evaluate its performance, and due to the lack of Arabic datasets, a novel Arabic dataset was synthesized from various online news pages. Next, the P-Stemmer was combined with Naïve Bayes, Random Forest, Support Vector Machines, K-Nearest Neighbor, and K-Star.
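The pipeline this abstract describes (stem, then classify) can be sketched in a few lines. The example below is only a minimal illustration under stated assumptions: `strip_prefix` is a hypothetical stand-in with a three-item prefix list and does not reproduce the P-Stemmer's actual rules, the classifier is a plain multinomial Naive Bayes with add-one smoothing, and the four training sentences are invented toy data rather than the dataset from the paper.

```python
import math
from collections import Counter, defaultdict

# Hypothetical prefix list for illustration only; NOT the P-Stemmer rule set.
PREFIXES = ("وال", "بال", "ال")


def strip_prefix(token):
    """Remove the first matching prefix if at least two characters remain."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 2:
            return token[len(prefix):]
    return token


def tokenize(text):
    """Split on whitespace and stem each token."""
    return [strip_prefix(tok) for tok in text.split()]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[label][tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy training data: two sports and two economy sentences.
train_texts = [
    "الفريق فاز بالمباراة",
    "اللاعب سجل هدف",
    "البنك رفع الفائدة",
    "السوق والاسهم ارتفعت",
]
train_labels = ["sports", "sports", "economy", "economy"]
model = NaiveBayes().fit(train_texts, train_labels)
```

Because stemming maps prefixed and unprefixed forms to the same token, a query such as `model.predict("الفائدة والبنك")` can match training words that appeared with different prefixes. That is the kind of effect the abstract attributes to stemming, though the size of any gain on real data depends on the stemmer and classifier used.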


2017
Vol 7 (1)
pp. 32-46
Author(s):
Nafaa Haffar
Mohsen Maraoui
Shadi Aljawarneh
Mohammed Bouhorma
Abdallah Altahan Alnuaimi
...  

Cloud e-learning systems for the Arabic language are relevant environments in many areas of training (teaching the Arabic language), but they also pose problems: their creation is tedious and costly in resources and time, and searching for information is difficult because of the increasing amount of information available and because indexing relies on static methods such as keyword search, which make the search process return irrelevant results. For this reason, a new indexing method is required. In this paper, a new Arabic text indexing approach is proposed, based on a new application profile of the LOM (Learning Object Metadata) metadata schema for the Arabic language. This profile includes the fields of the LOM standard, adds new fields for search information specific to the Arabic language, and meets the needs of a teacher. It relies throughout on natural language processing tools such as SAPA and AL-KHALIL.



2010
Vol 12 (1-2)
pp. 337-314
Author(s):  
ʿAbd Allāh Muḥammad al-Shāmī

The question of clarifying the meaning of a given Arabic text is a subtle one, especially as high literature texts can often be read in more than one way. Arabic is rich in figurative language and this can lead to variety in meaning, sometimes in ways that either adhere closely to or diverge far from the ‘original’ meaning. In order to understand a fine literary text in Arabic, one must have a comprehensive understanding of the issue of taʾwīl, and the concept that multiplicity of meaning does not necessarily lead to contradiction. This article surveys the opinions of various literary critics and scholars of balāgha on this issue with a brief discussion of the concepts of tafsīr and sharḥ, which sometimes overlap with taʾwīl.


2021
Vol 11 (15)
pp. 6851
Author(s):
Reema Thabit
Nur Izura Udzir
Sharifah Md Yasin
Aziah Asmawi
Nuur Alifah Roslan
...  

Protecting sensitive information transmitted via public channels is a significant issue faced by governments, militaries, organizations, and individuals. Steganography protects secret information by concealing it in a transferred object such as video, audio, image, text, network, or DNA. As text uses low bandwidth, it is commonly used by Internet users in their daily activities, resulting in a vast amount of text sent daily as social media posts and documents. Accordingly, text is an ideal object for steganography, since hiding a secret message in text makes it difficult for an attacker to detect the hidden message among the massive text content on the Internet. Text steganography exploits the characteristics of a language. Despite the richness of the Arabic language in linguistic characteristics, only a few studies have been conducted on Arabic text steganography. To draw further attention to the prospects of Arabic text steganography, this paper reviews the classifications of these methods from their inception. For analysis, this paper presents a comprehensive study based on the key evaluation criteria (i.e., capacity, invisibility, robustness, and security). It opens new areas for further research based on the trends in this field.


2014
Vol 40 (2)
pp. 469-510
Author(s):  
Khaled Shaalan

As more and more Arabic textual information becomes available through the Web in homes and businesses, via Internet and Intranet services, there is an urgent need for technologies and tools to process the relevant information. Named Entity Recognition (NER) is an Information Extraction task that has become an integral part of many other Natural Language Processing (NLP) tasks, such as Machine Translation and Information Retrieval. Arabic NER has begun to receive attention in recent years. The characteristics and peculiarities of Arabic, a member of the Semitic language family, make dealing with NER a challenge. The performance of an Arabic NER component affects the overall performance of the NLP system in a positive manner. This article attempts to describe and detail the recent increase in interest and progress made in Arabic NER research. The importance of the NER task is demonstrated, the main characteristics of the Arabic language are highlighted, and the aspects of standardization in annotating named entities are illustrated. Moreover, the different Arabic linguistic resources are presented and the approaches used in the Arabic NER field are explained. The features of common tools used in Arabic NER are described, and standard evaluation metrics are illustrated. In addition, the state of the art of Arabic NER research is reviewed. Finally, we present our conclusions. Throughout the presentation, illustrative examples are used for clarification.

