A Transformer-Based Neural Machine Translation Model for Arabic Dialects That Utilizes Subword Units

Sensors, 2021, Vol. 21 (19), pp. 6509
Author(s): Laith H. Baniata, Isaac K. E. Ampomah, Seyoung Park

Languages that allow free word order, such as Arabic dialects, are particularly difficult for neural machine translation (NMT) because they contain many rare words that NMT systems handle poorly. Since NMT systems operate with a fixed-size vocabulary, out-of-vocabulary words are represented by Unknown Word (UNK) tokens. In this work, rare words are encoded entirely as sequences of subword pieces using the Word-Piece Model. This paper introduces the first Transformer-based neural machine translation model for Arabic dialects that employs subword units. The proposed solution builds on the recently introduced Transformer model. Using subword units and a vocabulary shared between the Arabic dialect (the source language) and Modern Standard Arabic (the target language) improves the behavior of the encoder's multi-head attention sublayers, which capture the dependencies between the words of the input dialect sentence. Experiments are carried out on Levantine Arabic (LEV) to Modern Standard Arabic (MSA), Maghrebi Arabic (MAG) to MSA, Gulf Arabic to MSA, Nile Arabic to MSA, and Iraqi Arabic (IRQ) to MSA translation tasks. Extensive experiments confirm that the suggested model adequately addresses the unknown-word problem and improves translation quality from Arabic dialects to MSA.
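As a concrete illustration of the subword idea, the sketch below trains a WordPiece vocabulary on a single file holding both dialect and MSA text, so the two sides share one vocabulary. The file name, vocabulary size, and special tokens are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of WordPiece segmentation with a shared source/target
# vocabulary, using the HuggingFace "tokenizers" library.
from tokenizers import Tokenizer
from tokenizers.models import WordPiece
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import WordPieceTrainer

tokenizer = Tokenizer(WordPiece(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# Training on the concatenation of dialect (source) and MSA (target) text
# yields one vocabulary shared by both sides.
trainer = WordPieceTrainer(vocab_size=8000,
                           special_tokens=["[UNK]", "[PAD]", "[BOS]", "[EOS]"])
tokenizer.train(files=["dialect_plus_msa_corpus.txt"], trainer=trainer)  # hypothetical file

# A word missing from the vocabulary is no longer a single UNK token:
# it is decomposed into known subword pieces instead.
print(tokenizer.encode("example sentence").tokens)
```

With one shared vocabulary, identical subword pieces on the source and target side map to the same IDs, which is what allows the embedding tables of encoder and decoder to be tied.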

2018, Vol. 8 (12), pp. 2502
Author(s): Laith H. Baniata, Seyoung Park, Seong-Bae Park

Statistical machine translation for the Arabic language integrates external linguistic resources such as part-of-speech (POS) tags. This research presents a Bidirectional Long Short-Term Memory (Bi-LSTM) Conditional Random Field (CRF) segment-level POS tagger for Arabic dialects, which is integrated into a multitask neural machine translation (NMT) model. The proposed NMT solution builds on the recently introduced recurrent neural network encoder-decoder model. The study proposes and develops a unified multitask NMT model that shares an encoder between two tasks: the Arabic Dialect (AD) to Modern Standard Arabic (MSA) translation task and the segment-level POS tagging task. A shared layer and an invariant layer are also shared between the translation tasks. By training the translation and POS tagging tasks alternately, the proposed model can leverage the characteristic information of each task and improve translation quality from Arabic dialects to MSA. Experiments are conducted on Levantine Arabic (LA) to MSA and Maghrebi Arabic (MA) to MSA translation tasks, with segment-level POS tags for Arabic dialects exploited as an additional linguistic resource. The results suggest that the multitask learning approach improves both translation quality and POS tagger performance.
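A minimal PyTorch sketch of the shared-encoder setup follows. For brevity a per-token softmax head stands in for the paper's CRF layer, a single projection stands in for the full attentional decoder, and all dimensions are placeholders.

```python
# One Bi-LSTM encoder shared between translation and segment-level POS
# tagging, trained by alternating the two tasks batch by batch.
import torch
import torch.nn as nn

class SharedBiLSTMEncoder(nn.Module):
    def __init__(self, vocab, emb=128, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.bilstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)

    def forward(self, src):                      # src: (batch, time)
        out, _ = self.bilstm(self.embed(src))
        return out                               # (batch, time, 2*hid)

class MultiTaskNMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, n_tags, hid=128):
        super().__init__()
        self.encoder = SharedBiLSTMEncoder(src_vocab, hid=hid)
        self.pos_head = nn.Linear(2 * hid, n_tags)    # POS tagging task
        self.mt_head = nn.Linear(2 * hid, tgt_vocab)  # translation task (simplified)

    def forward(self, src, task):
        h = self.encoder(src)
        return self.pos_head(h) if task == "pos" else self.mt_head(h)

model = MultiTaskNMT(src_vocab=8000, tgt_vocab=8000, n_tags=24)
opt = torch.optim.Adam(model.parameters())
loss_fn = nn.CrossEntropyLoss()

# Alternate the tasks so each gradient step specializes one head while
# every step updates the shared encoder.
for step, task in enumerate(["mt", "pos"] * 2):
    src = torch.randint(0, 8000, (4, 10))        # dummy batch
    gold = torch.randint(0, 8000 if task == "mt" else 24, (4, 10))
    logits = model(src, task)
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), gold.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```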


2018, Vol. 2018, pp. 1-10
Author(s): Laith H. Baniata, Seyoung Park, Seong-Bae Park

In this research article, we study the problem of translating Arabic dialects to Modern Standard Arabic with a neural machine translation model. The proposed solution is inspired by the recently proposed recurrent neural network-based encoder-decoder model, which casts machine translation as a sequence learning problem. We propose a multitask learning (MTL) model that shares one decoder among all language pairs, while every source language has a separate encoder. The proposed model can be applied to limited volumes of data as well as extensive amounts of data. Experiments show that the proposed MTL model achieves higher translation quality than individually learned models.
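The layout described here, one encoder per source dialect feeding a decoder shared across language pairs, can be sketched as below. The dialect keys, the GRU encoders, and the reduction of the decoder to a projection are illustrative simplifications, not the paper's exact configuration.

```python
# Many encoders, one decoder: every source dialect gets its own encoder
# while the output side is reused across all language pairs.
import torch
import torch.nn as nn

class MultiSourceMTL(nn.Module):
    def __init__(self, vocab, dialects=("LEV", "MAG"), emb=128, hid=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.encoders = nn.ModuleDict({
            d: nn.GRU(emb, hid, batch_first=True) for d in dialects
        })
        self.shared_decoder = nn.Linear(hid, vocab)   # one decoder for all pairs

    def forward(self, src, dialect):
        out, _ = self.encoders[dialect](self.embed(src))  # route by source language
        return self.shared_decoder(out)

model = MultiSourceMTL(vocab=8000)
logits = model(torch.randint(0, 8000, (2, 7)), dialect="LEV")
print(logits.shape)  # torch.Size([2, 7, 8000])
```

Because the decoder always produces MSA, sharing it lets the low-resource pairs benefit from target-side statistics learned on every pair.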


2021
Author(s): El Moatez Billah Nagoudi, AbdelRahim Elmadany, Muhammad Abdul-Mageed

Author(s): Long Zhou, Jiajun Zhang, Chengqing Zong

Existing approaches to neural machine translation (NMT) generate the target-language sequence token by token, from left to right. This unidirectional decoding framework cannot make full use of the target-side future context that a right-to-left decoding pass would produce, and it therefore suffers from unbalanced outputs. In this paper, we introduce a synchronous bidirectional neural machine translation (SB-NMT) model that predicts its outputs using left-to-right and right-to-left decoding simultaneously and interactively, in order to leverage both history and future information at the same time. Specifically, we first propose a new algorithm that enables synchronous bidirectional decoding in a single model. We then present an interactive decoding model in which left-to-right (right-to-left) generation depends not only on its own previously generated outputs but also on the future context predicted by right-to-left (left-to-right) decoding. We extensively evaluate the proposed SB-NMT model on the large-scale NIST Chinese-English, WMT14 English-German, and WMT18 Russian-English translation tasks. Experimental results demonstrate that our model improves over the strong Transformer baseline by 3.92, 1.49, and 1.04 BLEU points, respectively, and achieves state-of-the-art performance on the Chinese-English and English-German translation tasks.
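The interactive idea can be sketched as follows: both directions advance in lockstep, and each step attends to its own history and to what the opposite direction has generated so far. The module layout and the weight sharing between directions are an illustrative reading, not the paper's exact architecture.

```python
# Toy sketch of synchronous bidirectional decoding: two passes generate
# simultaneously, each peeking at the other's partial output.
import torch
import torch.nn as nn

class InteractiveStep(nn.Module):
    def __init__(self, d=64, heads=4, vocab=1000):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.cross_dir_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.out = nn.Linear(d, vocab)

    def forward(self, own_hist, other_hist):
        h, _ = self.self_attn(own_hist, own_hist, own_hist)    # own history
        f, _ = self.cross_dir_attn(h, other_hist, other_hist)  # "future" info from the other pass
        return self.out((h + f)[:, -1])                        # next-token logits

d, vocab = 64, 1000
embed = nn.Embedding(vocab, d)
step = InteractiveStep(d, vocab=vocab)                 # weights shared across directions here
l2r = [torch.zeros(1, dtype=torch.long)]               # left-to-right prefix
r2l = [torch.zeros(1, dtype=torch.long)]               # right-to-left prefix

for _ in range(5):                                     # both directions advance in lockstep
    hl = embed(torch.stack(l2r, dim=1))
    hr = embed(torch.stack(r2l, dim=1))
    l2r.append(step(hl, hr).argmax(-1))
    r2l.append(step(hr, hl).argmax(-1))
```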


2013, Vol. 21 (3), pp. 477-495
Author(s): Aqil M. Azmi, Reham S. Almajed

In Modern Standard Arabic, texts are typically written without diacritical markings. The diacritics are important for clarifying the sense and meaning of words, and their absence may lead to ambiguity even for native speakers. Native readers usually disambiguate meaning through context; however, many Arabic applications, such as machine translation, text-to-speech, and information retrieval, are vulnerable to the lack of diacritics. The process of automatically restoring diacritical marks is called diacritization, or diacritic restoration. In this paper we discuss the properties of the Arabic language and the issues related to the missing diacritical markings, followed by a survey of recent algorithms developed to solve the diacritization problem. We also look at future trends for researchers working in this area.
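A small worked example makes the ambiguity concrete. The candidate readings below are a well-known textbook case, hand-picked for illustration rather than drawn from this survey's data.

```python
# Why missing diacritics are ambiguous: one bare form maps to several
# fully diacritized readings, and only context can decide between them.
candidates = {
    "علم": ["عِلْم (knowledge)", "عَلَم (flag)", "عَلِمَ (he knew)", "عَلَّمَ (he taught)"],
}

for reading in candidates["علم"]:
    print(reading)

# A diacritization system must pick one reading per word, typically by
# scoring each candidate against the surrounding sentence context.
```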


Author(s): Isaac Kojo Essel Ampomah, Sally McClean, Glenn Hawe

Self-attention-based encoder-decoder frameworks have drawn increasing attention in recent years. The self-attention mechanism generates contextual representations by attending to all tokens in the sentence. Despite improvements in performance, recent research argues that self-attention tends to concentrate on the global context and places less emphasis on the contextual information available within the local neighbourhood of each token. This work presents the Dual Contextual (DC) module, an extension of the conventional self-attention unit, which effectively leverages both local and global contextual information. The goal is to further improve the sentence-representation ability of the encoder and decoder subnetworks, thus enhancing the overall performance of the translation model. Experimental results on the WMT'14 English-German (En→De) and eight IWSLT translation tasks show that the DC module can further improve the translation performance of the Transformer model.
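Without reproducing the DC module's exact design, one plausible sketch of fusing local and global context is shown below: full self-attention supplies the global view, a banded mask restricts a second attention to a small token window, and a learned gate mixes the two. The gating scheme and window size are assumptions of this sketch.

```python
# Global attention plus window-masked local attention, fused by a gate.
import torch
import torch.nn as nn

def local_band_mask(T, w):
    idx = torch.arange(T)
    return (idx[None, :] - idx[:, None]).abs() > w   # True = position blocked

class DualContextualAttention(nn.Module):
    def __init__(self, d=256, heads=4, window=3):
        super().__init__()
        self.global_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.local_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.gate = nn.Linear(2 * d, d)
        self.window = window

    def forward(self, x):                            # x: (batch, time, d)
        g, _ = self.global_attn(x, x, x)             # global context
        mask = local_band_mask(x.size(1), self.window).to(x.device)
        l, _ = self.local_attn(x, x, x, attn_mask=mask)  # local neighbourhood only
        z = torch.sigmoid(self.gate(torch.cat([g, l], dim=-1)))
        return z * g + (1 - z) * l                   # gated fusion of the two views

x = torch.randn(2, 9, 256)
print(DualContextualAttention()(x).shape)            # torch.Size([2, 9, 256])
```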


2020, Vol. 2020, pp. 1-10
Author(s): Thien Nguyen, Hoai Le, Van-Huy Pham

End-to-end neural machine translation does not require specialized knowledge of the investigated language pair to build an effective system. On the other hand, feature engineering has proven vital in other artificial intelligence fields, such as speech recognition and computer vision. Inspired by work in those fields, in this paper we propose a novel feature-based translation model that modifies the state-of-the-art Transformer model. Specifically, the encoder of the modified Transformer takes as input combinations of linguistic features, comprising the lemma, dependency label, part-of-speech tag, and morphological label, instead of source words. Experimental results on the Russian-Vietnamese language pair show that the proposed feature-based Transformer improves over the strongest baseline Transformer translation model by an impressive 4.83 BLEU. In addition, analysis of the experiments reveals that human judgments of the translation results strongly confirm the automatic evaluation. Our model could be useful for building systems that translate from a highly inflectional language into a non-inflectional language.
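The feature-based input can be sketched as embeddings over the four feature vocabularies combined into a single encoder input. Summation is one reasonable combination used in this sketch, and all vocabulary sizes are placeholders; the paper's exact combination scheme may differ.

```python
# Encoder input built from linguistic features instead of surface words:
# each token contributes a lemma, POS tag, dependency label, and
# morphological label, embedded separately and summed.
import torch
import torch.nn as nn

class LinguisticFeatureEmbedding(nn.Module):
    def __init__(self, n_lemma, n_pos, n_dep, n_morph, d=512):
        super().__init__()
        self.lemma = nn.Embedding(n_lemma, d)
        self.pos = nn.Embedding(n_pos, d)
        self.dep = nn.Embedding(n_dep, d)
        self.morph = nn.Embedding(n_morph, d)

    def forward(self, lemma_ids, pos_ids, dep_ids, morph_ids):
        return (self.lemma(lemma_ids) + self.pos(pos_ids)
                + self.dep(dep_ids) + self.morph(morph_ids))

feat = LinguisticFeatureEmbedding(n_lemma=20000, n_pos=20, n_dep=40, n_morph=300)
ids = lambda n: torch.randint(0, n, (2, 6))          # dummy feature IDs
x = feat(ids(20000), ids(20), ids(40), ids(300))     # (2, 6, 512)
print(x.shape)                                       # feeds a standard Transformer encoder
```

Replacing inflected surface forms with lemma-plus-morphology features shrinks the effective source vocabulary, which is why the approach helps most when translating out of a highly inflectional language.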

