Neural Machine Translation Using Attention Mechanism

Attention-based encoder-decoder models have superseded conventional techniques due to their unmatched performance on many neural machine translation problems. Usually, the encoder and decoder are two recurrent neural networks, and the decoder is directed to focus on relevant parts of the source language by an attention mechanism. This data-driven approach leads to generic and scalable solutions with no reliance on hand-crafted features. To the best of our knowledge, none of the modern machine translation approaches has been applied to the problem of Urdu machine transliteration. Ours is the first attempt to apply a deep neural network-based encoder-decoder with an attention mechanism to this problem using a Roman-Urdu and Urdu parallel corpus. To this end, we present (i) the first Roman-Urdu to Urdu parallel corpus of 1.1 million sentences, (ii) three state-of-the-art encoder-decoder models, and (iii) a detailed empirical analysis of these three models on the Roman-Urdu to Urdu parallel corpus. Overall, the attention-based model gives state-of-the-art performance, with a benchmark BLEU score of 70. Our qualitative experimental evaluation shows that our models generate coherent transliterations that are grammatically and logically correct.
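As a rough illustration of how a decoder can be directed to focus on relevant source positions, the following minimal sketch computes attention weights and a context vector for a single decoder step. The dot-product scoring function, the toy vectors, and the names `softmax` and `attend` are assumptions made for illustration; the abstract does not state which attention variant the authors use.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Scoring is a plain dot product (an assumption; additive scoring
    is an equally common choice).
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h))
              for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the attention-weighted sum of encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example: three source positions with 2-dimensional hidden states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy case the first and third source positions receive equal, larger weights than the second, because their dot products with the decoder state are higher.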


Analyses and Modeling of Neural Machine Translation for English-to-Khasi

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b3175.079220 ◽

2020 ◽

Vol 9 (2) ◽

pp. 115-118

Keyword(s):

Deep Learning ◽

Machine Translation ◽

Statistical Machine Translation ◽

Language Barrier ◽

Attention Mechanism ◽

Training Dataset ◽

Learning Approach ◽

Neural Machine Translation ◽

Community Or ◽

North East

Language barriers are a common issue faced by people who move from one community or group to another. Statistical machine translation has addressed this issue to a certain extent by formulating models that translate text from one language to another. Statistical machine translation has come a long way, but it is limited in translating words that belong to an entirely different context that is not available in the training dataset. This has paved the way for neural machine translation (NMT), a deep learning approach to sequence-to-sequence translation. Khasi is a language widely spoken in Meghalaya, a state in north-east India. It is wide and unexplored. In this paper, we discuss the modeling and analysis of a base NMT model and an NMT model using an attention mechanism for English to Khasi.


Guiding Attention in Sequence-to-Sequence Models for Dialogue Act Prediction

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6259 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7594-7601

Author(s):

Pierre Colombo ◽

Emile Chapuis ◽

Matteo Manica ◽

Emmanuel Vignon ◽

Giovanna Varni ◽

...

Keyword(s):

Machine Translation ◽

Random Fields ◽

Conditional Random Fields ◽

State Of The Art ◽

The State ◽

Attention Mechanism ◽

Accuracy Score ◽

Beam Search ◽

Conversational Agents ◽

Neural Machine Translation

The task of predicting dialog acts (DAs) based on conversational dialog is a key component in the development of conversational agents. Accurately predicting DAs requires precise modeling of both the conversation and the global tag dependencies. We leverage seq2seq approaches widely adopted in Neural Machine Translation (NMT) to improve the modeling of tag sequentiality. Seq2seq models are known to learn complex global dependencies, while currently proposed approaches using linear conditional random fields (CRF) only model local tag dependencies. In this work, we introduce a seq2seq model tailored for DA classification using a hierarchical encoder, a novel guided attention mechanism, and beam search applied to both training and inference. Compared to the state of the art, our model does not require handcrafted features and is trained end-to-end. Furthermore, the proposed approach achieves an unmatched accuracy score of 85% on SwDA and a state-of-the-art accuracy score of 91.6% on MRDA.
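To make the role of beam search in decoding a tag sequence more concrete, here is a minimal sketch of beam search at inference time. The function `beam_search`, the callback `toy_model`, the tag names, and the probabilities are all hypothetical and are not taken from the paper; how the authors apply beam search during training is not covered here.

```python
import math

def beam_search(next_log_probs, length, beam_size):
    """Keep the `beam_size` highest-scoring tag prefixes at each step.

    `next_log_probs(prefix)` is a hypothetical callback that returns a
    dict mapping each candidate tag to its log-probability given the prefix.
    """
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(length):
        candidates = []
        for prefix, score in beams:
            for tag, lp in next_log_probs(prefix).items():
                candidates.append((prefix + (tag,), score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    return beams

def toy_model(prefix):
    """Toy tag model: an answer 'a' is very likely right after a question 'q'."""
    if not prefix:
        return {"q": math.log(0.6), "s": math.log(0.4)}
    if prefix[-1] == "q":
        return {"a": math.log(0.9), "s": math.log(0.1)}
    return {"q": math.log(0.5), "s": math.log(0.5)}

beams = beam_search(toy_model, length=2, beam_size=2)
best_tags, best_score = beams[0]
```

Because each hypothesis is scored by its cumulative log-probability, the search ranks whole tag sequences rather than individual tags, which is the kind of global dependency the abstract contrasts with local CRF modeling.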


PhraseAttn: Dynamic Slot Capsule Networks for phrase representation in Neural Machine Translation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-212101 ◽

2021 ◽

pp. 1-8

Author(s):

Binh Nguyen ◽

Binh Le ◽

Long H.B. Nguyen ◽

Dien Dinh

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Vital Role ◽

Attention Mechanism ◽

Neural Machine Translation ◽

Translation Model ◽

Word Representation

Word representation plays a vital role in most Natural Language Processing systems, especially for Neural Machine Translation. It tends to capture the semantics of and similarity between individual words well, but struggles to represent the meaning of phrases or multi-word expressions. In this paper, we investigate a method to generate and use phrase information in a translation model. To generate phrase representations, a Primary Phrase Capsule network is first employed and then iteratively enhanced with a Slot Attention mechanism. Experiments on the IWSLT English to Vietnamese, French, and German datasets show that our proposed method consistently outperforms the baseline Transformer and attains competitive results against the scaled Transformer with half as many parameters.


English to Bengali Neural Machine Translation Using Global Attention Mechanism

Emerging Technologies in Data Mining and Information Security - Lecture Notes in Networks and Systems ◽

10.1007/978-981-15-9774-9_35 ◽

2021 ◽

pp. 359-369

Author(s):

Sheikh Abujar ◽

Abu Kaisar Mohammad Masum ◽

Abhishek Bhattacharya ◽

Soumi Dutta ◽

Syed Akhter Hossain

Keyword(s):

Machine Translation ◽

Attention Mechanism ◽

Neural Machine Translation


A comparative study of neural machine translation models for Turkish language

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-211453 ◽

2021 ◽

pp. 1-11

Author(s):

Özgür Özdemir ◽

Emre Salih Akın ◽

Rıza Velioğlu ◽

Tuğba Dalyan

Keyword(s):

Machine Translation ◽

Computational Linguistics ◽

Attention Mechanism ◽

Neural Machine Translation ◽

English Translations ◽

Benchmark Datasets ◽

Important Challenge ◽

Morphologically Rich Languages ◽

Transformer Model ◽

Ted Talks

Machine translation (MT) is an important challenge in the field of Computational Linguistics. In this study, we conducted neural machine translation (NMT) experiments on two different architectures. First, a Sequence to Sequence (Seq2Seq) architecture, along with a variation that utilizes an attention mechanism, is applied to the translation task. Second, an architecture that is fully based on the self-attention mechanism, namely the Transformer, is employed to perform a comprehensive comparison. In addition, the contributions of Byte Pair Encoding (BPE) and Gumbel Softmax distributions are examined for both architectures. The experiments are conducted on two different datasets: TED Talks, one of the popular benchmark datasets for NMT, especially among morphologically rich languages like Turkish, and the WMT18 News dataset, provided by the Third Conference on Machine Translation (WMT) for shared tasks on various aspects of machine translation. The evaluation of the Turkish-to-English translations demonstrates that the Transformer model with the combination of BPE and Gumbel Softmax achieved a BLEU score of 22.4 on TED Talks and 38.7 on the WMT18 News dataset. The empirical results indicate that using the Gumbel Softmax distribution improves the quality of translations for both architectures.
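For readers unfamiliar with the Gumbel Softmax distribution, the sketch below draws one relaxed sample from a categorical distribution by adding Gumbel noise to the log-probabilities and applying a temperature-scaled softmax. The function name `gumbel_softmax`, the temperature value, and the toy probabilities are assumptions for illustration; the abstract does not say where in the architectures the authors apply it.

```python
import math
import random

def gumbel_softmax(log_probs, temperature, rng):
    """Draw one relaxed (soft one-hot) sample from a categorical distribution.

    Gumbel(0, 1) noise is added to each log-probability; the result is
    divided by the temperature and passed through a softmax.
    """
    noisy = []
    for lp in log_probs:
        u = max(rng.random(), 1e-12)      # guard against log(0)
        g = -math.log(-math.log(u))       # Gumbel(0, 1) noise
        noisy.append((lp + g) / temperature)
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)                     # fixed seed for repeatability
log_probs = [math.log(p) for p in (0.7, 0.2, 0.1)]
sample = gumbel_softmax(log_probs, temperature=0.5, rng=rng)
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures make it closer to uniform.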


Tibetan-Chinese Neural Machine Translation Combining Attention Mechanism

Journal of Physics Conference Series ◽

10.1088/1742-6596/1607/1/012001 ◽

2020 ◽

Vol 1607 ◽

pp. 012001

Author(s):

Tao Jiang ◽

Hao Sun ◽

Yu Gang Dai ◽

Ding Liu

Keyword(s):

Machine Translation ◽

Attention Mechanism ◽

Neural Machine Translation


Dual contextual module for neural machine translation

Machine Translation ◽

10.1007/s10590-021-09282-0 ◽

2021 ◽

Author(s):

Isaac Kojo Essel Ampomah ◽

Sally McClean ◽

Glenn Hawe

Keyword(s):

Machine Translation ◽

Contextual Information ◽

Attention Mechanism ◽

The Self ◽

Experimental Results ◽

Neural Machine Translation ◽

Translation Model ◽

Global Context ◽

Overall Performance ◽

Transformer Model

Self-attention-based encoder-decoder frameworks have drawn increasing attention in recent years. The self-attention mechanism generates contextual representations by attending to all tokens in the sentence. Despite improvements in performance, recent research argues that the self-attention mechanism tends to concentrate more on the global context with less emphasis on the contextual information available within the local neighbourhood of tokens. This work presents the Dual Contextual (DC) module, an extension of the conventional self-attention unit, to effectively leverage both the local and global contextual information. The goal is to further improve the sentence representation ability of the encoder and decoder subnetworks, thus enhancing the overall performance of the translation model. Experimental results on WMT’14 English-German (En→De) and eight IWSLT translation tasks show that the DC module can further improve the translation performance of the Transformer model.
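The difference between global and local context can be illustrated with a small self-attention sketch in which an optional window restricts each token to its neighbours. This is not the DC module itself: the function `self_attention`, the `window` parameter, and the use of raw token vectors as queries, keys, and values (with no learned projections) are simplifying assumptions.

```python
import math

def self_attention(tokens, window=None):
    """Scaled dot-product self-attention over a list of token vectors.

    With window=None every token attends to all tokens (global context).
    With an integer window, token i attends only to positions j where
    |i - j| <= window (local context).
    """
    n = len(tokens)
    dim = len(tokens[0])
    outputs = []
    for i in range(n):
        positions = [j for j in range(n)
                     if window is None or abs(i - j) <= window]
        scores = [sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(dim)
                  for j in positions]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append([sum(w * tokens[j][k] for w, j in zip(weights, positions))
                        for k in range(dim)])
    return outputs

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
global_out = self_attention(tokens)            # attends to the whole sentence
local_out = self_attention(tokens, window=1)   # attends to immediate neighbours
self_only = self_attention(tokens, window=0)   # each token sees only itself
```

A module that combines both kinds of context would merge representations like `global_out` and `local_out`; how the DC module does this is described in the paper, not here.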


Implementing neural machine translation with bi-directional GRU and attention mechanism on FPGAs using HLS

Proceedings of the 24th Asia and South Pacific Design Automation Conference on - ASPDAC '19 ◽

10.1145/3287624.3287717 ◽

2019 ◽

Cited By ~ 7

Author(s):

Qin Li ◽

Xiaofan Zhang ◽

JinJun Xiong ◽

Wen-mei Hwu ◽

Deming Chen

Keyword(s):

Machine Translation ◽

Attention Mechanism ◽

Neural Machine Translation


Generating Alignments Using Target Foresight in Attention-Based Neural Machine Translation

Prague Bulletin of Mathematical Linguistics ◽

10.1515/pralin-2017-0006 ◽

2017 ◽

Vol 108 (1) ◽

pp. 27-36

Author(s):

Jan-Thorsten Peter ◽

Arne Nix ◽

Hermann Ney

Keyword(s):

Target Word ◽

Machine Translation ◽

Explicit Knowledge ◽

Attention Mechanism ◽

Target Information ◽

High Quality ◽

Neural Machine Translation ◽

Translation Quality ◽

Successful Approach

Neural machine translation (NMT) has shown large improvements in recent years. The currently most successful approach in this area relies on the attention mechanism, which is often interpreted as an alignment, even though it is computed without explicit knowledge of the target word. This limitation is the most likely reason that the quality of attention-based alignments is inferior to the quality of traditional alignment methods. Guided alignment training has shown that alignments are still capable of improving translation quality. In this work, we propose an extension of the attention-based NMT model that introduces target information into the attention mechanism to produce high-quality alignments. In comparison to the conventional attention-based alignments, our model halves the AER with an absolute improvement of 19.1% AER. Compared to GIZA++ it shows an absolute improvement of 2.0% AER.
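The alignment error rate (AER) reported above is commonly defined over a predicted alignment A, a set of sure gold links S, and a set of possible gold links P (with S contained in P) as AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). The sketch below computes it on toy data; the function name and the example links are illustrative assumptions, and the paper's exact evaluation setup may differ.

```python
def alignment_error_rate(predicted, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).

    Each argument is a set of (source_index, target_index) links.
    `possible` is expected to include every link in `sure`.
    """
    possible = possible | sure  # enforce that S is a subset of P
    matched = len(predicted & sure) + len(predicted & possible)
    return 1.0 - matched / (len(predicted) + len(sure))

sure = {(0, 0), (1, 1)}
possible = sure | {(2, 1)}

# A prediction covering all sure links and only possible links scores 0.
perfect = alignment_error_rate({(0, 0), (1, 1), (2, 1)}, sure, possible)

# A prediction with one correct link and one spurious link.
partial = alignment_error_rate({(0, 0), (2, 2)}, sure, possible)
```

Lower is better: an absolute improvement of 19.1% AER means the value of this metric dropped by 0.191.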
