An Instance of Constructing Conceptual Model of Statistical Machine Translation for Processing Natural Language

2014 ◽  
Vol 926-930 ◽  
pp. 3645-3648
Author(s):  
Yan Di

This paper is concerned with the basic framework of example-based machine translation. Key issues such as bilingual alignment, the similarity measure between an input sentence and a stored example, and template acquisition are introduced.
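The retrieval step of such a system hinges on the similarity measure between the input sentence and each stored example. A minimal sketch of one plausible measure (word-level Jaccard overlap; the abstract does not specify which metric is used, so the choice and the helper names below are illustrative):

```python
# Illustrative sketch: retrieve the closest stored example for an input
# sentence using word-overlap (Jaccard) similarity. The metric and helper
# names are assumptions, not taken from the paper.

def similarity(input_sentence: str, example_source: str) -> float:
    """Jaccard similarity over word sets; 1.0 means identical vocabulary."""
    a = set(input_sentence.lower().split())
    b = set(example_source.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def retrieve_best_example(input_sentence: str, examples):
    """examples: list of (source, target) pairs from the bilingual example base."""
    return max(examples, key=lambda pair: similarity(input_sentence, pair[0]))

# Toy example base
examples = [("the cat sleeps", "le chat dort"),
            ("a dog barks", "un chien aboie")]
print(retrieve_best_example("the cat runs", examples))  # ('the cat sleeps', 'le chat dort')
```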

Author(s):  
Zakaria El Maazouzi ◽  
Badr Eddine EL Mohajir ◽  
Mohammed Al Achhab

Achieving high accuracy in automatic translation has been a challenging goal for machine translation researchers for decades, so the field has constantly explored new ways to improve translation quality. Automatic translation, a key application in the natural language processing domain, has given rise to many approaches, notably statistical machine translation and, more recently, neural machine translation, which have greatly improved translation quality, especially for Latin languages; for some language pairs they have even made it possible to approach human translation quality. In this paper, we present a survey of the state of the art in statistical translation: we describe the different existing methodologies and review recent research studies, pointing out the main strengths and limitations of the different approaches.


2020 ◽  
Vol 34 (05) ◽  
pp. 9258-9265
Author(s):  
Rongxiang Weng ◽  
Haoran Wei ◽  
Shujian Huang ◽  
Heng Yu ◽  
Lidong Bing ◽  
...  

Transformer, based on the encoder-decoder framework, has achieved state-of-the-art performance on several natural language generation tasks. The encoder maps the words in the input sentence into a sequence of hidden states, which are then fed into the decoder to generate the output sentence. These hidden states usually correspond to the input words and focus on capturing local information. However, the global (sentence-level) information is seldom explored, leaving room for improvement in generation quality. In this paper, we propose a novel global representation enhanced Transformer (GRET) to explicitly model global representation in the Transformer network. Specifically, in the proposed model, an external state is generated for the global representation from the encoder. The global representation is then fused into the decoder during the decoding process to improve generation quality. We conduct experiments on two text generation tasks: machine translation and text summarization. Experimental results on four WMT machine translation tasks and the LCSTS text summarization task demonstrate the effectiveness of the proposed approach for natural language generation.
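As a concrete (and deliberately simplified) illustration of fusing a global representation into the decoder, one could pool the encoder states into a single sentence vector and gate it into each decoder state. The sketch below is a stand-in for the idea only; the paper's actual mechanism for producing and fusing the global state differs.

```python
# Simplified sketch of global-representation fusion (not the paper's exact
# GRET architecture): mean-pool the encoder states into a sentence vector
# and gate it into every decoder state.
import torch
import torch.nn as nn

class GlobalFusion(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, encoder_states, decoder_states):
        # encoder_states: (batch, src_len, d_model); decoder_states: (batch, tgt_len, d_model)
        global_repr = encoder_states.mean(dim=1, keepdim=True)            # (batch, 1, d_model)
        global_repr = global_repr.expand(-1, decoder_states.size(1), -1)  # broadcast over target steps
        g = torch.sigmoid(self.gate(torch.cat([decoder_states, global_repr], dim=-1)))
        return g * decoder_states + (1 - g) * global_repr                 # fused decoder states

# Toy usage
fusion = GlobalFusion(d_model=8)
enc = torch.randn(2, 5, 8)     # 2 source sentences of length 5
dec = torch.randn(2, 7, 8)     # 2 target prefixes of length 7
print(fusion(enc, dec).shape)  # torch.Size([2, 7, 8])
```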


2015 ◽  
Author(s):  
Karan Aggarwal ◽  
Mohammad Salameh ◽  
Abram Hindle

In this paper, we use statistical machine translation to convert Python 2 code to Python 3 code. We train on data from two projects and achieve a high BLEU score. We also investigate cross-project training and testing, analyzing the errors to identify how they differ from the within-project case. The work is a pilot study on modeling programming languages as natural language, building translation models along the same lines as for natural languages. It can be extended to translate between versions of a programming language or across programming languages.
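For the evaluation step, BLEU can be computed over tokenised code exactly as over natural-language sentences. A minimal sketch using NLTK with toy tokens (not the authors' data, tokeniser or scores):

```python
# Illustrative BLEU computation on code tokens; the token streams below are
# toy values, not the paper's Python 2/3 corpora.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "print ( 'hello world' )".split()   # reference Python 3 tokens
candidate = "print ( 'hello world' )".split()   # tokens produced by the SMT system

score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```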


2013 ◽  
Vol 347-350 ◽  
pp. 3262-3266
Author(s):  
Ai Ling Wang

Machine translation (MT) is one of the core applications of natural language processing and an important branch of artificial intelligence research; statistical methods have become the mainstream of machine translation. This paper presents a comparative analysis of statistical translation models built on large-scale corpora: it discusses word-based, phrase-based and syntax-based machine translation methods in turn, summarizes the factors involved in evaluating machine translation, and analyzes machine translation evaluation methods.
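As a reference point for the word- and phrase-based models discussed, the classical noisy-channel formulation of statistical machine translation (standard textbook form, not a formula taken from this paper) is:

```latex
% Noisy-channel formulation of SMT: f is the source sentence,
% e ranges over candidate target sentences.
\hat{e} = \operatorname*{arg\,max}_{e} P(e \mid f)
        = \operatorname*{arg\,max}_{e} P(f \mid e)\, P(e)
% P(f | e): translation model estimated from the parallel corpus
% P(e):     target-side language model
```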


Author(s):  
Arun Babhulgaonkar ◽  
Shefali Sonavane

Hindi is the national language of India. However, most government records, resolutions, news, and so on are documented in English, which remote villagers may not understand. This motivates the development of an automatic translation system from English to Hindi. Machine translation is the process of translating text in one natural language into another natural language using a computer system. The grammatical structure of Hindi is considerably more complex than that of English, and this structural difference makes it difficult to achieve good-quality translation. In this paper, a phrase-based statistical machine translation (PBSMT) approach is used for translation. The translation, reordering and language models are the main working components of a PBSMT system. This paper evaluates the impact of various combinations of these PBSMT system parameters on automated English-to-Hindi translation quality. The freely available n-gram-based BLEU metric and the TER metric are used to evaluate the results.
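To make the role of these components concrete: phrase-based systems typically score each translation hypothesis with a weighted (log-linear) combination of the translation, reordering and language-model features. The sketch below uses made-up feature values and weights purely for illustration:

```python
# Illustrative log-linear scoring of one PBSMT hypothesis; all numbers are
# invented for the example, not taken from the paper's English-Hindi system.
import math

def loglinear_score(features: dict, weights: dict) -> float:
    """Weighted sum of log feature scores for a single translation hypothesis."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

hypothesis_features = {
    "translation_model": 0.40,   # phrase-table probability
    "reordering_model": 0.70,    # reordering/distortion probability
    "language_model": 0.05,      # language-model probability of the Hindi output
}
weights = {"translation_model": 1.0, "reordering_model": 0.6, "language_model": 0.8}

print(loglinear_score(hypothesis_features, weights))
```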


Author(s):  
Syed Abdul Basit Andrabi et al.

Machine translation is an application of natural language processing. Humans use natural languages to communicate with one another, whereas programming languages mediate communication between humans and computers. NLP is the field that involves a broad set of techniques for the analysis, manipulation and automatic generation of human (natural) languages with the help of computers. In the present information age it is essential to give people access to information for their development, and equal emphasis must be placed on removing the language barrier between different sections of society. NLP strives to bridge this language barrier through machine translation, in which one natural language is transformed into another with the aid of computers. The early years of the field were dedicated to the development of rule-based systems; later, as computational power increased, there was a transition towards statistical machine translation. The aim of machine translation is that the meaning of the text should be preserved during translation. This research paper analyses the machine translation approaches used for resource-poor languages and identifies the needs and challenges that researchers face. It also reviews the machine translation systems that are available for such resource-poor languages.


2012 ◽  
Vol 98 (1) ◽  
pp. 5-24 ◽  
Author(s):  
Juan Pino ◽  
Aurelien Waite ◽  
William Byrne

Simple and Efficient Model Filtering in Statistical Machine Translation

Data availability and distributed computing techniques have allowed statistical machine translation (SMT) researchers to build larger models. However, decoders need to retrieve information from these models efficiently in order to translate an input sentence or a set of input sentences. We introduce an easy-to-implement, general-purpose solution to this problem: we store SMT models as a set of key-value pairs in an HFile. We apply this strategy to two specific tasks: test-set filtering of hierarchical phrase-based rules and n-gram count filtering for language model lattice rescoring. We compare our approach to alternative strategies and show that its trade-offs in terms of speed, memory and simplicity are competitive.
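The idea behind test-set filtering is to keep only those model entries whose key (the source side of a rule or phrase) can occur in the sentences to be translated. A toy sketch of that lookup, with an ordinary Python dictionary standing in for the HFile key-value store:

```python
# Conceptual sketch of test-set model filtering; a dict stands in for the
# HFile, and the phrase table below is a toy example.
phrase_table = {
    "the house": [("la maison", 0.6), ("la casa", 0.1)],
    "green":     [("verte", 0.8)],
    "blue car":  [("voiture bleue", 0.7)],
}

test_sentences = ["the house is green"]

def source_phrases(sentence: str, max_len: int = 3):
    """Enumerate all n-grams of a test sentence up to max_len words."""
    words = sentence.split()
    for i in range(len(words)):
        for j in range(i + 1, min(i + max_len, len(words)) + 1):
            yield " ".join(words[i:j])

needed = {p for s in test_sentences for p in source_phrases(s)}
filtered = {src: rules for src, rules in phrase_table.items() if src in needed}
print(filtered)   # only the entries useful for the test set survive
```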


2018 ◽  
Vol 5 (1) ◽  
pp. 37-45
Author(s):  
Darryl Yunus Sulistyan

Machine translation is a system that automatically translates sentences in one language into another particular language. This paper aims to test the effectiveness of a newer machine translation model, factored machine translation. We compare the performance of an unfactored system, our baseline, against the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus, with the freely available MOSES toolkit. We found that the unfactored model scored over 24 BLEU and outperformed the factored models, which scored below 24 BLEU in all cases. In terms of the number of words translated, however, all of the factored models outperformed the unfactored model.
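For readers unfamiliar with factored models: in the Moses factored setup each surface word is annotated with additional factors (such as lemma and part of speech), conventionally joined with "|" in the training and input data. A toy sketch of that representation (the German tokens and tags are invented, not the paper's Europarl preprocessing):

```python
# Illustrative Moses-style factored tokens: word|lemma|POS per token.
def to_factored(tokens):
    """tokens: iterable of (word, lemma, pos) triples -> factored sentence string."""
    return " ".join(f"{word}|{lemma}|{pos}" for word, lemma, pos in tokens)

sentence = [("Häuser", "Haus", "NN"), ("sind", "sein", "VAFIN"), ("teuer", "teuer", "ADJD")]
print(to_factored(sentence))
# Häuser|Haus|NN sind|sein|VAFIN teuer|teuer|ADJD
```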

