An Instance of Constructing Conceptual Model of Statistical Machine Translation for Processing Natural Language

2014 ◽  
Vol 926-930 ◽  
pp. 3645-3648
Author(s):  
Yan Di

This paper is concerned with the basic framework of example-based machine translation. Key issues such as bilingual alignment, the similarity measure between an input sentence and a stored example, and template acquisition are introduced.
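The retrieval step of such a system hinges on the similarity measure between the input sentence and each stored example. A minimal sketch of one plausible measure (word-level Jaccard overlap; the abstract does not specify which metric is used, so the choice and the helper names below are illustrative):

```python
# Illustrative sketch: retrieve the closest stored example for an input
# sentence using word-overlap (Jaccard) similarity. The metric and helper
# names are assumptions, not taken from the paper.

def similarity(input_sentence: str, example_source: str) -> float:
    """Jaccard similarity over word sets; 1.0 means identical vocabulary."""
    a = set(input_sentence.lower().split())
    b = set(example_source.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def retrieve_best_example(input_sentence: str, examples):
    """examples: list of (source, target) pairs from the bilingual example base."""
    return max(examples, key=lambda pair: similarity(input_sentence, pair[0]))

# Toy example base
examples = [("the cat sleeps", "le chat dort"),
            ("a dog barks", "un chien aboie")]
print(retrieve_best_example("the cat runs", examples))  # ('the cat sleeps', 'le chat dort')
```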

Author(s):  
Zakaria El Maazouzi ◽  
Badr Eddine EL Mohajir ◽  
Mohammed Al Achhab

Achieving high accuracy in automatic translation has been a challenging goal for machine translation researchers for decades, so the field has constantly explored new ways to improve translation quality. Automatic translation, a key application in the natural language processing domain, has given rise to many approaches, notably statistical machine translation and, more recently, neural machine translation, which have greatly improved translation quality, especially for Latin languages; for some language pairs they have even made it possible to approach human translation quality. In this paper, we present a survey of the state of the art in statistical translation: we describe the different existing methodologies and review recent research studies, pointing out the main strengths and limitations of the different approaches.


2020 ◽  
Vol 34 (05) ◽  
pp. 9258-9265
Author(s):  
Rongxiang Weng ◽  
Haoran Wei ◽  
Shujian Huang ◽  
Heng Yu ◽  
Lidong Bing ◽  
...  

Transformer, based on the encoder-decoder framework, has achieved state-of-the-art performance on several natural language generation tasks. The encoder maps the words in the input sentence into a sequence of hidden states, which are then fed into the decoder to generate the output sentence. These hidden states usually correspond to the input words and focus on capturing local information. However, the global (sentence-level) information is seldom explored, leaving room for improvement in generation quality. In this paper, we propose a novel global representation enhanced Transformer (GRET) to explicitly model global representation in the Transformer network. Specifically, in the proposed model, an external state is generated for the global representation from the encoder. The global representation is then fused into the decoder during the decoding process to improve generation quality. We conduct experiments on two text generation tasks: machine translation and text summarization. Experimental results on four WMT machine translation tasks and the LCSTS text summarization task demonstrate the effectiveness of the proposed approach for natural language generation.
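As a concrete (and deliberately simplified) illustration of fusing a global representation into the decoder, one could pool the encoder states into a single sentence vector and gate it into each decoder state. The sketch below is a stand-in for the idea only; the paper's actual mechanism for producing and fusing the global state differs.

```python
# Simplified sketch of global-representation fusion (not the paper's exact
# GRET architecture): mean-pool the encoder states into a sentence vector
# and gate it into every decoder state.
import torch
import torch.nn as nn

class GlobalFusion(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, encoder_states, decoder_states):
        # encoder_states: (batch, src_len, d_model); decoder_states: (batch, tgt_len, d_model)
        global_repr = encoder_states.mean(dim=1, keepdim=True)            # (batch, 1, d_model)
        global_repr = global_repr.expand(-1, decoder_states.size(1), -1)  # broadcast over target steps
        g = torch.sigmoid(self.gate(torch.cat([decoder_states, global_repr], dim=-1)))
        return g * decoder_states + (1 - g) * global_repr                 # fused decoder states

# Toy usage
fusion = GlobalFusion(d_model=8)
enc = torch.randn(2, 5, 8)     # 2 source sentences of length 5
dec = torch.randn(2, 7, 8)     # 2 target prefixes of length 7
print(fusion(enc, dec).shape)  # torch.Size([2, 7, 8])
```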


2015 ◽  
Author(s):  
Karan Aggarwal ◽  
Mohammad Salameh ◽  
Abram Hindle

In this paper, we use statistical machine translation to convert Python 2 code to Python 3 code. We train on data from two projects and achieve a high BLEU score. We also investigate cross-project training and testing, analyzing the errors to identify how they differ from the within-project case. The work is a pilot study on modeling programming languages as natural language, building translation models along the same lines as for natural languages. It can be extended to translate between versions of a programming language or across programming languages.
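For the evaluation step, BLEU can be computed over tokenised code exactly as over natural-language sentences. A minimal sketch using NLTK with toy tokens (not the authors' data, tokeniser or scores):

```python
# Illustrative BLEU computation on code tokens; the token streams below are
# toy values, not the paper's Python 2/3 corpora.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = "print ( 'hello world' )".split()   # reference Python 3 tokens
candidate = "print ( 'hello world' )".split()   # tokens produced by the SMT system

score = sentence_bleu([reference], candidate,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```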


2013 ◽  
Vol 347-350 ◽  
pp. 3262-3266
Author(s):  
Ai Ling Wang

Machine translation (MT) is one of the core applications of natural language processing and an important branch of artificial intelligence research; statistical methods have become the mainstream of machine translation. This paper presents a comparative analysis of statistical translation models built on large-scale corpora: it discusses word-based, phrase-based and syntax-based machine translation methods in turn, summarizes the factors involved in evaluating machine translation, and analyzes machine translation evaluation methods.
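As a reference point for the word- and phrase-based models discussed, the classical noisy-channel formulation of statistical machine translation (standard textbook form, not a formula taken from this paper) is:

```latex
% Noisy-channel formulation of SMT: f is the source sentence,
% e ranges over candidate target sentences.
\hat{e} = \operatorname*{arg\,max}_{e} P(e \mid f)
        = \operatorname*{arg\,max}_{e} P(f \mid e)\, P(e)
% P(f | e): translation model estimated from the parallel corpus
% P(e):     target-side language model
```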


Author(s):  
Arun Babhulgaonkar ◽  
Shefali Sonavane

Hindi is the national language of India. However, most government records, resolutions, news, and so on are documented in English, which remote villagers may not understand. This motivates the development of an automatic translation system from English to Hindi. Machine translation is the process of translating text in one natural language into another natural language using a computer system. The grammatical structure of Hindi is considerably more complex than that of English, and this structural difference makes it difficult to achieve good-quality translation. In this paper, a phrase-based statistical machine translation (PBSMT) approach is used for translation. The translation, reordering and language models are the main working components of a PBSMT system. This paper evaluates the impact of various combinations of these PBSMT system parameters on automated English-to-Hindi translation quality. The freely available n-gram-based BLEU metric and the TER metric are used to evaluate the results.
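To make the role of these components concrete: phrase-based systems typically score each translation hypothesis with a weighted (log-linear) combination of the translation, reordering and language-model features. The sketch below uses made-up feature values and weights purely for illustration:

```python
# Illustrative log-linear scoring of one PBSMT hypothesis; all numbers are
# invented for the example, not taken from the paper's English-Hindi system.
import math

def loglinear_score(features: dict, weights: dict) -> float:
    """Weighted sum of log feature scores for a single translation hypothesis."""
    return sum(weights[name] * math.log(value) for name, value in features.items())

hypothesis_features = {
    "translation_model": 0.40,   # phrase-table probability
    "reordering_model": 0.70,    # reordering/distortion probability
    "language_model": 0.05,      # language-model probability of the Hindi output
}
weights = {"translation_model": 1.0, "reordering_model": 0.6, "language_model": 0.8}

print(loglinear_score(hypothesis_features, weights))
```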


Author(s):  
Syed Abdul Basit Andrabi et al.

Machine translation is an application of natural language processing. Humans use natural languages to communicate with one another, whereas programming languages mediate communication between humans and computers. NLP is the field that involves a broad set of techniques for the analysis, manipulation and automatic generation of human (natural) languages with the help of computers. In the present information age it is essential to give people access to information for their development, and equal emphasis must be placed on removing the language barrier between different sections of society. NLP strives to bridge this language barrier through machine translation, in which one natural language is transformed into another with the aid of computers. The early years of the field were dedicated to the development of rule-based systems; later, as computational power increased, there was a transition towards statistical machine translation. The aim of machine translation is that the meaning of the text should be preserved during translation. This research paper analyses the machine translation approaches used for resource-poor languages and identifies the needs and challenges that researchers face. It also reviews the machine translation systems that are available for such resource-poor languages.


2012 ◽  
Vol 98 (1) ◽  
pp. 5-24 ◽  
Author(s):  
Juan Pino ◽  
Aurelien Waite ◽  
William Byrne

Simple and Efficient Model Filtering in Statistical Machine Translation

Data availability and distributed computing techniques have allowed statistical machine translation (SMT) researchers to build larger models. However, decoders need to retrieve information from these models efficiently in order to translate an input sentence or a set of input sentences. We introduce an easy-to-implement, general-purpose solution to this problem: we store SMT models as a set of key-value pairs in an HFile. We apply this strategy to two specific tasks: test-set filtering of hierarchical phrase-based rules and n-gram count filtering for language model lattice rescoring. We compare our approach to alternative strategies and show that its trade-offs in terms of speed, memory and simplicity are competitive.
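The idea behind test-set filtering is to keep only those model entries whose key (the source side of a rule or phrase) can occur in the sentences to be translated. A toy sketch of that lookup, with an ordinary Python dictionary standing in for the HFile key-value store:

```python
# Conceptual sketch of test-set model filtering; a dict stands in for the
# HFile, and the phrase table below is a toy example.
phrase_table = {
    "the house": [("la maison", 0.6), ("la casa", 0.1)],
    "green":     [("verte", 0.8)],
    "blue car":  [("voiture bleue", 0.7)],
}

test_sentences = ["the house is green"]

def source_phrases(sentence: str, max_len: int = 3):
    """Enumerate all n-grams of a test sentence up to max_len words."""
    words = sentence.split()
    for i in range(len(words)):
        for j in range(i + 1, min(i + max_len, len(words)) + 1):
            yield " ".join(words[i:j])

needed = {p for s in test_sentences for p in source_phrases(s)}
filtered = {src: rules for src, rules in phrase_table.items() if src in needed}
print(filtered)   # only the entries useful for the test set survive
```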


2018 ◽  
Vol 5 (1) ◽  
pp. 37-45
Author(s):  
Darryl Yunus Sulistyan

Machine translation is a system that automatically translates sentences in one language into another particular language. This paper aims to test the effectiveness of a newer machine translation model, factored machine translation. We compare the performance of an unfactored system, our baseline, against the factored model in terms of BLEU score. We test the models on the German-English language pair using the Europarl corpus, with the freely available MOSES toolkit. We found that the unfactored model scored over 24 BLEU and outperformed the factored models, which scored below 24 BLEU in all cases. In terms of the number of words translated, however, all of the factored models outperformed the unfactored model.
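For readers unfamiliar with factored models: in the Moses factored setup each surface word is annotated with additional factors (such as lemma and part of speech), conventionally joined with "|" in the training and input data. A toy sketch of that representation (the German tokens and tags are invented, not the paper's Europarl preprocessing):

```python
# Illustrative Moses-style factored tokens: word|lemma|POS per token.
def to_factored(tokens):
    """tokens: iterable of (word, lemma, pos) triples -> factored sentence string."""
    return " ".join(f"{word}|{lemma}|{pos}" for word, lemma, pos in tokens)

sentence = [("Häuser", "Haus", "NN"), ("sind", "sein", "VAFIN"), ("teuer", "teuer", "ADJD")]
print(to_factored(sentence))
# Häuser|Haus|NN sind|sein|VAFIN teuer|teuer|ADJD
```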

