Using machine translation for converting Python 2 to Python 3 code

2015 ◽  
Author(s):  
Karan Aggarwal ◽  
Mohammad Salameh ◽  
Abram Hindle

In this paper, we use statistical machine translation to convert Python 2 code to Python 3 code. We use data from two projects and achieve a high BLEU score. We also investigate cross-project training and testing, analyzing the errors to ascertain how they differ from the previous case. We describe a pilot study on modeling programming languages as natural languages in order to build translation models along the lines of those used for natural languages. This work can be extended to translate between versions of a programming language or between different programming languages.
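For readers unfamiliar with the BLEU score mentioned in this abstract, the following is a minimal sketch of a sentence-level BLEU computation (clipped n-gram precision combined with a brevity penalty) applied to tokenized code. It is an illustration only, not the authors' evaluation setup: the function names, the hand-tokenized lines, and the lack of smoothing are all assumptions made for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as often as it appears in the reference."""
    cand = ngrams(candidate, n)
    ref = ngrams(reference, n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights and a brevity penalty.
    Returns 0.0 if any n-gram precision is zero (no smoothing)."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_avg)


# Hypothetical data: a Python 3 reference line and two system outputs,
# tokenized by hand for illustration.
reference = ["print", "(", "x", ")"]
exact = ["print", "(", "x", ")"]
untranslated = ["print", "x"]  # Python 2 form left unchanged

print(bleu(exact, reference))         # 1.0
print(bleu(untranslated, reference))  # 0.0
```

Without smoothing, a single missing higher-order n-gram drives the sentence-level score to zero, which is one reason BLEU is usually reported at the corpus level.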


Author(s):  
Syed Abdul Basit Andrabi et al.

Machine translation is an application of natural language processing (NLP). Humans use native languages to communicate with one another, whereas programming languages mediate between humans and computers. NLP is a field comprising a broad set of techniques for the analysis, manipulation, and automatic generation of human (natural) languages with the help of computers. In the present information age, access to information is essential to people's development, so equal emphasis must be placed on removing language barriers between different sections of society. NLP strives to bridge this language barrier through machine translation, in which one natural language is transformed into another with the aid of computers. The first years of the field were dedicated to developing rule-based systems; later, as computational power increased, there was a transition towards statistical machine translation. The goal of machine translation is to preserve the meaning of the source text in the translation. This paper analyses the machine translation approaches used for resource-poor languages and identifies the needs and challenges that researchers face. It also reviews the machine translation systems available for resource-poor languages.


2014 ◽  
Vol 926-930 ◽  
pp. 3645-3648
Author(s):  
Yan Di

This paper is concerned with the basic framework of example-based machine translation. Some key issues, such as bilingual alignment, the similarity measure between the input sentence and an example, and template acquisition, are introduced.
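To make the idea of a similarity measure between an input sentence and a stored example concrete, here is a minimal sketch that retrieves the closest example using word-level edit distance. The abstract does not say which measure the paper uses, so edit distance is an assumption chosen for illustration, and the function names and the toy example base are hypothetical.

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Map edit distance to [0, 1]; 1.0 means identical token sequences."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def best_example(sentence, examples):
    """Return the (source, target) pair whose source is most similar."""
    tokens = sentence.split()
    return max(examples, key=lambda pair: similarity(tokens, pair[0].split()))


# Hypothetical toy example base of (source, target) pairs.
examples = [
    ("I like green tea", "J'aime le thé vert"),
    ("where is the station", "où est la gare"),
]
match = best_example("I like black tea", examples)
print(match[0])  # I like green tea
```

In an example-based system, the retrieved target side would then be adapted, for instance by replacing the translation of the word that differs; that step is omitted here.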


Author(s):  
Zakaria El Maazouzi ◽  
Badr Eddine EL Mohajir ◽  
Mohammed Al Achhab

Achieving high accuracy in automatic translation has been a challenging goal for machine translation researchers for decades, and the field has always been eager to explore new ways of improving translation quality. As a key application of natural language processing, automatic translation has given rise to many approaches, notably statistical machine translation and, more recently, neural machine translation, which have greatly improved translation quality, especially for Latin languages. For some language pairs, they have even made it possible to approach human translation quality. In this paper, we present a survey of the state of the art in statistical translation: we describe the existing methodologies and review recent research studies, pointing out the main strengths and limitations of the different approaches.


2019 ◽  
Author(s):  
James Grimmelmann

Smart contracts are written in programming languages rather than in natural languages. This might seem to insulate them from ambiguity, because the meaning of a program is determined by technical facts rather than by social ones. It does not. Smart contracts can be ambiguous, too, because technical facts depend on socially determined ones. To give meaning to a computer program, a community of programmers and users must agree on the semantics of the programming language in which it is written. This is a social process, and a review of some famous controversies involving blockchains and smart contracts shows that it regularly creates serious ambiguities. In the most famous case, The DAO hack, more than $150 million in virtual currency turned on the contested semantics of a blockchain-based smart-contract programming language.


2019 ◽  
Vol 9 (18) ◽  
pp. 3936 ◽  
Author(s):  
Oscar Pulido-Prieto ◽  
Ulises Juárez-Martínez

While the use of natural language for software development has been proposed since the 1960s, it has been limited by the inherent ambiguity of natural languages, which people resolve through reasoning within a text or conversation. Programming languages are formal general-purpose or domain-specific alternatives that are based on mathematical formalism and are at a remove from natural language. Over the years, various authors have presented studies in which they attempted to use a subset of the English language to solve particular problems. Each author approached the problem by covering particular domains and thus focused on domain-specific languages, rather than on describing general elements that would help other authors develop general-purpose languages. Identifying the common elements in these studies reveals characteristics that enable the design and implementation of general-purpose naturalistic languages, which requires the establishment of a programming model. This article presents a conceptual model that describes the elements required for designing general-purpose programming languages and that integrates abstraction, temporal elements, and indirect references into its grammar. Because the grammar resembles natural language, it reduces the gap between the problem and solution domains. A naturalistic language prototype is also presented, along with three test scenarios that demonstrate its characteristics.


2021 ◽  
Vol 15 (5) ◽  
pp. 1-21
Author(s):  
Xiang Ling ◽  
Lingfei Wu ◽  
Saizhuo Wang ◽  
Gaoning Pan ◽  
Tengfei Ma ◽  
...  

Code retrieval aims to find, in a large corpus of source code repositories, the code snippet that best matches a natural language query. Recent work mainly uses natural language processing techniques to process both query texts (i.e., human natural language) and code snippets (i.e., machine programming language), but neglects the deep structured features of query texts and source code, both of which contain rich semantic information. In this article, we propose an end-to-end deep graph matching and searching (DGMS) model based on graph neural networks for the task of semantic code retrieval. To this end, we first represent both natural language query texts and programming language code snippets as unified graph-structured data, and then use the proposed graph matching and searching model to retrieve the best matching code snippet. In particular, DGMS not only captures more structural information for individual query texts or code snippets, but also learns the fine-grained similarity between them through cross-attention based semantic matching operations. We evaluate the proposed DGMS model on two public code retrieval datasets with two representative programming languages (i.e., Java and Python). Experimental results demonstrate that DGMS outperforms state-of-the-art baseline models by a large margin on both datasets. Moreover, our extensive ablation studies systematically investigate and illustrate the impact of each part of DGMS.


2013 ◽  
Vol 347-350 ◽  
pp. 3262-3266
Author(s):  
Ai Ling Wang

Machine translation (MT) is one of the core applications of natural language processing and an important branch of artificial intelligence research; statistical methods have become the mainstream of machine translation. This paper presents a comparative analysis of statistical translation models based on large-scale corpora; discusses word-based, phrase-based, and syntax-based machine translation methods in turn; summarizes the evaluation factors of machine translation; and analyzes its evaluation methods.


2013 ◽  
Vol 8 (3) ◽  
pp. 908-912 ◽  
Author(s):  
Sumita Rani ◽  
Dr. Vijay Luxmi

Machine translation is an important area of natural language processing. A direct MT system exploits the syntactic and vocabulary similarities between more or less closely related natural languages. The relation between two or more languages derives from their common parent language. The similarity between Punjabi and Hindi is due to their parent language, Sanskrit, and the two languages share many similarities in syntax and vocabulary. In the present paper, a direct machine translation system from Punjabi to Hindi is developed and its output is evaluated to assess the suitability of the system.


Author(s):  
Arun Babhulgaonkar ◽  
Shefali Sonavane

Hindi is an official language of India. However, most government records, resolutions, news, etc. are documented in English, which remote villagers may not understand. This motivates the development of an automatic translation system from English to Hindi. Machine translation is the process of translating a text in one natural language into another natural language using a computer system. The grammatical structure of Hindi is much more complex than that of English, and the structural difference between the two languages makes it difficult to achieve good-quality translation results. In this paper, a phrase-based statistical machine translation (PBSMT) approach is used for translation. The translation model, the reordering model, and the language model are the main working components of a PBSMT system. This paper evaluates the impact of various combinations of these PBSMT system parameters on the quality of automated English-to-Hindi translation. The freely available n-gram-based BLEU metric and the TER metric are used to evaluate the results.
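As a rough illustration of how the translation, reordering, and language model components can interact, phrase-based systems commonly rank candidate translations by a weighted sum of component log-scores. The sketch below assumes that log-linear formulation; the candidate names, feature values, and weights are hypothetical and are not taken from the paper.

```python
def loglinear_score(features, weights):
    """Weighted sum of component log-scores for one candidate translation."""
    return sum(weights[name] * value for name, value in features.items())


def best_candidate(candidates, weights):
    """Return the name of the candidate with the highest combined score."""
    return max(candidates,
               key=lambda name: loglinear_score(candidates[name], weights))


# Hypothetical log-scores from the three PBSMT components.
candidates = {
    "A": {"translation": -2.0, "reordering": -0.5, "language_model": -6.0},
    "B": {"translation": -3.0, "reordering": -1.0, "language_model": -3.0},
}

uniform = {"translation": 1.0, "reordering": 1.0, "language_model": 1.0}
favor_translation = {"translation": 3.0, "reordering": 1.0, "language_model": 0.5}

print(best_candidate(candidates, uniform))            # B
print(best_candidate(candidates, favor_translation))  # A
```

Changing the weights changes which candidate wins, which is why the combination of component settings can affect measured translation quality.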

