A System for Machine Translation from English to Ebira using the Rule-based Approach

Author(s):  
Tijani Musari Abdulmusawir ◽  
Sani Felix Ayegba ◽  
Yahaya Musa Kayode ◽  
Eze Christian Chinemerem

This research aims to bridge the knowledge gap between English, the most widely used knowledge-rich language, and Ebira, a minority language spoken by the Ebira people, a minority ethnic group in part of Nigeria. Across the globe and on the internet, English has become the most widely used language for knowledge dissemination. At present, the majority of the indigenous people of Ebiraland, also known as “Anebira”, are not proficient in English, which prevents them from fully accessing knowledge disseminated in that language. Hence the need for an automated machine translation system capable of translating English text into Ebira text, which will help the people draw on the abundant knowledge conveyed in English for effective and rapid development in the social, political, scientific, philosophical and economic areas of life. The system was designed to complement human translators' efforts, not to replace them. A comprehensive study and analysis of the two languages was carried out with the help of Ebira native speakers in Ebiraland, Kogi Central, and professional English language tutors at FCE Okene. The knowledge gathered provided the basis for the design and testing of the rule base, inference engine and bilingual dictionary, which are the main components of the proposed system for translating English text into Ebira text, implemented in PHP. The system translates English text into Ebira using the words in the bilingual dictionary. It was evaluated using BLEU (Bilingual Evaluation Understudy), one of the popular automatic methods for evaluating MT systems, and achieved a translation accuracy of 81.5%. An improved system that accommodates more complex sentences is recommended as future work, for the further benefit of the Anebira.
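The interplay of rule base and bilingual dictionary described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than the authors' PHP, and everything in it is an assumption made for illustration: the function names, the single adjective-noun reordering rule, and the dictionary entries, whose target side uses placeholder tokens rather than real Ebira words. It says nothing about actual Ebira grammar or about the authors' implementation.

```python
# Toy bilingual dictionary. The target-side strings are placeholders,
# NOT real Ebira words; they only mark where a lookup happened.
BILINGUAL_DICT = {
    "child": "<child>",
    "drinks": "<drink>",
    "water": "<water>",
    "cold": "<cold>",
}


def reorder_adjective_noun(tagged):
    """Hypothetical rule: place an adjective after the noun it precedes."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out


# The "rule base": an ordered list of rewrite rules (one rule here).
RULES = [reorder_adjective_noun]


def translate(tagged_sentence):
    """Apply every rule in order, then look each word up in the dictionary.

    tagged_sentence is a list of (word, part_of_speech_tag) pairs.
    Words missing from the dictionary are passed through unchanged.
    """
    for rule in RULES:
        tagged_sentence = rule(tagged_sentence)
    return " ".join(
        BILINGUAL_DICT.get(word.lower(), word) for word, _tag in tagged_sentence
    )
```

Calling `translate([("child", "NOUN"), ("drinks", "VERB"), ("cold", "ADJ"), ("water", "NOUN")])` applies the reordering rule first and then substitutes each word, so the structural rules and the lexicon stay separate and can be extended independently.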

Author(s):  
Maimitili Nimaiti ◽  
Yamamoto Izumi

A Japanese-Uyghur machine translation system has been designed and developed using a recent rule-based approach. Although Japanese and Uyghur have many similarities, some linguistic differences cause serious problems for word-for-word translation. In fact, straightforward word-for-word Japanese-Uyghur translation sometimes yields unnatural Uyghur sentences. To raise translation accuracy, the authors propose a word-for-word translation system that uses subject-verb agreement in Uyghur. After a brief introduction to the comparative study of Japanese and Uyghur grammar, morphology and syntax, the authors explain their development of a word-for-word rule-based system. They describe the coverage of this rule base, the translation rules, and a comparison of experimental results between a statistical machine translation system and the rule-based machine translation system. Some practical suffix translation methods that solve problems in Uyghur are also proposed.


The hearing-challenged community all over the world faces difficulties in communicating with others. Machine translation has been one of the prominent technologies for facilitating two-way communication for the deaf and hard-of-hearing community. We have explored and formulated the fundamental rules of Indian Sign Language and implemented them as a mechanism for translating English text into Indian Sign Language glosses. The structure of the source text is identified and transferred to the target language according to the formulated rules and sub-rules. The intermediate phases of the transfer process are also described in this research work.
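A structural transfer of this kind might be sketched as follows. The abstract does not list its rules, so the three used here are assumptions chosen only for illustration: reordering subject-verb-object into subject-object-verb, dropping articles and auxiliary verbs, and writing glosses in upper case. The function name `to_gloss` and the tag names are likewise hypothetical.

```python
# Assumed for illustration: word classes that receive no gloss.
DROPPED_TAGS = {"DET", "AUX"}


def to_gloss(subject, verb, obj):
    """Turn a parsed English clause into a gloss string.

    Each argument is a list of (word, tag) pairs for one constituent,
    as an upstream parser would supply after identifying the structure.
    """
    def keep(phrase):
        # Drop function words and upper-case what remains.
        return [word.upper() for word, tag in phrase if tag not in DROPPED_TAGS]

    # Transfer step: source order S-V-O becomes target order S-O-V.
    return " ".join(keep(subject) + keep(obj) + keep(verb))
```

For the clause "the boy is eating an apple", split into its subject, verb and object constituents, this sketch yields `BOY APPLE EATING`. Keeping the constituent analysis separate from the transfer step mirrors the identify-then-transfer pipeline the abstract outlines.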


Machine Translation (MT) is a technique that automatically translates text from one natural language to another using a machine such as a computer. Machine Transliteration (MTn) is a technique that converts the script of a text from the source language to the target language without changing the pronunciation of the source text. Both MT and MTn are challenging research tasks in the fields of Natural Language Processing (NLP) and Computational Linguistics (CL). English is a high-resource natural language, whereas Bodo is a low-resource natural language. Although Bodo is a recognized language of India, little research has been done on MT and MTn systems for it because of the scarcity of resources. The primary objective of this paper is to develop a Bodo to English Machine Translation system with the help of a Bodo to English Machine Transliteration system. The Bodo to English MT system was developed using the phrase-based Statistical Machine Translation technique on a Bodo-English parallel text corpus covering the General and News domains. The Bodo to English MTn system was developed using a hybrid technique on Bodo-English parallel transliterated words and terms from the same two domains. The translation accuracy of the MT system was evaluated using the BLEU technique.
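Since BLEU is the evaluation metric here, a minimal sentence-level sketch may help show what the score measures. It assumes a single reference, whitespace tokenisation, uniform weights over 1- to 4-grams and no smoothing; the function names are illustrative, and this is not the evaluation script used in any of the listed papers.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count every n-gram (as a tuple) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against one reference."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Modified precision: clip each n-gram count by its reference count.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, one empty order zeroes the score
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

A candidate identical to its reference scores 1.0, while a candidate that is a correct but truncated prefix keeps perfect n-gram precision and is lowered only by the brevity penalty. Reported system scores are normally computed over a whole test corpus rather than sentence by sentence, so this sketch is for intuition only.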


Author(s):  
K. Jaya ◽  
Deepa Gupta

Although a great deal of Statistical Machine Translation (SMT) research is under way for the English-Hindi language pair, no effort has been made to standardize the dataset. Each study uses a different dataset, different parameters and a different number of sentences during the various phases of translation, resulting in varied translation output. Comparing these models, understanding their results, gaining insight into corpus behavior and reproducing the reported results therefore becomes tedious. This creates a need to standardize the dataset and to identify common parameters for model development. The main contribution of this paper is an approach to standardizing the dataset and identifying the parameters that, in combination, give the best performance. It also investigates a novel corpus augmentation approach to improve the translation quality of an English-Hindi bidirectional statistical machine translation system. The model works well for scarce resources without incorporating an external parallel corpus for the underlying language. The experiment is carried out using the open-source phrase-based toolkit Moses, with the Indian Languages Corpora Initiative (ILCI) Hindi-English tourism corpus. With a limited dataset, considerable improvement is achieved using the corpus augmentation approach for the English-Hindi bidirectional SMT system.


2020 ◽  
Vol 21 (3) ◽  
pp. 543-554
Author(s):  
Neha Bhadwal ◽  
Prateek Agrawal ◽  
Vishu Madaan

Machine Translation is an area of Natural Language Processing that can replace the laborious task of manual translation. Sanskrit is among the ancient Indo-Aryan languages. There are numerous works of art and literature in Sanskrit, and it has also been a medium for treatises on philosophy as well as works on logic, astronomy and mathematics. Hindi, on the other hand, is the most prominent language of India and is among the most widely spoken languages in the world. This paper is an effort to bridge the language barrier between Hindi and Sanskrit so that any text in Hindi can be translated into Sanskrit. The technique used to achieve this objective is rule-based machine translation, in which the salient linguistic features of the two languages are used to perform the translation. The results are presented in the form of two confusion matrices, for which a total of 50 random sentences and 100 tokens (Hindi words or phrases) were taken for system evaluation. The semantic evaluation of the 100 tokens produces an accuracy of 94%, while the pragmatic analysis of the 50 sentences produces an accuracy of around 86%. Hence, the proposed system can be used to understand the whole translation process and can further be employed as a tool for learning as well as teaching. Further, this application can be embedded in local-communication-based assistive Internet of Things (IoT) devices such as Alexa or Google Assistant.


2021 ◽  
Vol 40 ◽  
pp. 03026
Author(s):  
Nilesh Shirsath ◽  
Aniruddha Velankar ◽  
Ranjeet Patil ◽  
Shilpa Shinde

Machine Translation (MT) is a generic term for computerised systems that generate translations from one natural language to another, with or without human intervention. Text may be used to examine knowledge, and turning that information into pictures helps people to communicate and acquire information. A lot of work has been conducted on translating English to Hindi, Tamil, Bangla and other languages. The important part of translation is to provide translated sentences with correct words and proper grammar. A comprehensive review of 10 primary publications was carried out for this research. Two separate approaches are proposed to translate basic Marathi phrases to English: one uses a rule-based approach and the other uses a neural machine translation approach. While designed primarily for the Marathi-English language pair, the design can be applied to other language pairs with a similar structure.



