Rule-Based Machine Translation for the Italian–Sardinian Language Pair

2017 ◽  
Vol 108 (1) ◽  
pp. 221-232
Author(s):  
Francis M. Tyers ◽  
Hèctor Alòs i Font ◽  
Gianfranco Fronteddu ◽  
Adrià Martín-Mor

Abstract: This paper describes the creation of the first machine translation system from Italian to Sardinian, a Romance language spoken on the island of Sardinia in the Mediterranean. The project was carried out by a team of translators and computational linguists. The article focuses on the technology used (Rule-Based Machine Translation), on some of the rules created, and on the orthographic model used for Sardinian.

2016 ◽  
Vol 1 (1) ◽  
pp. 45-49
Author(s):  
Avinash Singh ◽  
Asmeet Kour ◽  
Shubhnandan S. Jamwal

The objective of this paper is to analyze English-Dogri translation using a parallel corpus. Machine translation is the translation of text from one language into another by computer, and it is the biggest application of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% when translating English to Dogri and 87% when translating Dogri to English.


2016 ◽  
Vol 6 (1) ◽  
pp. 46-62
Author(s):  
Pramod P. Sukhadeve

Over the years, research on machine translation (MT) systems has gained momentum due to their widespread applicability. A number of systems have emerged that perform the task successfully for different language pairs. However, to the best of the author's knowledge, no significant work has been done in clinical and medical domains, especially in Homoeopathy. This paper describes a rule-based English-Hindi MT system for Homoeopathic sentences. It has been designed to translate a variety of sentences from Homoeopathic literature. To achieve this, the author developed English and Hindi Homoeopathic corpora, which presently contain 21,096 and 23,145 sentences respectively. For translation, the input sentences (in English) are categorised into four types: simple, complex, interrogative and ambiguous. The author tested the translation accuracy using the BLEU score. At present, the overall BLEU score of the system is 0.7808 and the accuracy is 82.25%.
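To make the evaluation metric mentioned above concrete, the following is a minimal sketch of a sentence-level BLEU computation against a single reference. It is not the evaluation code used in the paper: the function names, the choice of up to 4-grams, and the absence of smoothing are assumptions made for illustration, and reported scores are normally computed over a whole test set rather than a single sentence.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU with one reference (illustrative only)."""
    precisions = [modified_precision(candidate, reference, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)
```

A candidate identical to its reference scores 1.0 under this sketch, while a candidate sharing no words with the reference scores 0.0; real evaluations fall between these extremes.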


To overcome the language barrier faced by people living in the northeastern region of India, a machine translation system is a necessity. A large number of people in this region cannot access many services because of language barriers. Among the several languages spoken, Assamese is one of the major languages of northeast India. Machine translation for Assamese is limited compared to other languages. As a result, a large number of Assamese speakers cannot avail themselves of the benefits associated with it. This paper focuses on the development of an English-to-Assamese translation system using an n-gram model. Compared to other models, the n-gram model works very well for language pairs with highly dissimilar syntax. The value of n plays a major role in the quality and efficiency of the system: the Bilingual Evaluation Understudy (BLEU) score differs significantly as n changes. The model uses tuples to reduce memory consumption and to accelerate the translation process. A parallel corpus has been used to train the n-gram-based decoder MARIE. The number of translation units extracted using the n-gram model is much smaller than the number extracted using a phrase-based model, which has a high impact on system efficiency.
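The role of n in a tuple-based n-gram translation model can be illustrated with a small sketch. It counts n-grams over sequences of bilingual tuples (source unit, target unit) and estimates the probability of a tuple given its history by relative frequency. This is not the MARIE decoder or its interface: the function names, the padding symbol, and the placeholder target tokens (`t1`, `t2`, ...) are all hypothetical, and no smoothing is applied.

```python
from collections import Counter

START = ("<s>", "<s>")  # hypothetical padding tuple marking the sentence start


def train_tuple_ngrams(tuple_sequences, n):
    """Count n-grams and their (n-1)-tuple histories over bilingual tuple sequences."""
    ngram_counts, history_counts = Counter(), Counter()
    for seq in tuple_sequences:
        padded = [START] * (n - 1) + list(seq)
        for i in range(n - 1, len(padded)):
            history = tuple(padded[i - n + 1:i])
            ngram_counts[history + (padded[i],)] += 1
            history_counts[history] += 1
    return ngram_counts, history_counts


def tuple_probability(unit, history, ngram_counts, history_counts):
    """Relative-frequency estimate of P(unit | history); 0.0 for unseen histories."""
    history = tuple(history)
    if history_counts[history] == 0:
        return 0.0
    return ngram_counts[history + (unit,)] / history_counts[history]


# Toy corpus: each sentence pair is already segmented into bilingual tuples.
corpus = [
    [("I", "t1"), ("eat", "t2"), ("rice", "t3")],
    [("I", "t1"), ("drink", "t4"), ("water", "t5")],
]
bigram_counts, bigram_histories = train_tuple_ngrams(corpus, n=2)
p = tuple_probability(("eat", "t2"), [("I", "t1")], bigram_counts, bigram_histories)
```

With n=2, the tuple `("I", "t1")` is followed by two different tuples in the toy corpus, so `p` is 0.5. Raising n makes each history more specific, which sharpens the estimates but leaves fewer observations per history.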


2021 ◽  
Vol 11 (2) ◽  
pp. 489-501
Author(s):  
Trond Trosterud ◽  
Lene Antonsen

The article presents a rule-based machine translation system from Northern Sami to Norwegian. The grammatical analysis is done with Giellatekno and Divvun's North Sami program for analysis and translation. We have written the transfer component (transfer lexicon and grammatical rules) within the framework of the open machine translation system Apertium. The article contains an evaluation of translated text for two different domains. The translated texts score better on the presentation of the content than on fluent language. By classifying the errors into lexical, grammatical and pragmatic errors, we show that lexical errors are the most harmful for text comprehension. The other two types of errors give a poor language quality, but they have little effect on comprehension. The type of error that is the easiest to correct is the lexical, which is a promising conclusion for the development of a machine translation system for text comprehension.


2016 ◽  
Vol 106 (1) ◽  
pp. 159-168 ◽  
Author(s):  
Julian Hitschler ◽  
Laura Jehl ◽  
Sariya Karimova ◽  
Mayumi Ohta ◽  
Benjamin Körner ◽  
...  

Abstract: We present Otedama, a fast, open-source tool for rule-based syntactic pre-ordering, a well-established technique in statistical machine translation. Otedama implements both a learner for pre-ordering rules and a component for applying these rules to parsed sentences. Our system is compatible with several external parsers and capable of accommodating many source and all target languages in any machine translation paradigm which uses parallel training data. We demonstrate improvements on a patent translation task over a state-of-the-art English-Japanese hierarchical phrase-based machine translation system. We compare Otedama with an existing syntax-based pre-ordering system, showing comparable translation performance with a runtime speedup by a factor of 4.5 to 10.
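The idea of applying pre-ordering rules to parsed sentences can be sketched as follows. This is not Otedama's rule format or code: the tree representation (a label paired with a list of children), the rule dictionary keyed by parent label and child-label sequence, and the single verb-object rule are assumptions chosen to show how a source sentence might be rearranged toward target-language word order before translation.

```python
def preorder(tree, rules):
    """Recursively permute the children of every node that matches a rule.

    A tree is either a word (str) or a pair (label, [children]).
    `rules` maps (parent_label, child_label_tuple) to a permutation of indices.
    """
    if isinstance(tree, str):
        return tree
    label, children = tree
    children = [preorder(child, rules) for child in children]
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    permutation = rules.get((label, child_labels))
    if permutation is not None:
        children = [children[i] for i in permutation]
    return (label, children)


def leaves(tree):
    """Read the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1] for word in leaves(child)]


# Hypothetical rule: move the object before the verb (SVO -> SOV).
rules = {("VP", ("VBD", "NP")): (1, 0)}
parsed = ("S", [("NP", ["John"]),
                ("VP", [("VBD", ["ate"]), ("NP", ["rice"])])])
reordered = leaves(preorder(parsed, rules))  # ['John', 'rice', 'ate']
```

In this sketch the rule is written by hand; a learner such as the one the abstract describes would instead derive rules from parsed, word-aligned parallel data.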

