English to Hindi Machine Translation System in the Context of Homoeopathy Literature

2016 ◽  
Vol 6 (1) ◽  
pp. 46-62
Author(s):  
Pramod P. Sukhadeve

Over the years, research on machine translation (MT) systems has gained momentum due to their widespread applicability. A number of systems have emerged that perform the task successfully for different language pairs. However, to the best of the author's knowledge, no significant work has been done in the clinical and medical domains, especially in Homoeopathy. This paper describes a rule-based English-Hindi MT system for Homoeopathic sentences. It has been designed to translate a variety of sentences from Homoeopathic literature. To achieve the task, the author developed English and Hindi Homoeopathic corpora, which presently contain 21,096 and 23,145 sentences respectively. For translation, the input sentences (in English) have been categorised into four different types, i.e. simple, complex, interrogative and ambiguous sentences. The author tested the translation accuracy using the BLEU score. At present, the overall BLEU score of the system is 0.7808 and the accuracy percentage is 82.25%.
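For readers unfamiliar with the metric, the following is a minimal sketch of how a BLEU score can be computed. It is not the author's evaluation code: the abstract does not state the n-gram order, tokenisation, smoothing, or number of references used, so this sketch assumes whitespace tokenisation, a single reference, uniform weights over 1- to 4-grams, and no smoothing. The function names `ngrams` and `bleu` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference (assumed setup, no smoothing)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            # Without smoothing, any zero precision makes the score zero.
            return 0.0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the modified n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

A score of 1 means the candidate matches the reference exactly; a shorter candidate whose n-grams all appear in the reference is scaled down by the brevity penalty alone.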

Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 201
Author(s):  
Jin-Xia Huang ◽  
Kyung-Soon Lee ◽  
Young-Kil Kim

This paper proposes a hybrid machine-translation system that combines neural machine translation with well-developed rule-based machine translation to utilize the stability of the latter to compensate for the inadequacy of neural machine translation in rare-resource domains. A classifier is introduced to predict which translation from the two systems is more reliable. We explore a set of features that reflect the reliability of translation and its process, and training data is automatically expanded with a small, human-labeled dataset to solve the insufficient-data problem. A series of experiments shows that the hybrid system’s translation accuracy is improved, especially in out-of-domain translations, and classification accuracy is greatly improved when using the proposed features and the automatically constructed training set. A comparison between feature- and text-based classification is also performed, and the results show that the feature-based model achieves better classification accuracy, even when compared to neural network text classifiers.


2021 ◽  
Vol 11 (2) ◽  
pp. 489-501
Author(s):  
Trond Trosterud ◽  
Lene Antonsen

The article presents a rule-based machine translation system from Northern Sami to Norwegian. The grammatical analysis is done with Giellatekno and Divvun's North Sami program for analysis and translation. We have written the transfer component (transfer lexicon and grammatical rules) within the framework of the open machine translation system Apertium. The article contains an evaluation of translated text for two different domains. The translated texts score better on the presentation of the content than on fluent language. By classifying the errors into lexical, grammatical and pragmatic errors, we show that lexical errors are the most harmful for text comprehension. The other two types of errors give a poor language quality, but they have little effect on comprehension. The type of error that is the easiest to correct is the lexical, which is a promising conclusion for the development of a machine translation system for text comprehension.


2017 ◽  
Vol 108 (1) ◽  
pp. 221-232
Author(s):  
Francis M. Tyers ◽  
Hèctor Alòs i Font ◽  
Gianfranco Fronteddu ◽  
Adrià Martín-Mor

This paper describes the process of creation of the first machine translation system from Italian to Sardinian, a Romance language spoken on the island of Sardinia in the Mediterranean. The project was carried out by a team of translators and computational linguists. The article focuses on the technology used (Rule-Based Machine Translation) and on some of the rules created, as well as on the orthographic model used for Sardinian.


2016 ◽  
Vol 106 (1) ◽  
pp. 159-168
Author(s):  
Julian Hitschler ◽  
Laura Jehl ◽  
Sariya Karimova ◽  
Mayumi Ohta ◽  
Benjamin Körner ◽  
...  

We present Otedama, a fast, open-source tool for rule-based syntactic pre-ordering, a well-established technique in statistical machine translation. Otedama implements both a learner for pre-ordering rules and a component for applying these rules to parsed sentences. Our system is compatible with several external parsers and capable of accommodating many source and all target languages in any machine translation paradigm which uses parallel training data. We demonstrate improvements on a patent translation task over a state-of-the-art English-Japanese hierarchical phrase-based machine translation system. We compare Otedama with an existing syntax-based pre-ordering system, showing comparable translation performance at a runtime speedup of a factor of 4.5-10.

