A Review on Electronic Dictionary and Machine Translation System Developed in North-East India

Saiful Islam; Bipul Purkayastha

doi:10.13005/ojcst/10.02.25

Oriental journal of computer science and technology ◽

10.13005/ojcst/10.02.25 ◽

2017 ◽

Vol 10 (2) ◽

pp. 429-437 ◽

Cited By ~ 2

Author(s):

Saiful Islam ◽

Bipul Purkayastha

Keyword(s):

Language Learning ◽

Machine Translation ◽

Language Processing ◽

Human Life ◽

Translation System ◽

Natural Languages ◽

North East India ◽

North East ◽

Electronic Dictionary ◽

Machine Translation System

Electronic dictionaries and machine translation systems are among the most important language-learning tools for gaining knowledge of known and unknown natural languages. Natural languages are the most important means of communication in human life, so these two tools are important and frequently used in daily life. Electronic Dictionary (E-dictionary) and Machine Translation (MT) systems are especially helpful for students, research scholars, teachers, travellers and business people. E-dictionaries and MT are important applications and research tasks in Natural Language Processing (NLP). The demand for research on E-dictionary and MT systems is growing worldwide as well as in India. North-East (NE) India is a well-known, multilingual region of the country. Even so, only a small number of E-dictionary and MT systems have been developed for NE languages. This paper elaborates on the importance, approaches and features of E-dictionary and MT systems, and reviews the existing E-dictionary and MT systems developed for NE languages in NE India.

Direct Machine Translation System from Punjabi to Hindi for Newspapers headlines Domain

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v8i3.3402 ◽

2013 ◽

Vol 8 (3) ◽

pp. 908-912 ◽

Cited By ~ 1

Author(s):

Sumita Rani ◽

Dr. Vijay Luxmi

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Translation System ◽

Natural Languages ◽

Machine Translation System ◽

Common Parent

Machine translation is an important area of Natural Language Processing. A direct MT system exploits the syntactic and vocabulary similarities between more or less closely related natural languages. The relation between two or more languages is based upon their common parent language. The similarity between Punjabi and Hindi is due to their parent language, Sanskrit. Punjabi and Hindi are closely related languages with many similarities in syntax and vocabulary. In the present paper, a direct machine translation system from Punjabi to Hindi has been developed, and its output is evaluated to assess the suitability of the system.
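The direct approach described in this abstract can be illustrated with a minimal sketch, assuming a word-for-word lexicon lookup that keeps the source word order. The romanized lexicon entries and the names `TOY_LEXICON` and `direct_translate` are illustrative assumptions, not data or code from the paper.

```python
# Toy romanized Punjabi -> Hindi lexicon. The entries are illustrative
# assumptions and are not taken from the paper.
TOY_LEXICON = {
    "munda": "ladka",   # boy
    "kudi": "ladki",    # girl
    "changa": "accha",  # good
}


def direct_translate(sentence, lexicon):
    """Translate word by word, keeping the source word order.

    Words missing from the lexicon are passed through unchanged, which is
    only reasonable when the two languages share much of their vocabulary.
    """
    return " ".join(lexicon.get(word, word) for word in sentence.lower().split())


print(direct_translate("munda changa hai", TOY_LEXICON))
```

Keeping the word order and passing unknown words through is only plausible because the two languages are closely related; a real system would likely add morphological handling and transliteration between scripts.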

English-Dogri Translation System using MOSES

Circulation in Computer Science ◽

10.22632/ccs-2016-251-25 ◽

2016 ◽

Vol 1 (1) ◽

pp. 45-49

Author(s):

Avinash Singh ◽

Asmeet Kour ◽

Shubhnandan S. Jamwal

Keyword(s):

Natural Language Processing ◽

Machine Translation ◽

Language Processing ◽

Statistical Machine Translation ◽

Translation System ◽

Parallel Corpus ◽

English System ◽

Machine Translation System ◽

Translation Machine ◽

Language Pair

The objective of this paper is to analyze translation based on an English-Dogri parallel corpus. Machine translation is the automatic translation of text from one language into another, and it is one of the biggest applications of Natural Language Processing (NLP). Moses is a statistical machine translation system that allows translation models to be trained for any language pair. We have developed a translation system using a statistical approach that translates English to Dogri and vice versa. The parallel corpus consists of 98,973 sentences. The system achieves an accuracy of 80% in translating English to Dogri and 87% in translating Dogri to English.

Neural machine translation system for the Kazakh language based on synthetic corpora

MATEC Web of Conferences ◽

10.1051/matecconf/201925203006 ◽

2019 ◽

Vol 252 ◽

pp. 03006

Author(s):

Ualsher Tukeyev ◽

Aidana Karibayeva ◽

Balzhan Abduali

Keyword(s):

Machine Translation ◽

Training Data ◽

Translation System ◽

Natural Languages ◽

Neural Machine Translation ◽

Translation Quality ◽

Parallel Data ◽

Machine Translation System ◽

Turkic Languages

Large parallel data sets are lacking for the Kazakh language. This problem seriously impairs the quality of machine translation from and into Kazakh. This article considers neural machine translation of the Kazakh language on the basis of synthetic corpora. The Kazakh language belongs to the Turkic languages, which are characterised by rich morphology. Neural machine translation of natural languages requires large amounts of training data. The article presents a model for creating synthetic corpora, namely the generation of sentences based on complete suffixes for the Kazakh language. The novelty of this approach to synthetic corpus generation is that sentences are generated on the basis of the complete system of suffixes of the Kazakh language. Using the generated synthetic corpora improves translation quality in neural machine translation for the Kazakh-English and Kazakh-Russian pairs.

Evaluation of inkurdish Machine Translation System

Journal of University of Human Development ◽

10.21928/juhd.v3n2y2017.pp862-868 ◽

2017 ◽

Vol 3 (2) ◽

pp. 862

Author(s):

Kanaan Mikael Kaka-Khan ◽

Fatima Jalal Taher

Keyword(s):

Machine Translation ◽

Language Processing ◽

Translation System ◽

General Evaluation ◽

Machine Translation System ◽

Correct Translation

The lack of a perfect machine translation system for the Kurdish language is a huge gap in Kurdish natural language processing (KNLP). inKurdish is the first machine translation system for the Kurdish language and is capable of translating English sentences into Kurdish. Building the inKurdish machine translation system was a milestone for Kurdish language processing, but like any other translation system it has strengths as well as many shortcomings and issues. This paper evaluates the inKurdish machine translation system with respect to both linguistic and computational issues, which may help other researchers interested in this field. It evaluates inKurdish from different perspectives, such as giving the system uncommon words, sentences, phrases and paragraphs to check whether it provides the correct translation. A general evaluation can be made by collecting a valid sample of inputs with their machine translations and comparing these with the meanings of the words outside the machine.

Word Sense Disambiguation for Improving the Quality of Machine Translation

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.981.153 ◽

2014 ◽

Vol 981 ◽

pp. 153-156

Author(s):

Chun Xiang Zhang ◽

Long Deng ◽

Xue Yao Gao ◽

Li Li Guo

Keyword(s):

Machine Translation ◽

Language Processing ◽

Word Sense Disambiguation ◽

Ambiguous Word ◽

Translation System ◽

Word Sense ◽

Ambiguous Words ◽

Sense Disambiguation ◽

Machine Translation System

Word sense disambiguation is key to many application problems in natural language processing. In this paper, a specific word sense disambiguation classifier is introduced into a machine translation system in order to improve the quality of the output translation. Firstly, the translation of the ambiguous word is deleted from the machine translation of the Chinese sentence. Secondly, the ambiguous word is disambiguated, with the classification labels being translations of the ambiguous word. Thirdly, these two translations are combined. Fifty Chinese sentences containing ambiguous words are collected for test experiments. Experimental results show that the translation quality is improved after the proposed method is applied.
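The three steps in this abstract can be sketched as follows, under stated assumptions. Deleting the ambiguous word's translation is approximated by translating around a placeholder slot, and the classifier is replaced by a simple cue-word overlap count; the paper does not specify either detail here. The English-Spanish toy data and all names (`toy_mt`, `SENSE_CUES`, `translate_with_wsd`) are hypothetical stand-ins for the paper's Chinese sentences and classifier.

```python
SLOT = "<AMB>"

# Toy English -> Spanish data, standing in for the paper's Chinese test set.
TOY_LEXICON = {"river": "río", "money": "dinero"}
SENSE_CUES = {
    "orilla": {"river", "water"},  # "bank" as a riverside
    "banco": {"money", "loan"},    # "bank" as a financial institution
}


def toy_mt(words):
    """Stand-in base translator: lexicon lookup, unknown tokens pass through."""
    return [TOY_LEXICON.get(word, word) for word in words]


def choose_translation(context_words, sense_cues, default):
    """Pick the candidate translation whose cue words overlap the context most."""
    best, best_score = default, 0
    for translation, cues in sense_cues.items():
        score = len(cues & set(context_words))
        if score > best_score:
            best, best_score = translation, score
    return best


def translate_with_wsd(source_words, ambiguous_word, base_mt, sense_cues, default):
    # Step 1: translate the sentence with the ambiguous word left as a slot.
    masked = [SLOT if word == ambiguous_word else word for word in source_words]
    partial = base_mt(masked)
    # Step 2: disambiguate; the class label is itself a translation.
    context = [word for word in source_words if word != ambiguous_word]
    label = choose_translation(context, sense_cues, default)
    # Step 3: combine the two translations.
    return [label if word == SLOT else word for word in partial]


print(translate_with_wsd(["river", "bank"], "bank", toy_mt, SENSE_CUES, "banco"))
```

Treating the class labels as translations means the classifier output can be dropped straight into the slot, with no separate sense-to-translation mapping.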

Hindi Chhattisgarhi Machine Translation System Using Statistical Approach

Webology ◽

10.14704/web/v18si02/web18067 ◽

2021 ◽

Vol 18 (Special Issue 02) ◽

pp. 208-222

Author(s):

Vikas Pandey ◽

Dr.M.V. Padmavati ◽

Dr. Ramesh Kumar

Keyword(s):

Machine Translation ◽

Language Processing ◽

Statistical Approach ◽

Statistical Machine Translation ◽

Target Language ◽

Translation System ◽

Parallel Corpus ◽

Machine Translation System ◽

Unknown Words ◽

Language Pair

Machine Translation is a subfield of Natural Language Processing (NLP) that is used to translate a source language into a target language. In this paper, an attempt has been made to build a Hindi-Chhattisgarhi machine translation system based on a statistical approach. In the state of Chhattisgarh there is a long-awaited need for a Hindi to Chhattisgarhi machine translation system, especially for people who do not speak Chhattisgarhi. To develop the Hindi-Chhattisgarhi statistical machine translation system, an open-source software package called Moses is used. Moses is a statistical machine translation system and is used to automatically train the translation model for the Hindi-Chhattisgarhi language pair from a parallel corpus. A corpus is a collection of structured text used to study linguistic properties. This machine translation system works on a parallel corpus of 40,000 Hindi-Chhattisgarhi bilingual sentences. These sentences are extracted from various domains such as stories, novels, textbooks and newspapers. To overcome translation problems related to proper nouns and unknown words, a transliteration system is also embedded in it. The system was tested on 1000 sentences to check the grammatical correctness of the output, and an accuracy of 75% was achieved.

A Machine Translation System from Hindi to Sanskrit Language using Rule based Approach

Scalable Computing Practice and Experience ◽

10.12694/scpe.v21i3.1783 ◽

2020 ◽

Vol 21 (3) ◽

pp. 543-554

Author(s):

Neha Bhadwal ◽

Prateek Agrawal ◽

Vishu Madaan

Keyword(s):

Machine Translation ◽

Language Processing ◽

Translation System ◽

Linguistic Features ◽

Rule Based ◽

Pragmatic Analysis ◽

Machine Translation System ◽

Iot Devices ◽

And Mathematics ◽

Rule Based Approach

Machine Translation is an area of Natural Language Processing which can replace the laborious task of manual translation. Sanskrit is among the ancient Indo-Aryan languages. There are numerous works of art and literature in Sanskrit. It has also been a medium for treatises of philosophy as well as works on logic, astronomy and mathematics. Hindi, on the other hand, is the most prominent language of India. Moreover, it is among the most widely spoken languages in the world. This paper is an effort to bridge the language barrier between Hindi and Sanskrit so that any text in Hindi can be translated into Sanskrit. The technique used to achieve this objective is rule-based machine translation. The salient linguistic features of the two languages are used to perform the translation. The results are presented in the form of two confusion matrices, for which a total of 50 random sentences and 100 tokens (Hindi words or phrases) were taken for system evaluation. The semantic evaluation of the 100 tokens produces an accuracy of 94%, while the pragmatic analysis of the 50 sentences produces an accuracy of around 86%. Hence, the proposed system can be used to understand the whole translation process and can further be employed as a tool for learning as well as teaching. Further, this application can be embedded in local-communication-based assistive Internet of Things (IoT) devices such as Alexa or Google Assistant.

A Survey of Orthographic Information in Machine Translation

SN Computer Science ◽

10.1007/s42979-021-00723-4 ◽

2021 ◽

Vol 2 (4) ◽

Author(s):

Bharathi Raja Chakravarthi ◽

Priya Rani ◽

Mihael Arcan ◽

John P. McCrae

Keyword(s):

Machine Translation ◽

Language Processing ◽

Orthographic Knowledge ◽

Translation System ◽

Neural Machine Translation ◽

Machine Translation System ◽

Translation Methods ◽

Traditional Approaches ◽

Translation Systems ◽

Different Levels

Machine translation is one of the applications of natural language processing which has been explored in different languages. Recently researchers started paying attention towards machine translation for resource-poor languages and closely related languages. A widespread and underlying problem for these machine translation systems is the linguistic difference and variation in orthographic conventions which causes many issues for traditional approaches. Two languages written in two different orthographies are not easily comparable but orthographic information can also be used to improve the machine translation system. This article offers a survey of research regarding orthography’s influence on machine translation of under-resourced languages. It introduces under-resourced languages in terms of machine translation and how orthographic information can be utilised to improve machine translation. We describe previous work in this area, discussing what underlying assumptions were made, and showing how orthographic knowledge improves the performance of machine translation of under-resourced languages. We discuss different types of machine translation and demonstrate a recent trend that seeks to link orthographic information with well-established machine translation methods. Considerable attention is given to current efforts using cognate information at different levels of machine translation and the lessons that can be drawn from this. Additionally, multilingual neural machine translation of closely related languages is given a particular focus in this survey. This article ends with a discussion of the way forward in machine translation with orthographic information, focusing on multilingual settings and bilingual lexicon induction.

Positional and combinational characteristics of terms

Terminology ◽

10.1075/term.1.1.06nkw ◽

1994 ◽

Vol 1 (1) ◽

pp. 61-95 ◽

Cited By ~ 3

Author(s):

Blaise Nkwenti-Azeh

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Satellite Communications ◽

Translation System ◽

Machine Translation System ◽

Term Identification ◽

Identification Strategy ◽

Special Language

Special-language term formation is characterised, inter alia, by the frequent reuse of certain lexical items in the formation of new syntagmatic units and by conceptually motivated restrictions on the position which certain elements can occupy within a compound term. This paper describes how the positional and combinational features of the terminology of a given domain can be identified from relevant existing term lists and used as part of a corpus-based, automatic term-identification strategy within a natural-language processing (e.g., machine-translation) system. The methodology described is exemplified and supported with data from the field of satellite communications.

COMPREHENSIVE APPROACH FOR BILINGUAL MACHINE TRANSLATION

International Journal of Computer and Communication Technology ◽

10.47893/ijcct.2017.1413 ◽

2017 ◽

pp. 126-129

Author(s):

SHARANBASAPPA HONNASHETTY ◽

DR. M. HANUMANTHAPPA

Keyword(s):

Machine Translation ◽

Language Processing ◽

Translation System ◽

Major Focus ◽

Complex Sentences ◽

Novel Approach ◽

Machine Translation System ◽

Bilingual Corpora ◽

Simple Sentences ◽

Processing Group

Machine translation has been a major focus of the NLP group since 1999; the principal focus of the Natural Language Processing group is to build a machine translation system that automatically learns translation mappings from bilingual corpora. This paper explores a novel approach for phrase-based machine translation from English to Kannada and Kannada to English. The source text is analyzed; simple sentences are then translated using the rules, while complex sentences are first split into simple sentences and then translated.
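The split-then-translate step can be sketched as below, assuming that coordinating conjunctions mark clause boundaries. That assumption, the regular expression, and the function names are illustrative only; the abstract does not say how the paper detects or splits complex sentences, and `str.upper` merely stands in for a rule-based translator of simple sentences.

```python
import re

# Assumption: coordinating conjunctions mark clause boundaries. Real systems
# need parsing; this pattern is only a stand-in for the paper's analysis step.
CLAUSE_BOUNDARY = re.compile(r"\s+(?:and|but)\s+", flags=re.IGNORECASE)


def split_complex(sentence):
    """Split a sentence into simple clauses at the assumed boundaries."""
    parts = CLAUSE_BOUNDARY.split(sentence.strip().rstrip("."))
    return [part.strip() for part in parts if part.strip()]


def translate_sentence(sentence, translate_simple):
    """Translate each simple clause separately and return the pieces in order."""
    return [translate_simple(clause) for clause in split_complex(sentence)]


# str.upper stands in for a rule-based simple-sentence translator.
print(translate_sentence("The boy reads and the girl writes.", str.upper))
```

A sentence with no boundary comes back as a single clause, so simple and complex inputs go through the same code path.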
