An Overview of the Basic NLP Resources Towards Building the Assamese-English Machine Translation

Proceedings of Intelligent Computing and Technologies Conference ◽

10.21467/proceedings.115.7 ◽

2021 ◽

Author(s):

Nibedita Roy ◽

Apurbalal Senapati

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Daily Life ◽

Point Of View ◽

Advanced Stage ◽

Computational Point ◽

Input Text ◽

Level Performance

Machine Translation (MT) is the process of automatically converting text in one natural language into another while preserving the meaning of the input text in the output. It is one of the classical problems in Natural Language Processing (NLP) and has wide application in daily life. Although MT research for English and some other languages is at a relatively advanced stage, for most languages it remains far from human-level performance. From a computational point of view, MT requires a large amount of preprocessing and many basic NLP tools and resources. This study gives an overview of the available basic NLP resources in the context of Assamese-English machine translation.

A Review and evaluation of Machine Translation methods for Lumasaaba

Journal of Digital Science ◽

10.33847/2686-8296.2.1_1 ◽

2020 ◽

pp. 3-17

Author(s):

Peter Nabende

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Research Area ◽

Data Driven ◽

East African ◽

Data Set ◽

African Languages ◽

Translation Methods

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing this gap in knowledge, this paper evaluates the application of well-established machine translation methods to one heavily under-resourced indigenous East African language, Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.
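
The abstract above compares models by BLEU score. As a rough illustration of what that metric measures, here is a minimal sketch of sentence-level BLEU with a single reference, uniform n-gram weights, and no smoothing. The function names and the whitespace tokenization are assumptions made for this example; the abstract does not say which BLEU implementation or settings the paper used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified, unsmoothed BLEU for one candidate and one reference.

    Assumption: both inputs are whitespace-tokenized strings.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            # Without smoothing, one n-gram order with no match gives a score of 0.
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


score = bleu("the cat sat on a mat", "the cat sat on the mat")
print(f"{score:.3f}")
```

Corpus-level BLEU, which is what papers usually report, pools n-gram counts over all sentences before taking the precisions, so it is not the average of these per-sentence scores.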

A Machine Learning Application for Raising WASH Awareness in the Times of COVID-19 Pandemic (Preprint)

10.2196/preprints.25320 ◽

2020 ◽

Cited By ~ 1

Author(s):

Rohan Pandey ◽

Vaibhav Gautam ◽

Ridam Pal ◽

Harsh Bandhey ◽

Lovedeep Singh Dhingra ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

User Feedback ◽

Who Guidelines ◽

The Times ◽

The Right ◽

Local Languages

BACKGROUND: The COVID-19 pandemic has uncovered the potential of digital misinformation in shaping the health of nations. The deluge of unverified information that spreads faster than the epidemic itself is an unprecedented phenomenon that has put millions of lives in danger. Mitigating this ‘Infodemic’ requires strong health messaging systems that are engaging, vernacular, scalable, and effective, and that continuously learn new patterns of misinformation. OBJECTIVE: We created WashKaro, a multi-pronged intervention for mitigating misinformation through conversational AI, machine translation and natural language processing. WashKaro provides the right information, matched against WHO guidelines through AI, and delivers it in the right format in local languages. METHODS: We theorize (i) an NLP-based AI engine that could continuously incorporate user feedback to improve the relevance of information, (ii) bite-sized audio in the local language to improve penetrance in a country with skewed gender literacy ratios, and (iii) conversational but interactive AI engagement with users towards increased health awareness in the community. RESULTS: A total of 5026 people downloaded the app during the study window, of whom 1545 were active users. Our study shows that 3.4 times more females than males engaged with the app in Hindi, the relevance of AI-filtered news content doubled within 45 days of continuous machine learning, and the prudence of the integrated AI chatbot “Satya” increased, thus proving the usefulness of an mHealth platform to mitigate health misinformation. CONCLUSIONS: We conclude that a multi-pronged machine learning application delivering vernacular bite-sized audio and conversational AI is an effective approach to mitigate health misinformation. CLINICALTRIAL: Not Applicable

On Application of Natural Language Processing in Machine Translation

2018 3rd International Conference on Mechanical, Control and Computer Engineering (ICMCCE) ◽

10.1109/icmcce.2018.00112 ◽

2018 ◽

Cited By ~ 3

Author(s):

Zhaorong Zong ◽

Changchun Hong

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing

Metrics for evaluating phonetics machine translation in Natural Language Processing through modified Edit Distance algorithm-A naïve approach

2015 International Conference on Computer Communication and Informatics (ICCCI) ◽

10.1109/iccci.2015.7218113 ◽

2015 ◽

Cited By ~ 1

Author(s):

M Hanumanthappa ◽

Rashmi S ◽

Mallamma V Reddy

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Edit Distance

Natural Language Processing

Annual Review of Applied Linguistics ◽

10.1017/s0267190500001446 ◽

1996 ◽

Vol 16 ◽

pp. 70-85 ◽

Cited By ~ 5

Author(s):

Thomas C. Rindflesch

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Computational Linguistics ◽

Language Processing ◽

Domain Knowledge ◽

The State ◽

Point Of View ◽

Computer Applications ◽

Significant Progress ◽

Future Directions

Work in computational linguistics began very soon after the development of the first computers (Booth, Brandwood and Cleave 1958), yet in the intervening four decades there has been a pervasive feeling that progress in computer understanding of natural language has not been commensurate with progress in other computer applications. Recently, a number of prominent researchers in natural language processing met to assess the state of the discipline and discuss future directions (Bates and Weischedel 1993). The consensus of this meeting was that increased attention to large amounts of lexical and domain knowledge was essential for significant progress, and current research efforts in the field reflect this point of view.

Biomedical Concept Recognition Using Deep Neural Sequence Models

10.1101/530337 ◽

2019 ◽

Cited By ~ 2

Author(s):

Negacy D. Hailu ◽

Michael Bada ◽

Asmelash Teka Hadgu ◽

Lawrence E. Hunter

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

State Of The Art ◽

Conditional Random Field ◽

Concept Recognition ◽

Performance Improvements ◽

Art Performance

Background: The automated identification of mentions of ontological concepts in natural language texts is a central task in biomedical information extraction. Despite more than a decade of effort, performance in this task remains below the level necessary for many applications. Results: Recently, applications of deep learning in natural language processing have demonstrated striking improvements over previously state-of-the-art performance in many related natural language processing tasks. Here we demonstrate similarly striking performance improvements in recognizing biomedical ontology concepts in full text journal articles using deep learning techniques originally developed for machine translation. For example, our best performing system improves the performance of the previous state-of-the-art in recognizing terms in the Gene Ontology Biological Process hierarchy, from a previous best F1 score of 0.40 to an F1 of 0.70, nearly halving the error rate. Nearly all other ontologies show similar performance improvements. Conclusions: A two-stage concept recognition system, which is a conditional random field model for span detection followed by a deep neural sequence model for normalization, improves the state-of-the-art performance for biomedical concept recognition. Treating the biomedical concept normalization task as a sequence-to-sequence mapping task similar to neural machine translation improves performance.
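
The claim that moving from an F1 of 0.40 to 0.70 nearly halves the error rate can be checked with a short calculation. The sketch below assumes that "error rate" means 1 − F1, which the abstract does not state explicitly; the function name is made up for this example.

```python
def relative_error_reduction(old_f1: float, new_f1: float) -> float:
    """Fraction by which the error shrinks, taking error to be 1 - F1 (an assumption)."""
    old_error = 1.0 - old_f1
    new_error = 1.0 - new_f1
    return (old_error - new_error) / old_error


# F1 values quoted in the abstract for the Gene Ontology Biological Process hierarchy.
reduction = relative_error_reduction(0.40, 0.70)
print(f"relative error reduction: {reduction:.2f}")
```

Under that reading the error falls from 0.60 to 0.30, a relative reduction of about one half.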

VNLP: Visible natural language processing

Information Visualization ◽

10.1177/14738716211038898 ◽

2021 ◽

pp. 147387162110388

Author(s):

Mohammad Alharbi ◽

Matthew Roach ◽

Tom Cheesman ◽

Robert S Laramee

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Black Box ◽

User Preferences ◽

Text Similarity ◽

Input Text ◽

Visually Based ◽

Pipeline Design

In general, Natural Language Processing (NLP) algorithms exhibit black-box behavior. Users input text and are given output with no explanation of how the results were obtained. To increase understanding and trust, users value transparent processing that explains derived results and enables understanding of the underlying routines. Many NLP tools are opaque by design and do not incorporate a means to steer and manipulate the intermediate NLP steps. We present an interactive, customizable, visual framework that enables users to observe and participate in the NLP pipeline processes, explicitly manipulate the parameters of each step, and explore the result visually based on user preferences. The visible NLP (VNLP) pipeline design is then applied to a text similarity application to demonstrate the utility and advantages of a visible and transparent NLP pipeline in helping users understand and justify both the process and the results. We also report feedback on our framework from a modern languages expert.

A Novel Natural Language Processing (NLP)–Based Machine Translation Model for English to Pakistan Sign Language Translation

Cognitive Computation ◽

10.1007/s12559-020-09731-7 ◽

2020 ◽

Vol 12 (4) ◽

pp. 748-765

Author(s):

Nabeel Sabir Khan ◽

Adnan Abid ◽

Kamran Abid

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Sign Language ◽

Machine Translation ◽

Language Processing ◽

Language Translation ◽

Translation Model

Élaboration d’outils méthodologiques pour décrire les prédicats du français

Lingvisticae Investigationes ◽

10.1075/li.30.2.04gre ◽

2007 ◽

Vol 30 (2) ◽

pp. 217-245 ◽

Cited By ~ 1

Author(s):

Aude Grezka ◽

Pierre-André Buvet

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Point Of View ◽

Theoretical Point ◽

Satisfactory Description

A satisfactory description of the predicates from the theoretical point of view is necessary when elaborating electronic dictionaries meant for natural language processing. We focus on the methodology and descriptors we use within the framework of the classes of objects. At the same time, we put forward complementary descriptors so as to deal with predicative polysemy. Our analysis is illustrated by means of perception predicates.

PhraseAttn: Dynamic Slot Capsule Networks for phrase representation in Neural Machine Translation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-212101 ◽

2021 ◽

pp. 1-8

Author(s):

Binh Nguyen ◽

Binh Le ◽

Long H.B. Nguyen ◽

Dien Dinh

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Machine Translation ◽

Language Processing ◽

Vital Role ◽

Attention Mechanism ◽

Neural Machine Translation ◽

Translation Model ◽

Word Representation

Word representation plays a vital role in most Natural Language Processing systems, especially in Neural Machine Translation. It tends to capture the semantics of, and similarity between, individual words well, but struggles to represent the meaning of phrases or multi-word expressions. In this paper, we investigate a method to generate and use phrase information in a translation model. To generate phrase representations, a Primary Phrase Capsule network is first employed and then iteratively enhanced with a Slot Attention mechanism. Experiments on the IWSLT English to Vietnamese, French, and German datasets show that our proposed method consistently outperforms the baseline Transformer and attains results competitive with the scaled Transformer using half as many parameters.
