Comparative Analysis of Language Translation and Detection System Using Machine Learning

Aishwarya R. Verma

doi:10.22214/ijraset.2021.37577

Comparative Analysis of Language Translation and Detection System Using Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.37577 ◽

2021 ◽

Vol 9 (8) ◽

pp. 1200-1211

Author(s):

Aishwarya R. Verma

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Machine Translation ◽

Language Processing ◽

Short Term Memory ◽

Detection System ◽

English Text ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Abstract: Words are the meaty component which can be expressed through speech, writing or signals. It is important that the actual message or meaning of the words sent must conveys the same meaning to the one receives. The evolution from manual language translator to the digital machine translation have helped us a lot for finding the exact meaning such that each word must give at least close to exact actual meaning. To make machine translator more human-friendly feeling, natural language processing (NLP) with machine learning (ML) can make the best combination. The main challenges in machine translated sentence can involve ambiguities, lexical divergence, syntactic, lexical mismatches, semantic issues, etc. which can be seen in grammar, spellings, punctuations, spaces, etc. After analysis on different algorithms, we have implemented a two different machine translator using two different Long Short-Term Memory (LSTM) approaches and performed the comparative study of the quality of the translated text based on their respective accuracy. We have used two different training approaches of encodingdecoding techniques using same datasets, which translates the source English text to the target Hindi text. To detect the text entered is English or Hindi language, we have used Sequential LSTM training model for which the analysis has been performed based on its accuracy. As the result, the first LSTM trained model is 84% accurate and the second LSTM trained model is 71% accurate in its translation from English to Hindi text, while the detection LSTM trained model is 78% accurate in detecting English text and 81% accurate in detecting Hindi text. This study has helped us to analyze the appropriate machine translation based on its accuracy. Keywords: Accuracy, Decoding, Machine Learning (ML), Detection System, Encoding, Long Short-Term Memory (LSTM), Machine Translation, Natural Language Processing (NLP), Sequential

Download Full-text

COVID-19 ChatBot

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38757 ◽

2021 ◽

Vol 9 (11) ◽

pp. 44-49

Author(s):

Satish Tirumalapudi

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

And Control ◽

Prediction Problems

Abstract: Chat bots are software applications that help users to communicate with the machine and get the required result, this is where Natural Language Processing (NLP) comes into the picture. Natural language processing is based on deep learning that enables computers to acquire meaning from inputs given by the users. Natural language processing techniques can make possible the use of natural language to express ideas, thus drastically increasing accessibility. NLP engines rely on the elements of intent, utterance, entity, context, and session. Here in this project, we will be using Deep learning techniques which will be trained on the dataset which contains categories, patterns, and responses. Long Short-Term Memory (LSTM) is a Recurrent Neural Network that is capable of learning order dependence in sequence prediction problems. One of the most popular RNN approaches is LSTM to identify and control a dynamic system. We use an RNN to classify the category user’s message belongs to and then will give a response from the list of responses. Keywords: NLP – Natural Language Processing, LSTM – Long Short Term Memory, RNN – Recurrent Neural Networks.

Download Full-text

Penerapan Convolutional Long Short-Term Memory untuk Klasifikasi Teks Berita Bahasa Indonesia

Jurnal Nasional Teknik Elektro dan Teknologi Informasi (JNTETI) ◽

10.22146/jnteti.v10i4.2438 ◽

2021 ◽

Vol 10 (4) ◽

pp. 354-361

Author(s):

Yudi Widhiyasana ◽

Transmissia Semiawan ◽

Ilham Gibran Achmad Mudzakir ◽

Muhammad Randi Noor

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Learning Rate ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Bahasa Indonesia

Klasifikasi teks saat ini telah menjadi sebuah bidang yang banyak diteliti, khususnya terkait Natural Language Processing (NLP). Terdapat banyak metode yang dapat dimanfaatkan untuk melakukan klasifikasi teks, salah satunya adalah metode deep learning. RNN, CNN, dan LSTM merupakan beberapa metode deep learning yang umum digunakan untuk mengklasifikasikan teks. Makalah ini bertujuan menganalisis penerapan kombinasi dua buah metode deep learning, yaitu CNN dan LSTM (C-LSTM). Kombinasi kedua metode tersebut dimanfaatkan untuk melakukan klasifikasi teks berita bahasa Indonesia. Data yang digunakan adalah teks berita bahasa Indonesia yang dikumpulkan dari portal-portal berita berbahasa Indonesia. Data yang dikumpulkan dikelompokkan menjadi tiga kategori berita berdasarkan lingkupnya, yaitu “Nasional”, “Internasional”, dan “Regional”. Dalam makalah ini dilakukan eksperimen pada tiga buah variabel penelitian, yaitu jumlah dokumen, ukuran batch, dan nilai learning rate dari C-LSTM yang dibangun. Hasil eksperimen menunjukkan bahwa nilai F1-score yang diperoleh dari hasil klasifikasi menggunakan metode C-LSTM adalah sebesar 93,27%. Nilai F1-score yang dihasilkan oleh metode C-LSTM lebih besar dibandingkan dengan CNN, dengan nilai 89,85%, dan LSTM, dengan nilai 90,87%. Dengan demikian, dapat disimpulkan bahwa kombinasi dua metode deep learning, yaitu CNN dan LSTM (C-LSTM),memiliki kinerja yang lebih baik dibandingkan dengan CNN dan LSTM.

Download Full-text

Natural language processing and deep learning chatbot using long short term memory algorithm

Materials Today Proceedings ◽

10.1016/j.matpr.2021.04.154 ◽

2021 ◽

Author(s):

E. Kasthuri ◽

S. Balaji

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

Download Full-text

Extracting medications and associated adverse drug events using a natural language processing system combining knowledge base and deep learning

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocz141 ◽

2019 ◽

Vol 27 (1) ◽

pp. 56-64 ◽

Cited By ~ 9

Author(s):

Long Chen ◽

Yu Gu ◽

Xin Ji ◽

Zhiyong Sun ◽

Haodan Li ◽

...

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Language Processing ◽

Adverse Drug Events ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Related Information ◽

Long Short Term Memory ◽

Medical Concepts

Abstract Objective Detecting adverse drug events (ADEs) and medications related information in clinical notes is important for both hospital medical care and medical research. We describe our clinical natural language processing (NLP) system to automatically extract medical concepts and relations related to ADEs and medications from clinical narratives. This work was part of the 2018 National NLP Clinical Challenges Shared Task and Workshop on Adverse Drug Events and Medication Extraction. Materials and Methods The authors developed a hybrid clinical NLP system that employs a knowledge-based general clinical NLP system for medical concepts extraction, and a task-specific deep learning system for relations identification using attention-based bidirectional long short-term memory networks. Results The systems were evaluated as part of the 2018 National NLP Clinical Challenges challenge, and our attention-based bidirectional long short-term memory networks based system obtained an F-measure of 0.9442 for relations identification task, ranking fifth at the challenge, and had <2% difference from the best system. Error analysis was also conducted targeting at figuring out the root causes and possible approaches for improvement. Conclusions We demonstrate the generic approaches and the practice of connecting general purposed clinical NLP system to task-specific requirements with deep learning methods. Our results indicate that a well-designed hybrid NLP system is capable of ADE and medication-related information extraction, which can be used in real-world applications to support ADE-related researches and medical decisions.

Download Full-text

A COMBINED DEEP LEARNING MODEL FOR PERSIAN SENTIMENT ANALYSIS

IIUM Engineering Journal ◽

10.31436/iiumej.v20i1.1036 ◽

2019 ◽

Vol 20 (1) ◽

pp. 129-139 ◽

Cited By ~ 2

Author(s):

Zahra Bokaee Nezhad ◽

Mohammad Ali Deihimi

Keyword(s):

Deep Learning ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Proposed Model ◽

Long Short Term Memory ◽

Deep Learning Model

With increasing members in social media sites today, people tend to share their views about everything online. It is a convenient way to convey their messages to end users on a specific subject. Sentiment Analysis is a subfield of Natural Language Processing (NLP) that refers to the identification of users’ opinions toward specific topics. It is used in several fields such as marketing, customer services, etc. However, limited works have been done on Persian Sentiment Analysis. On the other hand, deep learning has recently become popular because of its successful role in several Natural Language Processing tasks. The objective of this paper is to propose a novel hybrid deep learning architecture for Persian Sentiment Analysis. According to the proposed model, local features are extracted by Convolutional Neural Networks (CNN) and long-term dependencies are learned by Long Short Term Memory (LSTM). Therefore, the model can harness both CNN's and LSTM's abilities. Furthermore, Word2vec is used for word representation as an unsupervised learning step. To the best of our knowledge, this is the first attempt where a hybrid deep learning model is used for Persian Sentiment Analysis. We evaluate the model on a Persian dataset that is introduced in this study. The experimental results show the effectiveness of the proposed model with an accuracy of 85%. ABSTRAK: Hari ini dengan ahli yang semakin meningkat di laman media sosial, orang cenderung untuk berkongsi pandangan mereka tentang segala-galanya dalam talian. Ini adalah cara mudah untuk menyampaikan mesej mereka kepada pengguna akhir mengenai subjek tertentu. Analisis Sentimen adalah subfield Pemprosesan Bahasa Semula Jadi yang merujuk kepada pengenalan pendapat pengguna ke arah topik tertentu. Ia digunakan dalam beberapa bidang seperti pemasaran, perkhidmatan pelanggan, dan sebagainya. Walau bagaimanapun, kerja-kerja terhad telah dilakukan ke atas Analisis Sentimen Parsi. Sebaliknya, pembelajaran mendalam baru menjadi popular kerana peranannya yang berjaya dalam beberapa tugas Pemprosesan Bahasa Asli (NLP). Objektif makalah ini adalah mencadangkan senibina pembelajaran hibrid yang baru dalam Analisis Sentimen Parsi. Menurut model yang dicadangkan, ciri-ciri tempatan ditangkap oleh Rangkaian Neural Convolutional (CNN) dan ketergantungan jangka panjang dipelajari oleh Long Short Term Memory (LSTM). Oleh itu, model boleh memanfaatkan kebolehan CNN dan LSTM. Selain itu, Word2vec digunakan untuk perwakilan perkataan sebagai langkah pembelajaran tanpa pengawasan. Untuk pengetahuan yang terbaik, ini adalah percubaan pertama di mana model pembelajaran mendalam hibrid digunakan untuk Analisis Sentimen Persia. Kami menilai model pada dataset Persia yang memperkenalkan dalam kajian ini. Keputusan eksperimen menunjukkan keberkesanan model yang dicadangkan dengan ketepatan 85%.

Download Full-text

Hoax analyzer for Indonesian news using RNNs with fasttext and glove embeddings

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v10i4.2956 ◽

2021 ◽

Vol 10 (4) ◽

pp. 2130-2136

Author(s):

Ryan Adipradana ◽

Bagas Pradipabista Nayoga ◽

Ryan Suryadi ◽

Derwin Suhartono

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

High Resource ◽

Long Short Term Memory ◽

Linguistic Approach ◽

Gated Recurrent Unit

Misinformation has become an innocuous yet potentially harmful problem ever since the development of internet. Numbers of efforts are done to prevent the consumption of misinformation, including the use of artificial intelligence (AI), mainly natural language processing (NLP). Unfortunately, most of natural language processing use English as its linguistic approach since English is a high resource language. On the contrary, Indonesia language is considered a low resource language thus the amount of effort to diminish consumption of misinformation is low compared to English-based natural language processing. This experiment is intended to compare fastText and GloVe embeddings for four deep neural networks (DNN) models: long short-term memory (LSTM), bidirectional long short-term memory (BI-LSTM), gated recurrent unit (GRU) and bidirectional gated recurrent unit (BI-GRU) in terms of metrics score when classifying news between three classes: fake, valid, and satire. The latter results show that fastText embedding is better than GloVe embedding in supervised text classification, along with BI-GRU + fastText yielding the best result.

Download Full-text

Prediction of Head Movement in 360-Degree Videos Using Attention Model

Sensors ◽

10.3390/s21113678 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3678

Author(s):

Dongwon Lee ◽

Minji Choi ◽

Joohyun Lee

Keyword(s):

Machine Learning ◽

Short Term Memory ◽

Moving Average ◽

The Other ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Attention Model ◽

Long Short Term Memory ◽

Machine Learning Models

In this paper, we propose a prediction algorithm, the combination of Long Short-Term Memory (LSTM) and attention model, based on machine learning models to predict the vision coordinates when watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting the vision coordinates while video streaming is important when the network condition is degraded. However, the traditional prediction models such as Moving Average (MA) and Autoregression Moving Average (ARMA) are linear so they cannot consider the nonlinear relationship. Therefore, machine learning models based on deep learning are recently used for nonlinear predictions. We use the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural network methods, originated in Recurrent Neural Networks (RNN), and predict the head position in the 360-degree videos. Therefore, we adopt the attention model to LSTM to make more accurate results. We also compare the performance of the proposed model with the other machine learning models such as Multi-Layer Perceptron (MLP) and RNN using the root mean squared error (RMSE) of predicted and real coordinates. We demonstrate that our model can predict the vision coordinates more accurately than the other models in various videos.

Download Full-text

Random forest and long short-term memory based machine learning models for classification of ion mobility spectrometry spectra

Chemical, Biological, Radiological, Nuclear, and Explosives (CBRNE) Sensing XXII ◽

10.1117/12.2585829 ◽

2021 ◽

Author(s):

Patrick C. Riley ◽

Samir V. Deshpande ◽

Brian S. Ince ◽

Brian C. Hauck ◽

Kyle P. O'Donnell ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Ion Mobility ◽

Short Term Memory ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Machine Learning Models

Download Full-text

Hybrid Machine Translation with Multi-Source Encoder-Decoder Long Short-Term Memory in English-Malay Translation

International Journal on Advanced Science Engineering and Information Technology ◽

10.18517/ijaseit.8.4-2.6816 ◽

2018 ◽

Vol 8 (4-2) ◽

pp. 1446

Author(s):

Yin-Lai Yeong ◽

Tien-Ping Tan ◽

Keng Hoon Gan ◽

Siti Khaotijah Mohammad

Keyword(s):

Machine Translation ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Hybrid Machine ◽

Hybrid Machine Translation

Download Full-text

The use of machine translation algorithm based on residual and LSTM neural network in translation teaching

PLoS ONE ◽

10.1371/journal.pone.0240663 ◽

2020 ◽

Vol 15 (11) ◽

pp. e0240663

Author(s):

Beibei Ren

Keyword(s):

Neural Network ◽

Deep Learning ◽

Machine Translation ◽

Short Term Memory ◽

Short Term ◽

Term Memory ◽

Translation Model ◽

Translation Quality ◽

Long Short Term Memory ◽

Deep Learning Neural Network

With the rapid development of big data and deep learning, breakthroughs have been made in phonetic and textual research, the two fundamental attributes of language. Language is an essential medium of information exchange in teaching activity. The aim is to promote the transformation of the training mode and content of translation major and the application of the translation service industry in various fields. Based on previous research, the SCN-LSTM (Skip Convolutional Network and Long Short Term Memory) translation model of deep learning neural network is constructed by learning and training the real dataset and the public PTB (Penn Treebank Dataset). The feasibility of the model’s performance, translation quality, and adaptability in practical teaching is analyzed to provide a theoretical basis for the research and application of the SCN-LSTM translation model in English teaching. The results show that the capability of the neural network for translation teaching is nearly one times higher than that of the traditional N-tuple translation model, and the fusion model performs much better than the single model, translation quality, and teaching effect. To be specific, the accuracy of the SCN-LSTM translation model based on deep learning neural network is 95.21%, the degree of translation confusion is reduced by 39.21% compared with that of the LSTM (Long Short Term Memory) model, and the adaptability is 0.4 times that of the N-tuple model. With the highest level of satisfaction in practical teaching evaluation, the SCN-LSTM translation model has achieved a favorable effect on the translation teaching of the English major. In summary, the performance and quality of the translation model are improved significantly by learning the language characteristics in translations by teachers and students, providing ideas for applying machine translation in professional translation teaching.

Download Full-text