Deep Learning Approach for the Morphological Synthesis in Malayalam and Tamil at the Character Level

Author(s):  
B. Premjith ◽  
K. P. Soman

Morphological synthesis is one of the main components of Machine Translation (MT) frameworks, especially when either or both of the source and target languages are morphologically rich. Morphological synthesis is the process of combining two words or two morphemes according to the Sandhi rules of the morphologically rich language. Malayalam and Tamil are two Indian languages that are both morphologically rich and agglutinative. Morphological synthesis of a word in these two languages is challenging for the following reasons: (1) abundance in morphology; (2) complex Sandhi rules; (3) the possibility in Malayalam of forming words by combining words that belong to different syntactic categories (for example, noun and verb); and (4) the construction of a sentence by combining multiple words. We formulated the task of the morphological generation of nouns and verbs of Malayalam and Tamil as a character-to-character sequence tagging problem. In this article, we used deep learning architectures such as Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM) networks, Gated Recurrent Unit (GRU), and their stacked and bidirectional versions to implement morphological synthesis at the character level. In addition, we investigated the performance of combining these architectures with the Conditional Random Field (CRF) in the morphological synthesis of nouns and verbs in Malayalam and Tamil. We observed that adding a CRF to the Bidirectional LSTM/GRU architecture achieved more than 99% accuracy in the morphological synthesis of Malayalam and Tamil nouns and verbs.

2021 ◽  
Vol 7 (2) ◽  
pp. 133
Author(s):  
Widi Hastomo ◽  
Adhitio Satyo Bayangkari Karno ◽  
Nawang Kalbuana ◽  
Ervina Nisfiani ◽  
Lussiana ETP

This study aims to improve accuracy by reducing the prediction error for data on 5 blue-chip stocks in Indonesia. It does so by combining designs of four-hidden-layer neural networks built from Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) layers. For each stock, an RMSE-versus-epoch plot is produced that shows the layer combination with the best accuracy, as follows: (a) BBCA with LSTM-GRU-LSTM-GRU layers (RMSE = 1120.651, e = 15); (b) BBRI with LSTM-GRU-LSTM-GRU layers (RMSE = 110.331, e = 25); (c) INDF with GRU-GRU-GRU-GRU layers (RMSE = 156.297, e = 35); (d) ASII with GRU-GRU-GRU-GRU layers (RMSE = 134.551, e = 20); (e) TLKM with GRU-LSTM-GRU-LSTM layers (RMSE = 71.658, e = 35). A challenge in processing data with Deep Learning (DL) is choosing the value of the epoch parameter that yields highly accurate predictions.


Atmosphere ◽  
2020 ◽  
Vol 11 (4) ◽  
pp. 348 ◽  
Author(s):  
Guang Yang ◽  
HwaMin Lee ◽  
Giyeol Lee

Both long- and short-term exposure to high concentrations of airborne particulate matter (PM) severely affect human health. Many countries now regulate PM concentrations. Early-warning systems based on PM concentration levels are urgently required to allow countermeasures that reduce harm and loss. Previous studies sought to establish accurate, efficient predictive models, and many machine-learning methods are used for air pollution forecasting. The long short-term memory and gated recurrent unit methods, both typical deep-learning methods, reliably predict PM levels, with some limitations. In this paper, the authors propose novel hybrid models that combine the strengths of two types of deep-learning methods. The authors compare the hybrid deep-learning methods (convolutional neural network (CNN) with long short-term memory (LSTM), and CNN with gated recurrent unit (GRU)) against several stand-alone methods (LSTM, GRU) in predicting PM concentrations at 39 stations in Seoul. Hourly air pollution data and meteorological data from January 2015 to December 2018 were used to train these models. The results of the experiment confirmed that the proposed prediction model could predict the PM concentrations for the next 7 days. In five randomly selected areas, the hybrid models outperformed the single models, with the lowest root mean square error (RMSE) and mean absolute error (MAE) values for both PM10 and PM2.5. For PM10 prediction in Gangnam, the RMSE is 1.688 and the MAE is 1.161. Among the hybrid models, CNN-GRU predicted PM10 better at all selected stations, while CNN-LSTM performed better at predicting PM2.5.
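The two error measures quoted in this abstract can be illustrated with a minimal sketch. It assumes the standard definitions of RMSE and MAE; the readings and forecasts below are made-up values for illustration, not data from the paper, and the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error of two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mae(actual, predicted):
    """Mean absolute error of two equal-length, non-empty sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute_errors = [abs(a - p) for a, p in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


# Hypothetical hourly PM10 readings and forecasts (illustrative only).
observed = [30.0, 32.0, 35.0, 31.0]
forecast = [29.0, 33.0, 33.0, 31.0]

print(rmse(observed, forecast))
print(mae(observed, forecast))
```

RMSE squares each error before averaging, so it is never smaller than MAE on the same data and it penalizes large misses more heavily. That is consistent with the reported RMSE (1.688) exceeding the reported MAE (1.161).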


2019 ◽  
Vol 26 (12) ◽  
pp. 1584-1591 ◽  
Author(s):  
Xue Shi ◽  
Yingping Yi ◽  
Ying Xiong ◽  
Buzhou Tang ◽  
Qingcai Chen ◽  
...  

Objective: Extracting clinical entities and their attributes is a fundamental task of natural language processing (NLP) in the medical domain. This task is typically recognized as 2 sequential subtasks in a pipeline, clinical entity or attribute recognition followed by entity-attribute relation extraction. One problem of pipeline methods is that errors from entity recognition are unavoidably passed to relation extraction. We propose a novel joint deep learning method to recognize clinical entities or attributes and extract entity-attribute relations simultaneously. Materials and Methods: The proposed method integrates 2 state-of-the-art methods for named entity recognition and relation extraction, namely bidirectional long short-term memory with conditional random field and bidirectional long short-term memory, into a unified framework. In this method, relation constraints between clinical entities and attributes and weights of the 2 subtasks are also considered simultaneously. We compare the method with other related methods (ie, pipeline methods and other joint deep learning methods) on an existing English corpus from SemEval-2015 and a newly developed Chinese corpus. Results: Our proposed method achieves the best F1 of 74.46% on entity recognition and the best F1 of 50.21% on relation extraction on the English corpus, and 89.32% and 88.13% on the Chinese corpora, respectively, which outperform the other methods on both tasks. Conclusions: The joint deep learning-based method could improve both entity recognition and relation extraction from clinical text in both English and Chinese, indicating that the approach is promising.


Energies ◽  
2021 ◽  
Vol 14 (20) ◽  
pp. 6599
Author(s):  
Halid Kaplan ◽  
Kambiz Tehrani ◽  
Mo Jamshidi

Diagnosing faults in electric vehicles (EVs) is a great challenge. The purpose of this paper is to demonstrate the detection of faults in an electromechanical conversion chain for conventional or autonomous EVs. Data from different sensors allow EVs to collect a range of measurements, including currents, voltages, and speeds. This information is processed to detect any faults in the electromechanical conversion chain. The novelty of this study is an architecture for a fault diagnosis model built on a feature extraction technique. To this end, a long short-term memory (LSTM) approach to fault diagnosis is proposed. This machine-learning-based approach has been tested on an EV prototype in practice and is more accurate than other fault diagnosis techniques. An EV in an urban context is modeled, and the fault diagnosis approach is then applied based on deep learning architectures. The EV and the fault diagnosis model are simulated in MATLAB. The paper also shows how deep learning contributes to the fault diagnosis of EVs. The simulation and practical results confirm that applying the LSTM yields higher accuracy in fault diagnosis.


Author(s):  
В.А. Мочалов ◽  
А.В. Мочалова

This work uses deep learning to forecast the values of the following geomagnetic indices (GI): Dst, Kp, AE, and Ap. Long short-term memory (LSTM) and gated recurrent unit (GRU) architectures are used for forecasting. For the various indices, the loss function is analyzed as a function of the sampling period of the source data. It is established that the shorter the sampling period of the source GI data, the more accurately the next GI value is forecast. The following sampling periods of the source GI data were used in the analysis: one hour, 3 hours, and one day.


2021 ◽  
Author(s):  
Yassine Touzani ◽  
Khadija Douzi

Forecasting stock prices is an extremely challenging job considering their high volatility and the number of variables that influence them (political, economic, social, etc.). Predicting the closing price provides useful information and helps the investor to make the right decision. The use of deep learning, and more precisely of recurrent neural networks (RNNs), in stock market forecasting is an increasingly common practice in the literature. The Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures are among the most widely used types of RNN, given their suitability for sequential data. In this paper, we propose a trading strategy designed for the Moroccan stock market, based on two deep learning models, LSTM and GRU, which predict the closing price for the short-term and mid-term horizons, respectively. Decision rules for buying and selling stocks are implemented based on the forecasts given by the two models. Then, over four three-year periods, we simulate transactions using these decision rules with different parameters for each stock. We hold only stocks that ensure a return greater than a benchmark return over the four periods. Random search is then used to choose one of the available parameters, and the performance of the portfolio built from the selected stocks is tested over a further period. Repeating this process while varying the benchmark return makes it possible to select the best possible combination of stocks, each with the parameters optimized for the decision rules. The proposed strategy produces very promising results and outperforms the indices used as benchmarks in the local market.


Author(s):  
Richa Sharma ◽  
Sudha Morwal ◽  
Basant Agarwal

This article presents a neural network-based approach to named entity recognition for Hindi text. The authors propose a deep learning architecture based on a convolutional neural network (CNN) and a bi-directional long short-term memory (Bi-LSTM) neural network. The skip-gram approach of the word2vec model is used in the proposed model to generate word vectors. Several deep learning models, namely recurrent neural network (RNN), long short-term memory (LSTM), and Bi-LSTM, have been developed and evaluated as baseline systems. These baseline systems are then extended into the proposed model by integrating CNN and conditional random field (CRF) layers. A comparative analysis of the results shows that the proposed model (i.e., Bi-LSTM-CNN-CRF) performs well. The proposed system achieves 61% precision, 56% recall, and 58% F-measure.
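The three figures at the end of this abstract can be cross-checked with a short sketch. It assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name is hypothetical.

```python
def f_measure(precision, recall, beta=1.0):
    """F-beta score; beta=1.0 gives the harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision == 0 and recall == 0:
        return 0.0  # conventional value when both inputs are zero
    beta_squared = beta * beta
    return (1 + beta_squared) * precision * recall / (beta_squared * precision + recall)


# Precision and recall as reported in the abstract (61% and 56%).
f1 = f_measure(0.61, 0.56)
print(round(f1, 2))  # 0.58
```

Under that assumption, the computed value rounds to the 58% stated in the abstract.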


In this article, we train neural networks based on deep learning architectures to classify images from the standard Fashion-MNIST and CIFAR-10 datasets. Various CNN-based and RNN-based classification architectures are trained and tested on these datasets. The CNN architectures use 1, 2, and 3 convolutional layers, and the RNN architectures use Long Short-Term Memory (LSTM) with one and two LSTM layers. Our models perform well on these standard benchmark datasets. Among the tested models, CNN1 shows greater accuracy on the Fashion-MNIST dataset, while CNN3, LSTM1, and LSTM2 perform better than the other models on the CIFAR-10 dataset.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4884
Author(s):  
Danish Javeed ◽  
Tianhan Gao ◽  
Muhammad Taimoor Khan ◽  
Ijaz Ahmad

The Internet of Things (IoT) has emerged as a new technological world connecting billions of devices. Despite its benefits, the heterogeneous nature and extensive connectivity of these devices make the IoT a target of various cyberattacks that result in data breaches and financial loss. There is a pressing need to secure the IoT environment from such attacks. In this paper, an SDN-enabled, deep-learning-driven framework is proposed for threat detection in an IoT environment. The state-of-the-art Cuda-deep neural network, gated recurrent unit (Cu-DNNGRU) and Cuda-bidirectional long short-term memory (Cu-BLSTM) classifiers are adopted for effective threat detection. We perform 10-fold cross-validation to show that the results are unbiased. The up-to-date, publicly available CICIDS2018 data set is used to train our hybrid model. The proposed scheme achieves an accuracy of 99.87% with a recall of 99.96%. Furthermore, we compare the proposed hybrid model with Cuda-Gated Recurrent Unit, Long Short-Term Memory (Cu-GRULSTM) and Cuda-Deep Neural Network, Long Short-Term Memory (Cu-DNNLSTM), as well as with existing benchmark classifiers. Our proposed mechanism achieves impressive results in terms of accuracy, F1-score, precision, speed efficiency, and other evaluation metrics.
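The cross-validation step mentioned in this abstract can be sketched as a generic index splitter. This is a plain shuffled k-fold under assumed settings; the abstract does not say whether the authors stratified by class or which tooling they used, and the function name and sample count are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for shuffled k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible

    # The first (n_samples % k) folds receive one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        folds.append(indices[start:start + size])
        start += size

    # Each fold serves as the test set exactly once; the rest form the training set.
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 25 samples split into 10 folds.
splits = list(k_fold_indices(25, k=10))
for train, test in splits:
    print(len(train), len(test))
```

Because every sample lands in exactly one test fold, averaging a metric over the 10 folds uses each sample for evaluation once, which is the usual reason for reporting k-fold results.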

