Tonal Contour Generation for Isarn Speech Synthesis Using Deep Learning and Sampling-Based F0 Representation

The modeling of fundamental frequency (F0) in speech synthesis is a critical factor affecting the intelligibility and naturalness of synthesized speech. In this paper, we focus on improving the modeling of F0 for Isarn speech synthesis. We propose an F0 model for this task based on a recurrent neural network (RNN). F0 values sampled at the syllable level of continuous Isarn speech are combined with their dynamic features to represent supra-segmental properties of the F0 contour. Different architectures of deep RNNs and different combinations of linguistic features are analyzed to find the conditions that give the best performance. To assess the proposed method, we compared it with several RNN-based baselines. The results of objective and subjective tests indicate that the proposed model significantly outperformed both the baseline RNN model that predicts F0 values at the frame level and the baseline RNN model that represents the F0 contours of syllables using the discrete cosine transform.
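As a rough illustration of the sampling-based representation described in this abstract, the sketch below resamples each syllable's F0 track to a fixed number of points and appends first-order dynamic (delta) features computed across neighboring syllables. The function names, the choice of five samples per syllable, and the use of a simple gradient as the delta are assumptions made for illustration; the abstract does not state the authors' actual settings.

```python
import numpy as np

def sample_syllable_f0(f0_frames, n_samples=5):
    """Resample one syllable's frame-level F0 track to n_samples points.

    n_samples=5 is a hypothetical choice, not a value from the paper.
    """
    f0_frames = np.asarray(f0_frames, dtype=float)
    src = np.linspace(0.0, 1.0, num=len(f0_frames))  # normalized frame times
    dst = np.linspace(0.0, 1.0, num=n_samples)       # equally spaced sample times
    return np.interp(dst, src, f0_frames)

def add_dynamic_features(static):
    """Append first-order deltas taken across consecutive syllables.

    static has shape (n_syllables, n_samples); the delta here is a plain
    finite-difference gradient, which is an assumption.
    """
    static = np.asarray(static, dtype=float)
    delta = np.gradient(static, axis=0)
    return np.concatenate([static, delta], axis=1)

# Two syllables with made-up frame-level F0 values (Hz).
syllables = [[100.0, 110.0, 120.0], [120.0, 118.0, 112.0, 105.0]]
static = np.stack([sample_syllable_f0(s) for s in syllables])
features = add_dynamic_features(static)
print(features.shape)  # (2, 10): 5 static values plus 5 deltas per syllable
```

A fixed-length vector per syllable lets a syllable-level RNN predict the whole contour in one step per syllable instead of one step per frame.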

Automatic Segmentation and Classification of COVID-19 CT Image Using Deep Learning and Multi-Scale Recurrent Neural Network Based Classifier

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2021.3850 ◽

2021 ◽

Vol 11 (10) ◽

pp. 2618-2625

Author(s):

R. T. Subhalakshmi ◽

S. Appavu Alias Balamurugan ◽

S. Sasikala

Keyword(s):

Neural Network ◽

Feature Extraction ◽

Deep Learning ◽

Recurrent Neural Network ◽

Automatic Segmentation ◽

Ct Images ◽

Superior Performance ◽

Multi Scale ◽

Proposed Model ◽

Class Labels

The COVID-19 epidemic has recently escalated sharply, aggravated by the limited availability of rapid testing kits. Consequently, it is essential to develop automated techniques that detect COVID-19 from radiological images. The most common symptoms of COVID-19 are sore throat, fever, and dry cough, and they can progress to a severe form of pneumonia with serious complications. Although medical imaging is not currently recommended in Canada for primary COVID-19 diagnosis, computer-aided diagnosis systems might aid in the early detection of COVID-19 abnormalities, help monitor disease progression, and potentially reduce mortality rates. In this approach, a deep learning based design for feature extraction and classification is employed for automatic COVID-19 diagnosis from computed tomography (CT) images. The proposed model operates in three main stages: pre-processing, feature extraction, and classification. The proposed design incorporates the fusion of deep features using GoogLeNet models. Finally, a multi-scale recurrent neural network (RNN) based classifier is applied to assign the test CT images to distinct class labels. The proposed model is validated experimentally on the open-source COVID-CT dataset, which comprises a total of 760 CT images. The experimental results showed superior performance, with the highest sensitivity, specificity, and accuracy.
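For readers unfamiliar with the three metrics named at the end of this abstract, the following minimal sketch shows how sensitivity, specificity, and accuracy are computed from the counts of a binary confusion matrix. The function name and the example counts are hypothetical and are not results from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)  # overall fraction correct
    return sensitivity, specificity, accuracy

# Made-up counts, for illustration only.
sens, spec, acc = binary_metrics(tp=90, fp=20, tn=80, fn=10)
print(sens, spec, acc)
```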

Hybrid deep learning model using recurrent neural network and gated recurrent unit for heart disease prediction

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i6.pp5467-5476 ◽

2021 ◽

Vol 11 (6) ◽

pp. 5467

Author(s):

Surenthiran Krishnan ◽

Pritheega Magalingam ◽

Roslina Ibrahim

Keyword(s):

Neural Network ◽

Heart Disease ◽

Deep Learning ◽

Recurrent Neural Network ◽

Short Term Memory ◽

Learning Model ◽

Disease Prediction ◽

The Neural Network ◽

Proposed Model ◽

Deep Learning Model

This paper proposes a new hybrid deep learning model for heart disease prediction using a recurrent neural network (RNN) combined with multiple gated recurrent units (GRU), long short-term memory (LSTM), and the Adam optimizer. The proposed model achieved an accuracy of 98.6876%, which is the highest among existing RNN models. The model was developed in Python 3.7 by integrating the RNN with multiple GRUs, using Keras with TensorFlow as the backend for the deep learning process, supported by various Python libraries. Recent existing models using RNN have reached an accuracy of 98.23%, and a deep neural network (DNN) has reached 98.5%. The common drawbacks of the existing models are low accuracy due to the complex build-up of the neural network, a high number of neurons with redundancy in the neural network model, and the imbalanced Cleveland dataset. Experiments were conducted with various customized models, and the results showed that the proposed model using RNN and multiple GRUs with the synthetic minority oversampling technique (SMOTE) reached the best performance level. This is the highest accuracy result for RNN on the Cleveland dataset and is promising for early heart disease prediction for patients.
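The oversampling step mentioned in this abstract can be pictured with the simplified sketch below, which follows the core idea of the synthetic minority oversampling technique: each synthetic point is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbors. This is a NumPy-only illustration under assumed settings (function name, `k=3`, brute-force neighbor search), not the authors' pipeline or a library implementation.

```python
import numpy as np

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by neighbor interpolation.

    Requires at least two minority samples. k and seed are illustrative defaults.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = len(minority)
    k = min(k, n - 1)
    # Brute-force pairwise Euclidean distances between minority samples.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]
    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        base = rng.integers(n)                 # random minority sample
        nb = neighbors[base, rng.integers(k)]  # one of its k nearest neighbors
        lam = rng.random()                     # interpolation weight in [0, 1)
        synthetic[i] = minority[base] + lam * (minority[nb] - minority[base])
    return synthetic

# Made-up two-feature minority class.
minority = np.array([[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]])
new_points = smote_like_oversample(minority, n_new=6)
print(new_points.shape)  # (6, 2)
```

Because every synthetic point is a convex combination of two real samples, it always lies inside the bounding box of the original minority class.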

Automatic Synthesis Technology of Music Teaching Melodies Based on Recurrent Neural Network

Scientific Programming ◽

10.1155/2021/1704995 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Yingxue Zhang ◽

Zhe Li

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Music Teaching ◽

Acoustic Features ◽

Dynamic Features ◽

Automatic Synthesis ◽

Short Delay ◽

Music Score ◽

Proposed Model ◽

Score Model

Computer music creation boasts broad application prospects. It generally relies on artificial intelligence (AI) and machine learning (ML) to generate a music score that matches the original mono-symbol score model or to memorize and recognize the rhythms and beats of the music. However, there are very few music melody synthesis models based on artificial neural networks (ANNs). Some ANN-based models cannot adapt to the transposition invariance of the original rhythm training set. To overcome this defect, this paper tries to develop an automatic synthesis technology for music teaching melodies based on a recurrent neural network (RNN). First, a strategy was proposed to extract the acoustic features from the music melody. Next, the sequence-to-sequence model was adopted to synthesize general music melodies. After that, an RNN was established to synthesize music melody with singing melody, so as to find suitable singing segments for the music melody in a teaching scenario. The RNN can synthesize music melody with a short delay based solely on static acoustic features, eliminating the need for dynamic features. The proposed model was validated through experiments.

Audio-Based Drone Detection and Identification Using Deep Learning Techniques with Dataset Enhancement through Generative Adversarial Networks

Sensors ◽

10.3390/s21154953 ◽

2021 ◽

Vol 21 (15) ◽

pp. 4953

Author(s):

Sara Al-Emadi ◽

Abdulla Al-Ali ◽

Abdulaziz Al-Ali

Keyword(s):

Neural Network ◽

Deep Learning ◽

Recurrent Neural Network ◽

Learning Algorithms ◽

Generative Adversarial Networks ◽

Generative Adversarial Network ◽

Adversarial Networks ◽

Detection And Identification ◽

Learning Techniques ◽

The Impact

Drones are becoming increasingly popular not only for recreational purposes but in day-to-day applications in engineering, medicine, logistics, security and others. In addition to their useful applications, an alarming concern in regard to the physical infrastructure security, safety and privacy has arisen due to the potential of their use in malicious activities. To address this problem, we propose a novel solution that automates the drone detection and identification processes using a drone’s acoustic features with different deep learning algorithms. However, the lack of acoustic drone datasets hinders the ability to implement an effective solution. In this paper, we aim to fill this gap by introducing a hybrid drone acoustic dataset composed of recorded drone audio clips and artificially generated drone audio samples using a state-of-the-art deep learning technique known as the Generative Adversarial Network. Furthermore, we examine the effectiveness of using drone audio with different deep learning algorithms, namely, the Convolutional Neural Network, the Recurrent Neural Network and the Convolutional Recurrent Neural Network in drone detection and identification. Moreover, we investigate the impact of our proposed hybrid dataset in drone detection. Our findings prove the advantage of using deep learning techniques for drone detection and identification while confirming our hypothesis on the benefits of using the Generative Adversarial Networks to generate real-like drone audio clips with an aim of enhancing the detection of new and unfamiliar drones.

A Review of Plant Phenotypic Image Recognition Technology Based on Deep Learning

Electronics ◽

10.3390/electronics10010081 ◽

2021 ◽

Vol 10 (1) ◽

pp. 81

Author(s):

Jianbin Xiong ◽

Dezheng Yu ◽

Shuangyin Liu ◽

Lei Shu ◽

Xiaochan Wang ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Plant Species ◽

Image Recognition ◽

Recurrent Neural Network ◽

Plant Diseases ◽

Learning Methods ◽

Smart Agriculture ◽

Important Branch

Plant phenotypic image recognition (PPIR) is an important branch of smart agriculture. In recent years, deep learning has achieved significant breakthroughs in image recognition. Consequently, PPIR technology that is based on deep learning is becoming increasingly popular. First, this paper introduces the development and application of PPIR technology, followed by its classification and analysis. Second, it presents the theory of four types of deep learning methods and their applications in PPIR. These methods include the convolutional neural network, deep belief network, recurrent neural network, and stacked autoencoder, and they are applied to identify plant species, diagnose plant diseases, etc. Finally, the difficulties and challenges of deep learning in PPIR are discussed.

Detection of Malicious Software by Analyzing Distinct Artifacts Using Machine Learning and Deep Learning Algorithms

Electronics ◽

10.3390/electronics10141694 ◽

2021 ◽

Vol 10 (14) ◽

pp. 1694

Author(s):

Mathew Ashik ◽

A. Jyothish ◽

S. Anandaram ◽

P. Vinod ◽

Francesco Mercaldo ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Support Vector ◽

Malware Analysis ◽

Learning Approaches ◽

Dynamic Features ◽

System Calls ◽

Prevention Methods ◽

Structural Aspects

Malware is one of the most significant threats in today’s computing world since the number of websites distributing malware is increasing at a rapid rate. Malware analysis and prevention methods are increasingly becoming necessary for computer systems connected to the Internet. This software exploits the system’s vulnerabilities to steal valuable information without the user’s knowledge and stealthily sends it to remote servers controlled by attackers. Traditionally, anti-malware products use signatures for detecting known malware. However, the signature-based method does not scale in detecting obfuscated and packed malware. The cause of a problem is often best understood by studying the structural aspects of a program, such as mnemonics, instruction opcodes, and API calls. In this paper, we therefore investigate the relevance of features of unpacked malicious and benign executables, such as mnemonics, instruction opcodes, and API calls, to identify a feature that classifies the executable. Prominent features are extracted using Minimum Redundancy and Maximum Relevance (mRMR) and Analysis of Variance (ANOVA). Experiments were conducted on four datasets using machine learning approaches such as Support Vector Machine (SVM), Naïve Bayes, J48, Random Forest (RF), and XGBoost. In addition, we evaluate the performance of a collection of deep neural networks, namely a deep dense network, a One-Dimensional Convolutional Neural Network (1D-CNN), and a CNN-LSTM, in classifying unknown samples, and we observed promising results using APIs and system calls. On combining APIs/system calls with static features, a marginal performance improvement was attained compared with models trained only on dynamic features. Moreover, to improve accuracy, we implemented our solution using distinct deep learning methods and demonstrated a fine-tuned deep neural network that resulted in an F1-score of 99.1% and 98.48% on Dataset-2 and Dataset-3, respectively.

Recurrent neural networks with long term temporal dependencies in machine tool wear diagnosis and prognosis

SN Applied Sciences ◽

10.1007/s42452-021-04427-5 ◽

2021 ◽

Vol 3 (4) ◽

Author(s):

Jianlei Zhang ◽

Yukun Zeng ◽

Binil Starly

Keyword(s):

Neural Network ◽

Tool Wear ◽

Machine Tool ◽

Recurrent Neural Network ◽

Machine Tools ◽

Prediction Performance ◽

Sequential Data ◽

Diagnosis And Prognosis ◽

Proposed Model

Data-driven approaches for machine tool wear diagnosis and prognosis have been gaining attention in the past few years. The goal of our study is to advance the adaptability, flexibility, prediction performance, and prediction horizon for online monitoring and prediction. This paper proposes the use of a recent deep learning method based on the Gated Recurrent Neural Network architecture, including Long Short Term Memory (LSTM), which tries to capture longer-term dependencies than the regular Recurrent Neural Network method for modeling sequential data, together with a mechanism to realize online diagnosis and prognosis and remaining useful life (RUL) prediction from indirect measurements collected during the manufacturing process. Existing models are usually tool-specific and can hardly be generalized to other scenarios, such as different tools or operating environments. Unlike current methods, the proposed model requires no prior knowledge about the system and thus can be generalized to different scenarios and machine tools. With inherent memory units, the proposed model can also capture long-term dependencies while learning from sequential data such as those collected by condition monitoring sensors, which means it can accommodate machine tools with varying life and increase the prediction performance. To prove the validity of the proposed approach, we conducted multiple experiments on a milling machine cutting tool and applied the model for online diagnosis and RUL prediction. Without loss of generality, we incorporated a system transition function and a system observation function into the neural net and trained it with signal data from a minimally intrusive vibration sensor. The experimental results showed that our LSTM-based model achieved the best overall accuracy among the compared methods, with the lowest Mean Square Error (MSE) for both tool wear prediction and RUL prediction.

Ensemble Empirical Mode Decomposition with Adaptive Noise with Convolution Based Gated Recurrent Neural Network: A New Deep Learning Model for South Asian High Intensity Forecasting

Symmetry ◽

10.3390/sym13060931 ◽

2021 ◽

Vol 13 (6) ◽

pp. 931

Author(s):

Kecheng Peng ◽

Xiaoqun Cao ◽

Bainian Liu ◽

Yanan Guo ◽

Wenlong Tian

Keyword(s):

Neural Network ◽

Time Series ◽

Deep Learning ◽

South Asian ◽

Recurrent Neural Network ◽

Ensemble Empirical Mode Decomposition ◽

South Asian High ◽

Mode Decomposition ◽

Adaptive Noise ◽

Deep Learning Model

The intensity variation of the South Asian high (SAH) plays an important role in the formation and extinction of many kinds of mesoscale systems, including tropical cyclones and southwest vortices in the Asian summer monsoon (ASM) region, and in the precipitation over the whole Asia-Europe region. The SAH has a symmetrical vortex structure, and its dynamic field is also symmetrical in form. Few previous studies have focused on the variation of SAH daily intensity. The purpose of this study is to establish a day-to-day prediction model of the SAH intensity that can accurately predict not only the interannual variation but also the day-to-day variation of the SAH. Focusing on the summer period when the SAH is strongest, this paper selects the geopotential height data between 1948 and 2020 from NCEP to construct the SAH intensity datasets. After comparison with classical deep learning methods for time series prediction, we ultimately combine the Ensemble Empirical Mode Decomposition with Adaptive Noise (CEEMDAN) method, which can deal with a nonlinear and unstable single system, with the Permutation Entropy (PE) method, which can extract the SAH intensity features of the IMFs decomposed by CEEMDAN. The Convolution-based Gated Recurrent Neural Network (ConvGRU) model is then used to train, test, and predict the intensity of the SAH. The prediction results show that the combination of CEEMDAN and ConvGRU achieves higher accuracy and more stable prediction ability than the traditional deep learning model. After the redundant features in the time series are removed, the prediction accuracy of the SAH intensity is higher than that of the classical model, which shows that the method has good applicability for the prediction of nonlinear systems in the atmosphere.
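The permutation entropy (PE) measure named in this abstract can be sketched as follows: slide a window over the series, record the rank order of the values in each window, and take the Shannon entropy of the resulting ordinal-pattern distribution. The function name and the default `order` and `delay` are assumptions for illustration; the abstract does not give the settings used in the study, and this sketch does not include the CEEMDAN decomposition itself.

```python
import math
from collections import Counter

def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Shannon entropy (bits) of ordinal patterns of length `order`."""
    n_windows = len(series) - (order - 1) * delay
    if n_windows <= 0:
        raise ValueError("series is too short for the given order and delay")
    counts = Counter()
    for i in range(n_windows):
        window = [series[i + j * delay] for j in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        counts[pattern] += 1
    entropy = -sum((c / n_windows) * math.log2(c / n_windows) for c in counts.values())
    if normalize:
        entropy /= math.log2(math.factorial(order))  # scale to the range [0, 1]
    return entropy

# A monotonic series has a single ordinal pattern, so its entropy is zero.
print(permutation_entropy(list(range(10))) == 0.0)  # True
```

Low PE values indicate a regular component and values near 1 indicate a noise-like one, which is why PE is a natural way to characterize decomposed components of a signal.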

Developing an Individual Glucose Prediction Model Using Recurrent Neural Network

Sensors ◽

10.3390/s20226460 ◽

2020 ◽

Vol 20 (22) ◽

pp. 6460

Author(s):

Dae-Yeon Kim ◽

Dong-Sik Choi ◽

Jaeyun Kim ◽

Sung Wan Chun ◽

Hyo-Wook Gil ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Prediction Model ◽

Glucose Level ◽

Recurrent Neural Network ◽

Learning Algorithm ◽

Percentage Error ◽

Glucose Prediction

In this study, we propose a personalized glucose prediction model using deep learning for hospitalized patients with Type-2 diabetes. We aim for our model to assist the medical personnel who check the blood glucose and control insulin doses. Herein, we employed a deep learning algorithm, specifically a recurrent neural network (RNN), that consists of a sequence processing layer and a classification layer for the glucose prediction. We tested a simple RNN, a gated recurrent unit (GRU), and a long short-term memory (LSTM) and varied the architectures to determine the one with the best performance. For that, we collected data for a week using a continuous glucose monitoring device. Type-2 inpatients usually have poor health conditions and a high variability of glucose level. However, there are few studies on Type-2 glucose prediction models, while many studies have been performed on Type-1 glucose prediction. The contribution of this work is that the proposed model exhibits performance comparable to that of previous works on Type-1 patients. For 20 in-hospital patients, we achieved an average root mean squared error (RMSE) of 21.5 and a mean absolute percentage error (MAPE) of 11.1%. The GRU with a single RNN layer and two dense layers was found to be sufficient to predict the glucose level. Moreover, to build a personalized model, at most 50% of the data are required for training.
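The two error measures reported in this abstract can be computed as in the minimal sketch below. The function names and the glucose readings are made up for illustration and are not data from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the inputs."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical glucose readings and predictions.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(rmse(actual, predicted))  # about 15.81
print(mape(actual, predicted))  # about 10.0 (percent)
```

RMSE penalizes large misses more heavily, while MAPE expresses the error relative to each reading, so the two are usually reported together.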

An investigation of recurrent neural network architectures for statistical parametric speech synthesis

10.21437/interspeech.2015-266 ◽

2015 ◽

Author(s):

Sivanand Achanta ◽

Tejas Godambe ◽

Suryakanth V. Gangashetty

Keyword(s):

Neural Network ◽

Recurrent Neural Network ◽

Speech Synthesis ◽

Network Architectures ◽

Statistical Parametric Speech Synthesis ◽

Parametric Speech Synthesis ◽

Neural Network Architectures
