Deep Sparse Learning for Automatic Modulation Classification Using Recurrent Neural Networks

Deep learning models, especially recurrent neural networks (RNNs), have been successfully applied to automatic modulation classification (AMC) problems recently. However, deep neural networks are usually overparameterized, i.e., most of the connections between neurons are redundant. The large model size hinders the deployment of deep neural networks in applications such as Internet-of-Things (IoT) networks. Therefore, reducing parameters without compromising the network performance via sparse learning is often desirable since it can alleviates the computational and storage burdens of deep learning models. In this paper, we propose a sparse learning algorithm that can directly train a sparsely connected neural network based on the statistics of weight magnitude and gradient momentum. We first used the MNIST and CIFAR10 datasets to demonstrate the effectiveness of this method. Subsequently, we applied it to RNNs with different pruning strategies on recurrent and non-recurrent connections for AMC problems. Experimental results demonstrated that the proposed method can effectively reduce the parameters of the neural networks while maintaining model performance. Moreover, we show that appropriate sparsity can further improve network generalization ability.

Download Full-text

Examining Attention Mechanisms in Deep Learning Models for Sentiment Analysis

Applied Sciences ◽

10.3390/app11093883 ◽

2021 ◽

Vol 11 (9) ◽

pp. 3883

Author(s):

Spyridon Kardakis ◽

Isidoros Perikos ◽

Foteini Grivokostopoulou ◽

Ioannis Hatzilygeroudis

Keyword(s):

Neural Networks ◽

Experimental Study ◽

Deep Learning ◽

Comparative Analysis ◽

Sentiment Analysis ◽

Recurrent Neural Networks ◽

Deep Neural Networks ◽

Training Methods ◽

Learning Models ◽

Neural Models

Attention-based methods for deep neural networks constitute a technique that has attracted increased interest in recent years. Attention mechanisms can focus on important parts of a sequence and, as a result, enhance the performance of neural networks in a variety of tasks, including sentiment analysis, emotion recognition, machine translation and speech recognition. In this work, we study attention-based models built on recurrent neural networks (RNNs) and examine their performance in various contexts of sentiment analysis. Self-attention, global-attention and hierarchical-attention methods are examined under various deep neural models, training methods and hyperparameters. Even though attention mechanisms are a powerful recent concept in the field of deep learning, their exact effectiveness in sentiment analysis is yet to be thoroughly assessed. A comparative analysis is performed in a text sentiment classification task where baseline models are compared with and without the use of attention for every experiment. The experimental study additionally examines the proposed models’ ability in recognizing opinions and emotions in movie reviews. The results indicate that attention-based models lead to great improvements in the performance of deep neural models showcasing up to a 3.5% improvement in their accuracy.

Download Full-text

Levenshtein Augmentation Improves Performance of SMILES Based Deep-Learning Synthesis Prediction

10.26434/chemrxiv.12562121 ◽

2020 ◽

Author(s):

Dean Sumner ◽

Jiazhen He ◽

Amol Thakkar ◽

Ola Engkvist ◽

Esben Jannik Bjerrum

Keyword(s):

Neural Networks ◽

Pattern Recognition ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Data Augmentation ◽

State Of The Art ◽

Sequence Similarity ◽

Learning Models ◽

Underlying Network

<p>SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call “Levenshtein augmentation” which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state of the art models - transformer and sequence-to-sequence based recurrent neural networks with attention. Levenshtein augmentation demonstrated an increase performance over non-augmented, and conventionally SMILES randomization augmented data when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as <i>attentional gain </i>– an enhancement in the pattern recognition capabilities of the underlying network to molecular motifs.</p>

Download Full-text

Enabling deeper learning on big data for materials informatics applications

Scientific Reports ◽

10.1038/s41598-021-83193-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Dipendra Jha ◽

Vishu Gupta ◽

Logan Ward ◽

Zijiang Yang ◽

Christopher Wolverton ◽

...

Keyword(s):

Neural Networks ◽

Big Data ◽

Deep Learning ◽

Deep Neural Networks ◽

Materials Science ◽

Prediction Models ◽

Model Performance ◽

Materials Informatics ◽

Learning Framework ◽

Significant Attention

AbstractThe application of machine learning (ML) techniques in materials science has attracted significant attention in recent years, due to their impressive ability to efficiently extract data-driven linkages from various input materials representations to their output properties. While the application of traditional ML techniques has become quite ubiquitous, there have been limited applications of more advanced deep learning (DL) techniques, primarily because big materials datasets are relatively rare. Given the demonstrated potential and advantages of DL and the increasing availability of big materials datasets, it is attractive to go for deeper neural networks in a bid to boost model performance, but in reality, it leads to performance degradation due to the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning for cases where big materials data is available. Here, we present a general deep learning framework based on Individual Residual learning (IRNet) composed of very deep neural networks that can work with any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models can not only successfully alleviate the vanishing gradient problem and enable deeper learning, but also lead to significantly (up to 47%) better model accuracy as compared to plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.

Download Full-text

A Review of Recent Deep Learning Models in COVID-19 Diagnosis

European Journal of Engineering and Technology Research ◽

10.24018/ejers.2021.6.5.2485 ◽

2021 ◽

Vol 6 (5) ◽

pp. 10-15

Author(s):

Ela Bhattacharya ◽

D. Bhattacharya

Keyword(s):

Artificial Intelligence ◽

Neural Networks ◽

Deep Learning ◽

Deep Neural Networks ◽

Test Results ◽

Learning Models ◽

Future Directions ◽

Human Contact ◽

The World ◽

Short Span

COVID-19 has emerged as the latest worrisome pandemic, which is reported to have its outbreak in Wuhan, China. The infection spreads by means of human contact, as a result, it has caused massive infections across 200 countries around the world. Artificial intelligence has likewise contributed to managing the COVID-19 pandemic in various aspects within a short span of time. Deep Neural Networks that are explored in this paper have contributed to the detection of COVID-19 from imaging sources. The datasets, pre-processing, segmentation, feature extraction, classification and test results which can be useful for discovering future directions in the domain of automatic diagnosis of the disease, utilizing artificial intelligence-based frameworks, have been investigated in this paper.

Download Full-text

Automatic modulation classification using recurrent neural networks

2017 3rd IEEE International Conference on Computer and Communications (ICCC) ◽

10.1109/compcomm.2017.8322633 ◽

2017 ◽

Cited By ~ 9

Author(s):

Dehua Hong ◽

Zilong Zhang ◽

Xiaodong Xu

Keyword(s):

Neural Networks ◽

Recurrent Neural Networks ◽

Modulation Classification ◽

Automatic Modulation Classification

Download Full-text

Image Compression Based on Deep Learning: A Review

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v8i130193 ◽

2021 ◽

pp. 62-76

Author(s):

Hajar Maseeh Yasin ◽

Adnan Mohsin Abdulazeez

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Image Compression ◽

Recurrent Neural Networks ◽

Deep Neural Networks ◽

Review Paper ◽

High Accuracy ◽

Machine Learning Methods ◽

Digital Era ◽

Different Types

Image compression is an essential technology for encoding and improving various forms of images in the digital era. The inventors have extended the principle of deep learning to the different states of neural networks as one of the most exciting machine learning methods to show that it is the most versatile way to analyze, classify, and compress images. Many neural networks are required for image compressions, such as deep neural networks, artificial neural networks, recurrent neural networks, and convolution neural networks. Therefore, this review paper discussed how to apply the rule of deep learning to various neural networks to obtain better compression in the image with high accuracy and minimize loss and superior visibility of the image. Therefore, deep learning and its application to different types of images in a justified manner with distinct analysis to obtain these things need deep learning.

Download Full-text

Graph Universal Adversarial Attacks: A Few Bad Actors Ruin Graph Learning Models

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/458 ◽

2021 ◽

Author(s):

Xiao Zang ◽

Yi Xie ◽

Jie Chen ◽

Bo Yuan

Keyword(s):

Neural Networks ◽

Success Rate ◽

Deep Neural Networks ◽

Model Performance ◽

Graph Model ◽

Structured Data ◽

Graph Structure ◽

Learning Models ◽

Security Threat ◽

Anchor Nodes

Deep neural networks, while generalize well, are known to be sensitive to small adversarial perturbations. This phenomenon poses severe security threat and calls for in-depth investigation of the robustness of deep learning models. With the emergence of neural networks for graph structured data, similar investigations are urged to understand their robustness. It has been found that adversarially perturbing the graph structure and/or node features may result in a significant degradation of the model performance. In this work, we show from a different angle that such fragility similarly occurs if the graph contains a few bad-actor nodes, which compromise a trained graph neural network through flipping the connections to any targeted victim. Worse, the bad actors found for one graph model severely compromise other models as well. We call the bad actors ``anchor nodes'' and propose an algorithm, named GUA, to identify them. Thorough empirical investigations suggest an interesting finding that the anchor nodes often belong to the same class; and they also corroborate the intuitive trade-off between the number of anchor nodes and the attack success rate. For the dataset Cora which contains 2708 nodes, as few as six anchor nodes will result in an attack success rate higher than 80% for GCN and other three models.

Download Full-text

PM2.5 Prediction Model Based on Combinational Hammerstein Recurrent Neural Networks

Mathematics ◽

10.3390/math8122178 ◽

2020 ◽

Vol 8 (12) ◽

pp. 2178

Author(s):

Yi-Chung Chen ◽

Tsu-Chiang Lei ◽

Shun Yao ◽

Hsin-Ping Wang

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Recurrent Neural Networks ◽

Predictive Accuracy ◽

Learning Models ◽

Short Term ◽

Computational Overhead ◽

Implementation Costs ◽

External Sources

Airborne particulate matter 2.5 (PM2.5) can have a profound effect on the health of the population. Many researchers have been reporting highly accurate numerical predictions based on raw PM2.5 data imported directly into deep learning models; however, there is still considerable room for improvement in terms of implementation costs due to heavy computational overhead. From the perspective of environmental science, PM2.5 values in a given location can be attributed to local sources as well as external sources. Local sources tend to have a dramatic short-term impact on PM2.5 values, whereas external sources tend to have more subtle but longer-lasting effects. In the presence of PM2.5 from both sources at the same time, this combination of effects can undermine the predictive accuracy of the model. This paper presents a novel combinational Hammerstein recurrent neural network (CHRNN) to enhance predictive accuracy and overcome the heavy computational and monetary burden imposed by deep learning models. The CHRNN comprises a based-neural network tasked with learning gradual (long-term) fluctuations in conjunction with add-on neural networks to deal with dramatic (short-term) fluctuations. The CHRNN can be coupled with a random forest model to determine the degree to which short-term effects influence long-term outcomes. We also developed novel feature selection and normalization methods to enhance prediction accuracy. Using real-world measurement data of air quality and PM2.5 datasets from Taiwan, the precision of the proposed system in the numerical prediction of PM2.5 levels was comparable to that of state-of-the-art deep learning models, such as deep recurrent neural networks and long short-term memory, despite far lower implementation costs and computational overhead.

Download Full-text

P1923Deep and machine learning models to improve risk prediction of cardiovascular disease using data extraction from electronic health records

European Heart Journal ◽

10.1093/eurheartj/ehz748.0670 ◽

2019 ◽

Vol 40 (Supplement_1) ◽

Author(s):

I Korsakov ◽

A Gusev ◽

T Kuznetsova ◽

D Gavrilov ◽

R Novitskiy

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Cardiovascular Disease ◽

Logistic Regression ◽

Deep Learning ◽

Risk Prediction ◽

Learning Algorithm ◽

Learning Models ◽

Electronic Health ◽

Machine Learning Models

Abstract Abstract Background Advances in precision medicine will require an increasingly individualized prognostic evaluation of patients in order to provide the patient with appropriate therapy. The traditional statistical methods of predictive modeling, such as SCORE, PROCAM, and Framingham, according to the European guidelines for the prevention of cardiovascular disease, not adapted for all patients and require significant human involvement in the selection of predictive variables, transformation and imputation of variables. In ROC-analysis for prediction of significant cardiovascular disease (CVD), the areas under the curve for Framingham: 0.62–0.72, for SCORE: 0.66–0.73 and for PROCAM: 0.60–0.69. To improve it, we apply for approaches to predict a CVD event rely on conventional risk factors by machine learning and deep learning models to 10-year CVD event prediction by using longitudinal electronic health record (EHR). Methods For machine learning, we applied logistic regression (LR) and recurrent neural networks with long short-term memory (LSTM) units as a deep learning algorithm. We extract from longitudinal EHR the following features: demographic, vital signs, diagnoses (ICD-10-cm: I21-I22.9: I61-I63.9) and medication. The problem in this step, that near 80 percent of clinical information in EHR is “unstructured” and contains errors and typos. Missing data are important for the correct training process using by deep learning & machine learning algorithm. The study cohort included patients between the ages of 21 to 75 with a dynamic observation window. In total, we got 31517 individuals in the dataset, but only 3652 individuals have all features or missing features values can be easy to impute. Among these 3652 individuals, 29.4% has a CVD, mean age 49.4 years, 68,2% female. Evaluation We randomly divided the dataset into a training and a test set with an 80/20 split. The LR was implemented with Python Scikit-Learn and the LSTM model was implemented with Keras using Tensorflow as the backend. Results We applied machine learning and deep learning models using the same features as traditional risk scale and longitudinal EHR features for CVD prediction, respectively. Machine learning model (LR) achieved an AUROC of 0.74–0.76 and deep learning (LSTM) 0.75–0.76. By using features from EHR logistic regression and deep learning models improved the AUROC to 0.78–0.79. Conclusion The machine learning models outperformed a traditional clinically-used predictive model for CVD risk prediction (i.e. SCORE, PROCAM, and Framingham equations). This approach was used to create a clinical decision support system (CDSS). It uses both traditional risk scales and models based on neural networks. Especially important is the fact that the system can calculate the risks of cardiovascular disease automatically and recalculate immediately after adding new information to the EHR. The results are delivered to the user's personal account.

Download Full-text

Identification of Thoracic Diseases by Exploiting Deep Neural Networks (Preprint)

10.2196/preprints.23644 ◽

2020 ◽

Author(s):

Albahli Saleh ◽

Ali Alkhalifah

Keyword(s):

Neural Network ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Deep Neural Networks ◽

Medical Image Analysis ◽

Medical Community ◽

Learning Models ◽

X Ray ◽

Chest Disease

BACKGROUND To diagnose cardiothoracic diseases, a chest x-ray (CXR) is examined by a radiologist. As more people get affected, doctors are becoming scarce especially in developing countries. However, with the advent of image processing tools, the task of diagnosing these cardiothoracic diseases has seen great progress. A lot of researchers have put in work to see how the problems associated with medical images can be mitigated by using neural networks. OBJECTIVE Previous works used state-of-the-art techniques and got effective results with one or two cardiothoracic diseases but could lead to misclassification. In our work, we adopted GANs to synthesize the chest radiograph (CXR) to augment the training set on multiple cardiothoracic diseases to efficiently diagnose the chest diseases in different classes as shown in Figure 1. In this regard, our major contributions are classifying various cardiothoracic diseases to detect a specific chest disease based on CXR, use the advantage of GANs to overcome the shortages of small training datasets, address the problem of imbalanced data; and implementing optimal deep neural network architecture with different hyper-parameters to improve the model with the best accuracy. METHODS For this research, we are not building a model from scratch due to computational restraints as they require very high-end computers. Rather, we use a Convolutional Neural Network (CNN) as a class of deep neural networks to propose a generative adversarial network (GAN) -based model to generate synthetic data for training the data as the amount of the data is limited. We will use pre-trained models which are models that were trained on a large benchmark dataset to solve a problem similar to the one we want to solve. For example, the ResNet-152 model we used was initially trained on the ImageNet dataset. RESULTS After successful training and validation of the models we developed, ResNet-152 with image augmentation proved to be the best model for the automatic detection of cardiothoracic disease. However, one of the main problems associated with radiographic deep learning projects and research is the scarcity and unavailability of enough datasets which is a key component of all deep learning models as they require a lot of data for training. This is the reason why some of our models had image augmentation to increase the number of images without duplication. As more data are collected in the field of chest radiology, the models could be retrained to improve the accuracies of the models as deep learning models improve with more data. CONCLUSIONS This research employs the advantages of computer vision and medical image analysis to develop an automated model that has the clinical potential for early detection of the disease. Using deep learning models, the research aims to evaluate the effectiveness and accuracy of different convolutional neural network models in the automatic diagnosis of cardiothoracic diseases from x-ray images compared to diagnosis by experts in the medical community.

Download Full-text