Design of hybrid neural networks of the ensemble structure

This paper considers the structural-parametric synthesis (SPS) of deep learning neural networks (NNs), in particular convolutional neural networks (CNNs) used in image processing. Modern neural networks can have a variety of topologies, which result from unique blocks that determine their essential features: the squeeze-and-excitation block, the convolutional block attention module with its channel and spatial attention modules, the residual block, and the ResNeXt block. These blocks were introduced primarily to make image processing more efficient. Because the architectural parameter space is large (it includes the type of unique block, its location in the structure of the CNN, and its connections with other blocks and layers), computational costs grow nonlinearly. To minimize computational costs while maintaining a specified accuracy, this work addresses both the generation of possible topologies and the structural-parametric synthesis of CNNs, and proposes a genetic algorithm (GA) to solve them. Parameters were tuned using a genetic algorithm and modern gradient methods (GM): stochastic gradient descent with momentum, Nesterov accelerated gradient, the adaptive gradient algorithm (AdaGrad), root mean square propagation (RMSProp), adaptive moment estimation (Adam), and Nesterov-accelerated adaptive moment estimation (Nadam). Such networks are intended for use in an intelligent medical diagnostic system (IMDS) for determining the activity of tuberculosis. To improve classification accuracy in image processing, an ensemble structure of hybrid convolutional neural networks (HCNNs) is proposed, using a parallel ensemble structure with a merge layer. Algorithms for the optimal selection and merging of features during ensemble construction have been developed.
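
The parallel ensemble with a merge layer can be illustrated with a minimal sketch. It assumes each ensemble member outputs a feature vector, that the merge layer simply concatenates these vectors, and that a linear softmax classifier follows; none of these details are confirmed by the abstract. The names `merge_features` and `ensemble_predict`, the toy members, and the weights are hypothetical.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of class scores.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def merge_features(member_outputs):
    # Assumed merge layer: concatenate the feature vectors of all members.
    merged = []
    for features in member_outputs:
        merged.extend(features)
    return merged

def ensemble_predict(image, members, weights, biases):
    # Run every member on the same input (parallel structure), merge the
    # features, then apply one linear layer followed by softmax.
    merged = merge_features([member(image) for member in members])
    scores = [sum(w * x for w, x in zip(row, merged)) + b
              for row, b in zip(weights, biases)]
    return softmax(scores)

# Toy stand-ins for trained networks (hypothetical feature extractors).
members = [
    lambda img: [sum(img) / len(img)],   # one feature: mean intensity
    lambda img: [max(img), min(img)],    # two features: intensity range
]
weights = [[1.0, 0.5, -0.5], [-1.0, 0.2, 0.3]]  # 2 classes x 3 merged features
biases = [0.0, 0.1]
probabilities = ensemble_predict([0.2, 0.8, 0.5], members, weights, biases)
```

In practice the members would be trained HCNNs and the classifier weights would be learned; the sketch only shows how parallel outputs meet in a single merged representation.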

Structural-parametric synthesis of deep learning neural networks

Artificial Intelligence ◽

10.15407/jai2020.04.042 ◽

2020 ◽

Vol 25 (4) ◽

pp. 42-51

Author(s):

Sineglazov V.M. ◽

Chumachenko O.I. ◽

Keyword(s):

Neural Network ◽

Genetic Algorithm ◽

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Network ◽

Convolutional Neural Networks ◽

Search Space ◽

Training Sample ◽

Parametric Synthesis ◽

Encoding Method

The structural-parametric synthesis of deep learning neural networks, in particular convolutional neural networks used in image processing, is considered. A classification of modern convolutional neural network architectures is given. It is shown that almost every convolutional neural network, depending on its topology, has unique blocks that determine its essential features (for example, the Squeeze and Excitation Block, the Convolutional Block of Attention Module (Channel attention module, Spatial attention module), the Residual block, the Inception module, and the ResNeXt block). The problem of structural-parametric synthesis of convolutional neural networks is stated, and a genetic algorithm is proposed for its solution. The genetic algorithm is used to search a large space effectively: on the one hand, to generate possible topologies of the convolutional neural network, namely the choice of specific blocks and their locations in the network structure, and on the other hand, to solve the problem of structural-parametric synthesis for a convolutional neural network of the selected topology. The most significant parameters of the convolutional neural network are determined. An encoding method is proposed that represents each network structure as a fixed-length binary string. Several standard genetic operations are then defined, i.e. selection, mutation and crossover, which eliminate weak individuals of the previous generation and generate competitive ones. An example of solving this problem is given, in which a database of ultrasound results from patients with thyroid disease was used as the training sample.
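
The encoding and the three genetic operations described above can be sketched as follows. This is only an illustration under stated assumptions: the genome layout (six slots of two bits, each selecting one block type), the tournament selection scheme, the elitism, and all names such as `decode` and `evolve` are hypothetical, and the fitness function is a placeholder for the validation accuracy of a trained network.

```python
import random

# Hypothetical layout: 6 block slots, 2 bits per slot, 4 block types.
BLOCKS = ["residual", "squeeze_excitation", "inception", "resnext"]
GENOME_LENGTH = 12

def decode(genome):
    # Turn the fixed-length bit string into a list of block names.
    return [BLOCKS[2 * genome[i] + genome[i + 1]]
            for i in range(0, GENOME_LENGTH, 2)]

def random_genome(rng):
    return [rng.randint(0, 1) for _ in range(GENOME_LENGTH)]

def tournament_select(population, fitness, rng, k=3):
    # Selection: the fittest of k random individuals becomes a parent.
    return max(rng.sample(population, k), key=fitness)

def crossover(a, b, rng):
    # Single-point crossover of two parent bit strings.
    point = rng.randint(1, GENOME_LENGTH - 1)
    return a[:point] + b[point:]

def mutate(genome, rng, rate=0.05):
    # Flip each bit independently with a small probability.
    return [bit ^ 1 if rng.random() < rate else bit for bit in genome]

def evolve(fitness, generations=30, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(population, key=fitness)]  # keep the best (elitism)
        while len(children) < pop_size:
            a = tournament_select(population, fitness, rng)
            b = tournament_select(population, fitness, rng)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return max(population, key=fitness)

# Placeholder fitness: number of set bits instead of a real accuracy score.
best = evolve(fitness=sum)
architecture = decode(best)
```

In a real search, `fitness` would decode the genome, build and train the corresponding network, and return its accuracy, which makes each evaluation far more expensive than in this toy run.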

Handwritten Devanagari Character Recognition Using Layer-Wise Training of Deep Convolutional Neural Networks and Adaptive Gradient Methods

Journal of Imaging ◽

10.3390/jimaging4020041 ◽

2018 ◽

Vol 4 (2) ◽

pp. 41 ◽

Cited By ~ 16

Author(s):

Mahesh Jangid ◽

Sumit Srivastava

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Character Recognition ◽

Gradient Methods ◽

Deep Convolutional Neural Networks

Layer-Wise Compressive Training for Convolutional Neural Networks

Future Internet ◽

10.3390/fi11010007 ◽

2018 ◽

Vol 11 (1) ◽

pp. 7 ◽

Cited By ~ 3

Author(s):

Matteo Grimaldi ◽

Valerio Tenace ◽

Andrea Calimera

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Gradient Descent ◽

Computational Models ◽

Stochastic Gradient Descent ◽

Training Algorithm ◽

Heuristic Rules ◽

Human Capabilities ◽

Model Size ◽

Large Model

Convolutional Neural Networks (CNNs) are brain-inspired computational models designed to recognize patterns. Recent advances demonstrate that CNNs are able to achieve, and often exceed, human capabilities in many application domains. With several million parameters, even the simplest CNN has a large model size. This characteristic is a serious concern for deployment on resource-constrained embedded systems, where compression stages are needed to meet the stringent hardware constraints. In this paper, we introduce a novel accuracy-driven compressive training algorithm. It consists of a two-stage flow: first, layers are sorted by means of heuristic rules according to their significance; second, a modified stochastic gradient descent optimization is applied to the less significant layers such that their representation is collapsed into a constrained subspace. Experimental results demonstrate that our approach achieves remarkable compression rates with low accuracy loss (<1%).

Plant Diseases Identification through a Discount Momentum Optimizer in Deep Learning

Applied Sciences ◽

10.3390/app11209468 ◽

2021 ◽

Vol 11 (20) ◽

pp. 9468

Author(s):

Yunyun Sun ◽

Yutong Liu ◽

Haocheng Zhou ◽

Huijuan Hu

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Adaptive Learning ◽

Learning Rate ◽

Plant Diseases ◽

Stochastic Gradient Descent ◽

Automatic Identification ◽

Deep Convolutional Neural Networks ◽

Adaptive Learning Rate

Deep learning has shown promising results in various domains. The automatic identification of plant diseases with deep convolutional neural networks currently attracts a lot of attention. This article extends the stochastic gradient descent momentum optimizer and presents a discount momentum (DM) deep learning optimizer for plant disease identification. To examine the recognition and generalization capability of the DM optimizer, we discuss hyper-parameter tuning and convolutional neural network models on the PlantVillage dataset. We further conduct comparison experiments with popular non-adaptive learning rate methods. The proposed approach achieves an average validation accuracy of no less than 97% for plant disease prediction on several state-of-the-art deep learning models and shows low sensitivity to hyper-parameter settings. Experimental results demonstrate that the DM method can deliver higher identification performance while remaining competitive with other non-adaptive learning rate methods in terms of both training speed and generalization.
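
For orientation, the baseline that the DM optimizer extends, stochastic gradient descent with momentum, can be sketched as below. The abstract does not specify the discount momentum rule itself, so it is not reproduced here; the function name `sgd_momentum_step`, the hyper-parameter values, and the quadratic test objective are assumptions made for illustration.

```python
def sgd_momentum_step(params, grads, velocity, lr=0.1, momentum=0.9):
    # Classical momentum: the velocity accumulates a decaying sum of past
    # gradients, and the parameters move along the velocity.
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity

# Toy objective f(x) = x^2 with gradient 2x, minimized from x = 5.
params, velocity = [5.0], [0.0]
for _ in range(300):
    grads = [2.0 * p for p in params]
    params, velocity = sgd_momentum_step(params, grads, velocity)
```

The learning rate stays fixed throughout, which is what makes this a non-adaptive learning rate method in the sense used in the abstract.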

Convolutional neural networks and genetic algorithm for visual imagery classification

Physical and Engineering Sciences in Medicine ◽

10.1007/s13246-020-00894-z ◽

2020 ◽

Vol 43 (3) ◽

pp. 973-983

Author(s):

Fabio R. Llorella ◽

Gustavo Patow ◽

José M. Azorín

Keyword(s):

Genetic Algorithm ◽

Neural Networks ◽

Convolutional Neural Networks ◽

Visual Imagery

A Unit Softmax with Laplacian Smoothing Stochastic Gradient Descent for Deep Convolutional Neural Networks

Communications in Computer and Information Science - Intelligent Technologies and Applications ◽

10.1007/978-981-15-5232-8_14 ◽

2020 ◽

pp. 162-174

Author(s):

Jamshaid Ul Rahman ◽

Akhtar Ali ◽

Masood Ur Rehman ◽

Rafaqat Kazmi

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Gradient Descent ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Deep Convolutional Neural Networks ◽

Laplacian Smoothing

Using convolutional neural networks to detect giant landslides in the Patagonian Andean foreland

10.5194/egusphere-egu21-2728 ◽

2021 ◽

Author(s):

Elisabeth Schönfeldt ◽

Pánek Tomáš ◽

Winocur Diego ◽

Korup Oliver

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Lateral Spreading ◽

Test Methods ◽

Tierra Del Fuego ◽

Stochastic Gradient Descent ◽

Optical Data ◽

Large Area ◽

Training Images ◽

Geomorphic Features

The Andean foreland of Patagonia features dozens of basaltic plateaus that are spread out from the Argentinean province of Neuquén south to Tierra del Fuego. The plateau margins are undermined by numerous giant slope failures that mostly involved a combination of lateral spreading and rotational sliding, running out up to several kilometres along the plateau margins. However, the overall extent of plateau margins affected by landsliding is still unknown, because manual mapping of such a large area (~500,000 km²) is time-consuming. Therefore, our goal is to test methods that support manual mapping through automatic and objective detection of giant landslides. All of these landslides share very similar topographic features, such as subparallel compression ridges and elongate depressions, that distinguish them in topographic and optical appearance from surrounding areas (e.g. plains or plateau tops). Using a catalogue of these features, we tested an image classification scheme using convolutional neural networks (CNNs). Our input data consist of Sentinel-2 optical data (20-m resolution) and topographic factors (surface roughness and curvature) acquired from TanDEM-X data (12-m resolution). We applied transfer learning, modifying the pre-existing CNN AlexNet to test how well it is able to distinguish different geomorphic features such as unstable terrain from plateau tops or plains. Over 4000 training images were extracted from the Meseta Somuncurá, while the trained algorithm was tested at the Sierra Cuadrada. Both plateaus are part of the Northern Patagonia Massif. Preliminary results show that the modified algorithm performs reasonably well and is able to distinguish between giant landslides and other geomorphic features. However, performance strongly depends on the training options of the stochastic gradient descent within the CNN and on the quality of the training images, especially the quantity of images and the location from which they were extracted with respect to the plateau margin.

Identification of post-stroke EEG signal using wavelet and convolutional neural networks

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v9i5.2005 ◽

2020 ◽

Vol 9 (5) ◽

pp. 1890-1898 ◽

Cited By ~ 4

Author(s):

Esmeralda C. Djamal ◽

Rizkia I. Ramadhan ◽

Miranti I. Mandasari ◽

Deswara Djajasasmita

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Stochastic Gradient Descent ◽

Eeg Signal ◽

Stroke Patients ◽

Post Stroke ◽

Moment Estimation ◽

Testing Data ◽

The Brain ◽

Eeg Signal Processing

Post-stroke patients need ongoing rehabilitation to recover functions impaired by a stroke, so a monitoring device is required. EEG signals reflect electrical activity in the brain, which also indicates the recovery condition of post-stroke patients. However, an EEG signal processing model is needed to provide information on the post-stroke state. The development of deep learning allows it to be applied to the identification of post-stroke patients. This study proposed a method for identifying post-stroke patients using convolutional neural networks (CNN). A wavelet transform is used to extract EEG signal information as features for machine learning that reflect the condition of post-stroke patients. These features are the Delta, Alpha, Beta, Theta, and Mu waves. In addition to the five waves, amplitude features are added according to the characteristics of the post-stroke EEG signal. The results showed that the feature configuration is essential for distinguishing post-stroke patients. The accuracy on the testing data was 90% with amplitude and Beta features, compared to 70% without amplitude or Beta. The experimental results also showed that the adaptive moment estimation (Adam) optimization model was more stable than stochastic gradient descent (SGD), but SGD can provide higher accuracy than the Adam model.

Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/452 ◽

2020 ◽

Author(s):

Jinghui Chen ◽

Dongruo Zhou ◽

Yiqi Tang ◽

Ziyan Yang ◽

Yuan Cao ◽

...

Keyword(s):

Neural Networks ◽

Convergence Rate ◽

Gradient Descent ◽

Deep Neural Networks ◽

Gradient Methods ◽

Estimation Method ◽

Fast Convergence ◽

Stochastic Gradient Descent ◽

Adaptive Parameter ◽

Fast Convergence Rate

Adaptive gradient methods, which use historical gradient information to automatically adjust the learning rate, have the nice property of fast convergence but have been observed to generalize worse than stochastic gradient descent (SGD) with momentum in training deep neural networks. How to close the generalization gap of adaptive gradient methods therefore remains an open problem. In this work, we show that adaptive gradient methods such as Adam and Amsgrad are sometimes "over adapted". We design a new algorithm, called the Partially adaptive momentum estimation method, which unifies Adam/Amsgrad with SGD by introducing a partial adaptive parameter $p$, to achieve the best of both worlds. We also prove the convergence rate of our proposed algorithm to a stationary point in the stochastic nonconvex optimization setting. Experiments on standard benchmarks show that our proposed algorithm can maintain a fast convergence rate like Adam/Amsgrad while generalizing as well as SGD in training deep neural networks. These results suggest that practitioners can pick up adaptive gradient methods once again for faster training of deep neural networks.
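
The role of the partial adaptive parameter $p$ can be made concrete with a scalar sketch. It assumes an Amsgrad-style update in which the second-moment estimate is raised to the power $p$ instead of 1/2, so that $p = 1/2$ behaves like Amsgrad and $p = 0$ reduces to SGD with (averaged) momentum; the exact update and hyper-parameters in the paper may differ. The function name `padam_step`, the state layout, and the default values are hypothetical.

```python
def padam_step(theta, grad, state, lr=0.1, beta1=0.9, beta2=0.999,
               p=0.125, eps=1e-8):
    # First and second moment estimates, as in Adam.
    m = beta1 * state["m"] + (1.0 - beta1) * grad
    v = beta2 * state["v"] + (1.0 - beta2) * grad * grad
    # Amsgrad-style running maximum of the second moment.
    v_hat = max(state["v_hat"], v)
    # Partial adaptivity: the exponent p controls how strongly the step is
    # rescaled by the gradient history (p = 0 means no rescaling at all).
    theta = theta - lr * m / ((v_hat + eps) ** p)
    return theta, {"m": m, "v": v, "v_hat": v_hat}

def fresh_state():
    return {"m": 0.0, "v": 0.0, "v_hat": 0.0}

# With p = 0 the denominator is 1, so the step is lr * m (momentum SGD).
theta_sgd, _ = padam_step(1.0, 2.0, fresh_state(), p=0.0)
# With p = 0.5 the step is divided by sqrt(v_hat), as in Amsgrad.
theta_ams, _ = padam_step(1.0, 2.0, fresh_state(), p=0.5)
```

Intermediate values of `p` interpolate between the two behaviours, which is the sense in which the abstract describes the method as unifying Adam/Amsgrad with SGD.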

Evolving Convolutional Neural Networks for Glaucoma Diagnosis

10.5753/sbcas.2018.3687 ◽

2018 ◽

Cited By ~ 1

Author(s):

Alan Lima ◽

Lucas B. Maia ◽

Pedro Thiago Cutrim Dos Santos ◽

Geraldo Braz Júnior ◽

João D. S. De Almeida ◽

...

Keyword(s):

Genetic Algorithm ◽

Neural Networks ◽

Optic Nerve ◽

Visual Field ◽

Convolutional Neural Networks ◽

Ocular Disease ◽

Fundus Images ◽

Advanced Stage ◽

Fundus Image ◽

Glaucoma Diagnosis

Glaucoma is an ocular disease that damages the eye's optic nerve and progressively narrows the visual field of affected patients, which, in an advanced stage, can lead to blindness. This work presents a study on the use of Convolutional Neural Networks (CNNs) for automatic diagnosis from eye fundus images. However, building a well-performing CNN involves a lot of effort, which in many situations still does not achieve satisfactory results. The objective of this work is to use a Genetic Algorithm (GA) to optimize CNN architectures through evolution, in order to help in glaucoma diagnosis using eye fundus images from the RIM-ONE-r2 dataset. Our partial results are satisfactory: after training, the best individual chosen by the GA achieved an accuracy of 91%.
