Performance Comparison of CNN Models Using Gradient Flow Analysis

Informatics ◽  
2021 ◽  
Vol 8 (3) ◽  
pp. 53
Author(s):  
Seol-Hyun Noh

Convolutional neural networks (CNNs) are widely used among the various deep learning techniques available because of their superior performance in computer vision and natural language processing. CNNs can effectively extract the locality and correlation of input data using structures in which convolutional layers are successively applied to the input data. In general, the performance of a neural network improves as the depth of the CNN increases. However, an increase in the depth of a CNN is not always accompanied by an increase in accuracy, because the gradient vanishing problem may arise, causing the weights of the weighted layers to fail to converge. Accordingly, this study analyzes and compares the gradient flows of the VGGNet, ResNet, SENet, and DenseNet models and derives the reasons for the differences in their error rates.
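
The paper itself does not publish code; the following is a minimal PyTorch sketch of the kind of per-layer gradient inspection such an analysis relies on. The model, data, and layer choices are illustrative assumptions, not the authors' setup.

```python
# Minimal sketch (not from the paper): inspecting per-layer gradient norms
# in a small CNN to see how gradients attenuate with depth.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 32 * 32, 10),
)

x = torch.randn(8, 3, 32, 32)          # dummy batch
y = torch.randint(0, 10, (8,))         # dummy labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()

# Gradient norm per weighted layer: a vanishing trend toward the early
# layers is the symptom that gradient flow analysis looks for.
for name, p in model.named_parameters():
    if p.dim() > 1:                    # weights only, skip biases
        print(f"{name}: grad L2 norm = {p.grad.norm().item():.4e}")
```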

2016 ◽  
pp. 89-112
Author(s):  
Pushpendu Kar ◽  
Anusua Das

The recent surge of interest in artificial neural networks has extended into neuroscience, pattern recognition, machine learning, and artificial intelligence. Theoretical neuroscience is converging on the view that the brain acts as a complex, decentralized computer that performs rigorous computation in a manner quite different from conventional digital computers. The motivation for studying neural networks lies in their structural similarity to the human central nervous system. The elementary processing unit of an artificial neural network (ANN) is called a 'neuron'. A large number of interconnected neurons mimic a biological neural network and form an ANN. Learning is the essential process by which an ANN is trained; knowledge can be transferred to the network only through a learning procedure. This chapter presents the concepts of artificial neural networks in detail, together with some significant aspects of current research.
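
As an illustration of the elementary unit the chapter describes (not code from the chapter itself), a single artificial neuron is just a weighted sum of its inputs passed through an activation function:

```python
# Minimal sketch (illustrative): a single artificial neuron computing a
# weighted sum of inputs followed by a sigmoid activation.
import math

def neuron(inputs, weights, bias):
    # Weighted sum of inputs plus bias, squashed through a sigmoid.
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Example: three inputs with hand-chosen weights.
print(neuron([0.5, -1.0, 2.0], [0.8, 0.2, -0.5], bias=0.1))
```

Learning, in this picture, amounts to adjusting the weights and bias so the neuron's outputs match the training targets.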


2019 ◽  
Vol 63 (4) ◽  
pp. 243-252 ◽  
Author(s):  
Jaret Hodges ◽  
Soumya Mohan

Machine learning algorithms are used in language processing, automated driving, and prediction. Though the theory of machine learning has existed since the 1950s, it was not until the advent of advanced computing that its potential began to be realized. Gifted education is a field where machine learning has yet to be utilized, even though one of its underlying problems is classification, an area where learning algorithms have become exceptionally accurate. We provide a brief overview of machine learning with a focus on neural networks and supervised learning, followed by a demonstration using simulated data and neural networks for classification, with a practical explanation of the mechanics of the neural network and the associated R code. Implications for gifted education are then discussed, followed by the limitations of supervised learning. Code used in this article can be found at https://osf.io/4pa3b/
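
The article's demonstration is in R (at the OSF link above); the following is an assumed Python analogue of the same idea, a small neural network classifier trained on simulated data:

```python
# Minimal sketch (assumed analogue, not the article's R code): a neural
# network classifier on simulated data standing in for student records.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Simulated two-class data with a few informative features.
X, y = make_classification(n_samples=1000, n_features=8,
                           n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000,
                    random_state=0).fit(X_tr, y_tr)
print(f"held-out accuracy: {clf.score(X_te, y_te):.3f}")
```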


Author(s):  
Raghuram Mandyam Annasamy ◽  
Katia Sycara

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. While improvements in training algorithms continue at a brisk pace, theoretical and empirical studies of what these networks actually learn lag far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model's behavior using key-value memories, attention, and reconstructible embeddings. With a directed exploration strategy, our model reaches training rewards comparable to state-of-the-art deep Q-learning models. However, the results suggest that the features extracted by the neural network are extremely shallow, and subsequent testing on out-of-sample examples shows that the agent easily overfits to trajectories seen during training.
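
A heavily simplified sketch of the key-value idea (an assumption about the mechanism, not the paper's actual architecture): Q-values are produced by attending over a learned memory with a state query, so the attention weights expose which memory slots drive each decision.

```python
# Minimal sketch (assumed simplification): Q-values from key-value attention.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, slots, n_actions = 16, 8, 4

keys = rng.normal(size=(slots, d))             # learned memory keys
values = rng.normal(size=(slots, n_actions))   # per-slot action values
query = rng.normal(size=d)                     # embedding of current state

# The attention distribution over slots is what makes the model's
# behavior globally explainable.
attn = softmax(keys @ query / np.sqrt(d))
q_values = attn @ values
print("attention:", np.round(attn, 3))
print("Q-values:", np.round(q_values, 3))
```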


1995 ◽  
Vol 17 (1) ◽  
pp. 1-15 ◽  
Author(s):  
John F. Place ◽  
Alain Truchaud ◽  
Kyoichi Ozawa ◽  
Harry Pardue ◽  
Paul Schnipelsky

The incorporation of information-processing technology into analytical systems in the form of standard computing software has recently been advanced by the introduction of artificial intelligence (AI), both as expert systems and as neural networks.

This paper considers the role of software in system operation, control and automation, and attempts to define intelligence. AI is characterized by its ability to deal with incomplete and imprecise information and to accumulate knowledge. Expert systems, building on standard computing techniques, depend heavily on the domain experts and knowledge engineers that have programmed them to represent the real world. Neural networks are intended to emulate the pattern-recognition and parallel-processing capabilities of the human brain and are taught rather than programmed. The future may lie in a combination of the recognition ability of the neural network and the rationalization capability of the expert system.

In the second part of the paper, examples are given of applications of AI in stand-alone systems for knowledge engineering and medical diagnosis, and in embedded systems for failure detection, image analysis, user interfacing, natural language processing, robotics and machine learning, as related to clinical laboratories.

It is concluded that AI constitutes a collective form of intellectual property, and that there is a need for better documentation, evaluation and regulation of the systems already being used in clinical laboratories.


2019 ◽  
Vol 11 (22) ◽  
pp. 2608 ◽  
Author(s):  
Dong Wang ◽  
Ying Li ◽  
Li Ma ◽  
Zongwen Bai ◽  
Jonathan Chan

In recent years, convolutional neural networks (CNNs) have shown promising performance in the fusion of multispectral (MS) and panchromatic (PAN) images (MS pansharpening). However, small-scale data and the gradient vanishing problem have prevented existing CNN-based fusion approaches from leveraging deeper networks, which potentially have better representation ability to characterize the complex nonlinear mapping between the input (source) and target (fused) images. In this paper, we introduce a very deep network with dense blocks and residual learning to tackle these problems. The proposed network takes advantage of dense connections within dense blocks, which connect any two convolution layers, to facilitate gradient flow and implicit deep supervision during training. In addition, reusing feature maps reduces the number of parameters, which helps mitigate the overfitting that results from small-scale data. Residual learning is explored to reduce the difficulty of generating an MS image with high spatial resolution. The proposed network is evaluated via experiments on three datasets, achieving competitive or superior performance; e.g., the spectral angle mapper (SAM) is decreased by over 10% on GaoFen-2 when compared with other state-of-the-art methods.
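
The two ingredients combine as in the following PyTorch sketch (an assumed simplification of the paper's design, not its actual network): each convolution in a dense block sees the concatenation of all earlier feature maps, and a residual connection wraps the block.

```python
# Minimal sketch (assumed): a dense block with a residual connection.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    def __init__(self, channels, growth, n_layers):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels + i * growth, growth, 3, padding=1),
                nn.ReLU(inplace=True),
            ))
        # Project concatenated features back to the input width so the
        # block output can be added residually to its input.
        self.fuse = nn.Conv2d(channels + n_layers * growth, channels, 1)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))  # dense connectivity
        return x + self.fuse(torch.cat(feats, dim=1))     # residual learning

out = DenseBlock(channels=32, growth=16, n_layers=4)(torch.randn(1, 32, 64, 64))
print(out.shape)  # torch.Size([1, 32, 64, 64])
```

The short paths created by concatenation are what let gradients reach early layers, addressing the vanishing-gradient problem the abstract describes.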


2018 ◽  
Vol 7 (2.32) ◽  
pp. 177 ◽  
Author(s):  
Dr. M. R. Narasinga Rao ◽ 
V Venkatesh Prasad ◽  
P Sai Teja ◽  
Md Zindavali ◽  
O Phanindra Reddy

Deep neural nets with a vast quantity of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, the effect of averaging the predictions of all these thinned networks can be approximated simply by using a single unthinned network with smaller weights. This significantly reduces overfitting and provides major improvements over other regularization techniques. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
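
A minimal sketch of the mechanism (illustrative, not the paper's code): with "inverted" dropout, surviving activations are rescaled at training time, so the single unthinned network can be used unchanged at test time.

```python
# Minimal sketch (illustrative): inverted dropout on a layer's activations.
import numpy as np

def dropout(activations, keep_prob, rng, training=True):
    if not training:
        return activations                    # test time: no-op
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob     # drop units, rescale survivors

rng = np.random.default_rng(0)
h = rng.normal(size=(2, 5))
print(dropout(h, keep_prob=0.8, rng=rng))
```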


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Molham Al-Maleh ◽  
Said Desouki

Natural language processing has witnessed remarkable progress with the advent of deep learning techniques. Text summarization, along with other tasks like text translation and sentiment analysis, has used deep neural network models to enhance results. Recent methods of text summarization follow a sequence-to-sequence encoder–decoder framework, composed of neural networks trained jointly on both input and output. Deep neural networks take advantage of big datasets to improve their results. These networks are supported by the attention mechanism, which handles long texts more efficiently by identifying focus points in the text, and by the copy mechanism, which allows the model to copy words from the source directly into the summary. In this research, we re-implement the basic summarization model that applies the sequence-to-sequence framework to Arabic, a language to which this model had not previously been applied for text summarization. We first build an Arabic dataset of summarized article headlines, consisting of approximately 300 thousand entries, each comprising an article introduction and the corresponding headline. We then apply baseline summarization models to this dataset and compare the results using the ROUGE metric.
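
To make the attention mechanism concrete (an illustrative sketch, not the paper's model): at each decoding step the decoder state scores every encoder state, and the resulting weights pick out the "focus points" in the source text.

```python
# Minimal sketch (illustrative): dot-product attention over encoder states.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
src_len, d = 6, 8
encoder_states = rng.normal(size=(src_len, d))  # one vector per source token
decoder_state = rng.normal(size=d)              # current decoder state

weights = softmax(encoder_states @ decoder_state / np.sqrt(d))
context = weights @ encoder_states              # attention-weighted context
print("attention weights:", np.round(weights, 3))
```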


2013 ◽  
Vol 321-324 ◽  
pp. 1921-1924
Author(s):  
Yong Gang Xue ◽  
Ming Li Zhang

A methodology is proposed to forecast the daily SSE Composite Index based on artificial neural networks and wavelet analysis. The original Composite Index series is first decomposed into components using wavelet techniques. A neural network is then applied to model each component of the decomposed series. The final forecast is obtained by combining the forecasts of the component series. The empirical results show the superior performance of the proposed methodology compared to plain neural network forecasting models. In addition, the results show clear differences in forecasting performance among the different network types.
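
An assumed sketch of such a decompose-model-recombine pipeline (not the paper's exact setup; the wavelet, lag length, and network size are placeholders):

```python
# Minimal sketch (assumed): wavelet decomposition, one small neural model
# per component, and recombination of the component forecasts.
import numpy as np
import pywt
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
series = np.cumsum(rng.normal(size=512))   # stand-in for the daily index

# Split the series into wavelet components: reconstruct each level with all
# other levels zeroed, so the components sum back to the original series.
coeffs = pywt.wavedec(series, "db4", level=3)
components = []
for i in range(len(coeffs)):
    kept = [c if j == i else np.zeros_like(c) for j, c in enumerate(coeffs)]
    components.append(pywt.waverec(kept, "db4")[:len(series)])

def forecast_next(x, lags=8):
    # One-step-ahead forecast of a component from its last `lags` values.
    X = np.array([x[i:i + lags] for i in range(len(x) - lags)])
    y = x[lags:]
    model = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                         random_state=0).fit(X, y)
    return model.predict(x[-lags:].reshape(1, -1))[0]

# The final forecast combines the per-component forecasts by summation.
print("forecast:", sum(forecast_next(c) for c in components))
```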


Author(s):  
Bubacarr Bah ◽  
Holger Rauhut ◽  
Ulrich Terstiege ◽  
Michael Westdickenberg

Abstract We study the convergence of gradient flows related to learning deep linear neural networks (where the activation function is the identity map) from data. In this case, the composition of the network layers amounts to simply multiplying the weight matrices of all layers together, resulting in an overparameterized problem. The gradient flow with respect to these factors can be re-interpreted as a Riemannian gradient flow on the manifold of rank-$r$ matrices endowed with a suitable Riemannian metric. We show that the flow always converges to a critical point of the underlying functional. Moreover, we establish that, for almost all initializations, the flow converges to a global minimum on the manifold of rank $k$ matrices for some $k\leq r$.
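
A sketch of the setup as we understand it (notation assumed, not copied from the paper): with identity activations, a depth-$N$ network collapses to a matrix product, and gradient flow evolves each factor along the negative gradient of the loss.

```latex
% Sketch of the setup (notation assumed): a deep linear network composes to
% a single matrix product, and gradient flow acts on each factor.
\[
  f(x) = W_N W_{N-1} \cdots W_1 x, \qquad
  L(W_1,\dots,W_N) = \tfrac{1}{2}\,\lVert W_N \cdots W_1 X - Y \rVert_F^2 ,
\]
\[
  \dot{W}_j(t) = -\,\nabla_{W_j} L\bigl(W_1(t),\dots,W_N(t)\bigr),
  \qquad j = 1,\dots,N .
\]
% Viewing the product W = W_N \cdots W_1 as a single variable, this flow
% induces a Riemannian gradient flow on the manifold of rank-r matrices.
```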

