Periodic modes of group dominance in fully coupled neural networks

Nonlinear systems of delay differential equations, which serve as mathematical models of fully connected networks of impulse neurons, are considered. The purpose of this work is to study the dynamic properties of one special class of solutions to these systems. Large parameter methods are used to study the existence and stability, in the considered models, of special periodic motions: the so-called group dominance or k-dominance modes, where k ∈ N. Results. It is shown that each such regime is a relaxation cycle in which exactly k components perform synchronous impulse oscillations while all other components are asymptotically small. The maximum number of stable coexisting group dominance cycles in the system, with an appropriate choice of parameters, is 2^m − 1, where m is the number of network elements. Conclusion. The considered model, with the maximum possible number of couplings, allows us to describe the most complex and diverse behavior that may be observed in biological neural associations. A feature of the k-dominance modes considered here is that some of the network neurons are in a non-working (refractory) state. Each periodic k-dominance mode can be associated with a binary vector (α1, α2, ..., αm), where αj = 1 if the j-th neuron is active and αj = 0 otherwise. Consequently, these modes can be used to build devices with associative memory based on artificial neural networks.
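
The binary-vector labelling of k-dominance modes can be illustrated with a short sketch. It only enumerates candidate labels: every binary vector of length m except the all-zero one, which gives 2^m − 1 vectors, grouped by the number k of active neurons. It says nothing about which modes are actually stable for a given parameter choice. The function name `dominance_labels` is hypothetical and not taken from the paper.

```python
from itertools import product

def dominance_labels(m):
    """Return all nonzero binary vectors of length m, grouped by k (number of active neurons)."""
    groups = {}
    for alpha in product((0, 1), repeat=m):
        k = sum(alpha)
        if k == 0:
            continue  # the all-zero vector has no active neuron, so it labels no mode
        groups.setdefault(k, []).append(alpha)
    return groups

# Illustrative network size (an assumption, not a value from the paper).
groups = dominance_labels(3)
total = sum(len(vectors) for vectors in groups.values())
print(total)      # number of candidate labels for m = 3
print(groups[2])  # labels of the 2-dominance candidates
```

For m = 3 the sketch lists three 1-dominance, three 2-dominance, and one 3-dominance label, seven in total.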

Binary and Multiclass Text Classification by Means of Separable Convolutional Neural Network

Inventions ◽ 10.3390/inventions6040070 ◽ 2021 ◽ Vol 6 (4) ◽ pp. 70

Author(s): Elena Solovyeva ◽ Ali Abdullah

Keyword(s): Neural Network ◽ Neural Networks ◽ Convolutional Neural Network ◽ Recurrent Neural Networks ◽ Low Cost ◽ Computational Cost ◽ High Accuracy ◽ Activation Functions ◽ Fully Connected ◽ Fully Connected Networks

In this paper, the structure of a separable convolutional neural network consisting of an embedding layer, separable convolutional layers, a convolutional layer, and global average pooling is presented for binary and multiclass text classification. The advantage of the proposed structure is the absence of multiple fully connected layers, which are used to increase classification accuracy but raise the computational cost. The combination of low-cost separable convolutional layers and a convolutional layer is proposed to achieve high accuracy while reducing the complexity of neural classifiers. The advantages are demonstrated on binary and multiclass classification of written texts by the proposed networks with sigmoid and Softmax activation functions in the convolutional layer. In both binary and multiclass classification, the accuracy obtained by separable convolutional neural networks is higher than that of some investigated types of recurrent neural networks and fully connected networks.
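
The "low-cost" claim for separable convolutions can be made concrete by counting trainable parameters. The sketch below compares a standard 1D convolution with a depthwise-separable one (depthwise filter per input channel followed by a 1x1 pointwise convolution). The layer sizes are illustrative assumptions, not the configuration used in the paper, and a depth multiplier of 1 with a single bias vector on the output is assumed.

```python
def conv1d_params(kernel_size, c_in, c_out, bias=True):
    """Trainable parameters of a standard 1D convolution."""
    return kernel_size * c_in * c_out + (c_out if bias else 0)

def separable_conv1d_params(kernel_size, c_in, c_out, bias=True):
    """Trainable parameters of a depthwise-separable 1D convolution (depth multiplier 1)."""
    depthwise = kernel_size * c_in  # one filter of length kernel_size per input channel
    pointwise = c_in * c_out        # 1x1 convolution that mixes the channels
    return depthwise + pointwise + (c_out if bias else 0)

# Hypothetical layer: kernel size 5, 128 input channels, 128 output channels.
standard = conv1d_params(5, 128, 128)
separable = separable_conv1d_params(5, 128, 128)
print(standard, separable, round(standard / separable, 2))
```

With these assumed sizes the separable layer needs roughly a fifth of the parameters of the standard layer; the saving grows with the kernel size.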

Convergence Behavior of DNNs with Mutual-Information-Based Regularization

Entropy ◽ 10.3390/e22070727 ◽ 2020 ◽ Vol 22 (7) ◽ pp. 727 ◽ Cited By ~ 1

Author(s): Hlynur Jónsson ◽ Giovanni Cherubini ◽ Evangelos Eleftheriou

Keyword(s): Neural Networks ◽ Mutual Information ◽ Low Complexity ◽ High Dimensional ◽ Test Accuracy ◽ Compression Phase ◽ Hidden Layer ◽ Low Dimensional ◽ Fully Connected ◽ Fully Connected Networks

Information theory concepts are leveraged with the goal of better understanding and improving Deep Neural Networks (DNNs). The information plane of neural networks describes the behavior during training of the mutual information at various depths between input/output and hidden-layer variables. Previous analysis revealed that most of the training epochs are spent on compressing the input, in some networks where finiteness of the mutual information can be established. However, the estimation of mutual information is nontrivial for high-dimensional continuous random variables. Therefore, the computation of the mutual information for DNNs and its visualization on the information plane mostly focused on low-complexity fully connected networks. In fact, even the existence of the compression phase in complex DNNs has been questioned and viewed as an open problem. In this paper, we present the convergence of mutual information on the information plane for a high-dimensional VGG-16 Convolutional Neural Network (CNN) by resorting to Mutual Information Neural Estimation (MINE), thus confirming and extending the results obtained with low-dimensional fully connected networks. Furthermore, we demonstrate the benefits of regularizing a network, especially for a large number of training epochs, by adopting mutual information estimates as additional terms in the loss function characteristic of the network. Experimental results show that the regularization stabilizes the test accuracy and significantly reduces its variance.
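
For orientation, MINE estimates mutual information by maximizing the Donsker-Varadhan lower bound over a neural network T_θ (the "statistics network"). The formula below is the standard form of that bound; the symbols X (input or output variable), Z (hidden-layer variable), and T_θ are notation assumed here, and the abstract does not say which variant of the estimator or which form of the regularization term the authors use.

```latex
I(X;Z) \;\ge\; \sup_{\theta}\;
  \mathbb{E}_{P_{XZ}}\!\left[ T_\theta(x,z) \right]
  \;-\; \log \mathbb{E}_{P_X \otimes P_Z}\!\left[ e^{T_\theta(x,z)} \right]
```

The first expectation is taken over joint samples and the second over samples from the product of the marginals, which in practice is usually approximated by shuffling z within a batch.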

Notes on the Symmetries of 2-Layer ReLU-Networks

Proceedings of the Northern Lights Deep Learning Workshop ◽ 10.7557/18.5150 ◽ 2020 ◽ Vol 1 ◽ pp. 6

Author(s): Henning Petzka ◽ Martin Trimmel ◽ Cristian Sminchisescu

Keyword(s): Neural Networks ◽ Deep Neural Networks ◽ Complete Characterization ◽ Activation Functions ◽ Network Function ◽ Fully Connected ◽ Fully Connected Networks

Symmetries in neural networks allow different weight configurations to produce the same network function. For odd activation functions, the set of transformations mapping between such configurations has been studied extensively, but less is known for neural networks with ReLU activation functions. We give a complete characterization for fully-connected networks with two layers. Apart from two well-known transformations, only degenerate situations allow additional transformations that leave the network function unchanged. Reduction steps can remove only part of the degenerate cases. Finally, we present a non-degenerate situation for deep neural networks leading to new transformations that leave the network function intact.
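
A small numerical check can illustrate weight-space symmetries of a two-layer ReLU network. The sketch assumes that the two well-known transformations are (a) permuting the hidden units and (b) scaling a hidden unit's incoming weights and bias by c > 0 while dividing its outgoing weight by c; the abstract does not name them, so this reading is an assumption. All weights and inputs are made-up values.

```python
def relu(z):
    return z if z > 0.0 else 0.0

def two_layer(x, W, b, v):
    """f(x) = sum_j v[j] * relu(W[j] . x + b[j]) for a single input vector x."""
    return sum(
        vj * relu(sum(wi * xi for wi, xi in zip(wj, x)) + bj)
        for wj, bj, vj in zip(W, b, v)
    )

# Hypothetical network with 2 inputs and 3 hidden units.
W = [[1.0, -2.0], [0.5, 3.0], [-1.5, 0.25]]
b = [0.1, -0.4, 0.7]
v = [2.0, -1.0, 0.5]

# (a) Permute the hidden units.
perm = [2, 0, 1]
W_p, b_p, v_p = [W[j] for j in perm], [b[j] for j in perm], [v[j] for j in perm]

# (b) Positive rescaling: relu(c * z) = c * relu(z) for c > 0.
c = [2.0, 0.5, 4.0]
W_s = [[cj * w for w in wj] for cj, wj in zip(c, W)]
b_s = [cj * bj for cj, bj in zip(c, b)]
v_s = [vj / cj for cj, vj in zip(c, v)]

inputs = [[0.3, -1.2], [2.0, 0.5], [-1.0, -1.0]]
max_diff_perm = max(abs(two_layer(x, W, b, v) - two_layer(x, W_p, b_p, v_p)) for x in inputs)
max_diff_scale = max(abs(two_layer(x, W, b, v) - two_layer(x, W_s, b_s, v_s)) for x in inputs)
print(max_diff_perm, max_diff_scale)
```

Both differences stay at the level of floating-point rounding, so the three weight configurations define the same function on the tested inputs.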

VECA: A Method for Detecting Overfitting in Neural Networks (Student Abstract)

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i10.7167 ◽ 2020 ◽ Vol 34 (10) ◽ pp. 13791-13792

Author(s): Liangzhu Ge ◽ Yuexian Hou ◽ Yaju Jiang ◽ Shuai Yao ◽ Chao Yang

Keyword(s): Neural Networks ◽ Strong Correlation ◽ Good Predictor ◽ Deep Neural Networks ◽ Training Data ◽ Training Process ◽ Generalization Performance ◽ Validation Set ◽ Fully Connected ◽ Fully Connected Networks

Despite their widespread applications, deep neural networks often tend to overfit the training data. Here, we propose a measure called VECA (Variance of Eigenvalues of Covariance matrix of Activation matrix) and demonstrate that VECA is a good predictor of networks' generalization performance during the training process. Experiments performed on fully-connected networks and convolutional neural networks trained on benchmark image datasets show a strong correlation between test loss and VECA, which suggests that we can calculate VECA to estimate generalization performance without sacrificing training data to be used as a validation set.
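
Read literally, the name VECA suggests the following computation, sketched here under assumptions: rows of the activation matrix are samples, columns are neurons, the covariance is the population covariance of the centered columns, and the variance is the population variance of its eigenvalues. The abstract does not give the exact normalization, so this may differ from the authors' definition. The sketch avoids an eigendecomposition by using two identities for a symmetric matrix C: the sum of its eigenvalues equals trace(C), and the sum of their squares equals the squared Frobenius norm of C.

```python
def covariance(activations):
    """Population covariance of the columns (neurons) over the rows (samples)."""
    n = len(activations)
    d = len(activations[0])
    means = [sum(row[j] for row in activations) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in activations]
    return [[sum(r[i] * r[j] for r in centered) / n for j in range(d)] for i in range(d)]

def veca(activations):
    """Variance of the eigenvalues of the activation covariance matrix."""
    c = covariance(activations)
    d = len(c)
    mean_eig = sum(c[i][i] for i in range(d)) / d           # trace(C) / d
    mean_sq_eig = sum(x * x for row in c for x in row) / d  # ||C||_F^2 / d
    return mean_sq_eig - mean_eig ** 2

# Toy activation matrix (4 samples, 2 neurons); its covariance is diag(0.5, 2.0).
acts = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
score = veca(acts)
print(score)
```

For the toy matrix the eigenvalues are 0.5 and 2.0, so the score is their variance around the mean 1.25.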

On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/355 ◽ 2021

Author(s): Wei Huang ◽ Weitao Du ◽ Richard Yi Da Xu

Keyword(s): Neural Networks ◽ Empirical Investigation ◽ Linear Regime ◽ Nonlinear Networks ◽ Linear Networks ◽ Learning Speed ◽ Deep Networks ◽ Speed Up ◽ Fully Connected ◽ Fully Connected Networks

The prevailing thinking is that orthogonal weights are crucial to enforcing dynamical isometry and speeding up training. The increase in learning speed that results from orthogonal initialization in linear networks has been well-proven. However, while the same is believed to also hold for nonlinear networks when the dynamical isometry condition is satisfied, the training dynamics behind this contention have not been thoroughly explored. In this work, we study the dynamics of ultra-wide networks across a range of architectures, including Fully Connected Networks (FCNs) and Convolutional Neural Networks (CNNs) with orthogonal initialization, via the neural tangent kernel (NTK). Through a series of propositions and lemmas, we prove that two NTKs, one corresponding to Gaussian weights and one to orthogonal weights, are equal when the network width is infinite. Further, during training, the NTK of an orthogonally-initialized infinite-width network should theoretically remain constant. This suggests that orthogonal initialization cannot speed up training in the NTK (lazy training) regime, contrary to the prevailing view. To explore under what circumstances orthogonality can accelerate training, we conduct a thorough empirical investigation outside the NTK regime. We find that when the hyper-parameters are set to achieve a linear regime in nonlinear activation, orthogonal initialization can improve the learning speed with a large learning rate or large depth.
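
As a reminder of what orthogonal initialization means in practice, the sketch below draws a Gaussian matrix and orthonormalizes its rows with modified Gram-Schmidt, then checks that W Wᵀ is the identity. This is an illustrative construction under assumptions, not the authors' code; deep learning frameworks typically obtain the same kind of matrix from a QR decomposition, and the function name `orthogonal_matrix` is hypothetical.

```python
import random

def orthogonal_matrix(n, seed=0):
    """Return an n x n matrix with orthonormal rows, built from Gaussian draws."""
    rng = random.Random(seed)
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for u in rows:
            # Remove the component of v along each row accepted so far.
            proj = sum(a * b for a, b in zip(v, u))
            v = [a - proj * b for a, b in zip(v, u)]
        norm = sum(a * a for a in v) ** 0.5
        if norm > 1e-8:  # skip the (unlikely) nearly dependent draw
            rows.append([a / norm for a in v])
    return rows

W = orthogonal_matrix(4)
gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in W] for r1 in W]
max_err = max(abs(gram[i][j] - (1.0 if i == j else 0.0)) for i in range(4) for j in range(4))
print(max_err)
```

Because such a W preserves vector norms, signals pass through a linear layer without being stretched or shrunk, which is the property behind the dynamical isometry argument mentioned above.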

New stability condition for discrete-time fully coupled neural networks with multivalued neurons

Neurocomputing ◽ 10.1016/j.neucom.2015.04.036 ◽ 2015 ◽ Vol 166 ◽ pp. 38-43 ◽ Cited By ~ 3

Author(s): Wei Zhou ◽ Jacek M. Zurada

Keyword(s): Neural Networks ◽ Discrete Time ◽ Stability Condition ◽ Coupled Neural Networks ◽ Fully Coupled

Topological measurement of deep neural networks using persistent homology

Annals of Mathematics and Artificial Intelligence ◽ 10.1007/s10472-021-09761-3 ◽ 2021

Author(s): Satoru Watanabe ◽ Hayato Yamana

Keyword(s): Neural Networks ◽ Deep Neural Networks ◽ Persistent Homology ◽ Topological Data Analysis ◽ Data Sets ◽ One Dimensional ◽ Novel Approach ◽ The One ◽ Fully Connected ◽ Fully Connected Networks

The inner representation of deep neural networks (DNNs) is indecipherable, which makes it difficult to tune DNN models, control their training process, and interpret their outputs. In this paper, we propose a novel approach to investigate the inner representation of DNNs through topological data analysis (TDA). Persistent homology (PH), one of the outstanding methods in TDA, was employed to investigate the complexities of trained DNNs. We constructed clique complexes on trained DNNs and calculated the one-dimensional PH of DNNs. The PH reveals the combinational effects of multiple neurons in DNNs at different resolutions, which are difficult to capture without using PH. Evaluations were conducted using fully connected networks (FCNs) and networks combining FCNs and convolutional neural networks (CNNs) trained on the MNIST and CIFAR-10 data sets. Evaluation results demonstrate that the PH of DNNs reflects both the excess of neurons and problem difficulty, making PH one of the prominent methods for investigating the inner representation of DNNs.

Numerical modeling of continuous-time fully coupled neural networks

[Proceedings] 1991 IEEE International Joint Conference on Neural Networks ◽ 10.1109/ijcnn.1991.170655 ◽ 1991 ◽ Cited By ~ 1

Author(s): J.M. Zurada ◽ M.J. Kang

Keyword(s): Neural Networks ◽ Numerical Modeling ◽ Continuous Time ◽ Coupled Neural Networks ◽ Fully Coupled

Optimizing Convolutional Neural Network Parameters for Better Image Classification

10.36227/techrxiv.12089358 ◽ 2020

Author(s): Manik Dhingra ◽ Sarthak Rawat ◽ Jinan Fiaidhi

Keyword(s): Neural Network ◽ Neural Networks ◽ Image Classification ◽ Web Service ◽ Recognition Task ◽ Extreme Learning Machines ◽ Data Set ◽ Learning Machines ◽ Fully Connected ◽ Fully Connected Networks

This work aims to improve performance on the image recognition task using convolutional neural networks on the MNIST handwritten digits data set. A range of techniques are compared for improvements in time and accuracy, such as using one-shot Extreme Learning Machines (ELM) in place of iteratively tuned fully-connected networks for classification, using transfer learning for faster convergence of image classification, and enlarging the data set and making models more robust through image augmentation. The final implementation is hosted in the cloud as a web service for better visualization of the prediction results.

Towards Scalable Complete Verification of Relu Neural Networks via Dependency-based Branching

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2021/364 ◽ 2021

Author(s): Panagiotis Kouvaros ◽ Alessio Lomuscio

Keyword(s): Neural Network ◽ Neural Networks ◽ Efficient Method ◽ State Of The Art ◽ Verification Problem ◽ Feed Forward Neural Networks ◽ Network Verification ◽ Performance Gains ◽ Fully Connected ◽ Fully Connected Networks

We introduce an efficient method for the complete verification of ReLU-based feed-forward neural networks. The method implements branching on the ReLU states on the basis of a notion of dependency between the nodes. This results in dividing the original verification problem into a set of sub-problems whose MILP formulations require fewer integrality constraints. We evaluate the method on all of the ReLU-based fully connected networks from the first competition for neural network verification. The experimental results obtained show 145% performance gains over the present state-of-the-art in complete verification.
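
To see why fixing ReLU states removes integrality constraints, it may help to recall the standard big-M MILP encoding of a single ReLU node. The notation is assumed here: the pre-activation is written as \hat{x} with known bounds l < 0 < u, the output as y = max(0, \hat{x}), and δ is the binary variable that records the ReLU state. The abstract does not spell out the authors' exact formulation, so this is the textbook encoding rather than theirs.

```latex
\begin{aligned}
  y &\ge 0, &\qquad y &\ge \hat{x}, \\
  y &\le u\,\delta, &\qquad y &\le \hat{x} - l\,(1 - \delta), \\
  \delta &\in \{0, 1\}, &\qquad l &\le \hat{x} \le u .
\end{aligned}
```

With δ = 1 the constraints force y = \hat{x} and \hat{x} ≥ 0 (active state); with δ = 0 they force y = 0 and \hat{x} ≤ 0 (inactive state). Once a branch fixes δ for a node, that node's constraints become purely linear, so each sub-problem carries one fewer binary variable per fixed node.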