Using Summary Layers to Probe Neural Network Behaviour

2020 ◽  
Vol 32 (2) ◽  
Author(s):  
Marelie Hattingh Davel

No framework exists that can explain and predict the generalisation ability of deep neural networks in general circumstances. In fact, this question has not been answered even for some of the least complicated neural network architectures: fully-connected feedforward networks with rectified linear activations and a limited number of hidden layers. For such an architecture, we show how adding a summary layer to the network makes it more amenable to analysis, and allows us to define the conditions that are required to guarantee that a set of samples will all be classified correctly. This process does not describe the generalisation behaviour of these networks, but produces a number of metrics that are useful for probing their learning and generalisation behaviour. We support the analytical conclusions with empirical results, both to confirm that the mathematical guarantees hold in practice, and to demonstrate the use of the analysis process.

2022 ◽  
Vol 4 (4) ◽  
pp. 1-22
Author(s):  
Valentina Candiani ◽  
Matteo Santacesaria

We consider the problem of the detection of brain hemorrhages from three-dimensional (3D) electrical impedance tomography (EIT) measurements. This is a condition requiring urgent treatment, for which EIT might provide a portable and quick diagnosis. We employ two neural network architectures, a fully connected and a convolutional one, for the classification of hemorrhagic and ischemic strokes. The networks are trained on a dataset of 40,000 samples of synthetic electrode measurements generated with the complete electrode model on realistic heads with a 3-layer structure. We consider changes in head anatomy and layers, electrode position, measurement noise and conductivity values. We then test the networks on several datasets of unseen EIT data, with more complex stroke modeling (different shapes and volumes), higher levels of noise and different amounts of electrode misplacement. On most test datasets we achieve ≥ 90% average accuracy with fully connected neural networks, while the convolutional ones display an average accuracy ≥ 80%. Despite the use of simple neural network architectures, the results obtained are very promising and motivate the application of EIT-based classification methods on real phantoms and ultimately on human patients.


2020 ◽  
Author(s):  
Ronnypetson Da Silva ◽  
Valter M. Filho ◽  
Mario Souza

Many works that apply Deep Neural Networks (DNNs) to Speech Emotion Recognition (SER) use single datasets, or train and evaluate the models separately when using multiple datasets. Those datasets are constructed with specific guidelines, and the subjective nature of SER labels makes it difficult to obtain robust and general models. We investigate how DNNs learn shared representations for different datasets in both multi-task and unified setups. We also analyse how each dataset benefits from others in different combinations of datasets and popular neural network architectures. We show that the longstanding belief that more data results in more general models does not always hold for SER, as a different combination of datasets and meta-parameters yields the best result for each of the analysed datasets.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Anand Ramachandran ◽  
Steven S. Lumetta ◽  
Eric W. Klee ◽  
Deming Chen

Abstract

Background: Modern next-generation and third-generation sequencing methods, such as the Illumina and PacBio Circular Consensus Sequencing platforms, provide accurate sequencing data. Parallel developments in Deep Learning have enabled the application of Deep Neural Networks to variant calling, surpassing the accuracy of classical approaches in many settings. DeepVariant, arguably the most popular among such methods, transforms the problem of variant calling into one of image recognition, where a Deep Neural Network analyzes sequencing data that is formatted as images, achieving high accuracy. In this paper, we explore an alternative approach to designing Deep Neural Networks for variant calling, where we use meticulously designed Deep Neural Network architectures and customized variant inference functions that account for the underlying nature of sequencing data instead of converting the problem to one of image recognition.

Results: Results from 27 whole-genome variant calling experiments spanning Illumina, PacBio and hybrid Illumina-PacBio settings suggest that our method allows vastly smaller Deep Neural Networks to outperform the Inception-v3 architecture used in DeepVariant for indel and substitution-type variant calls. For example, our method reduces the number of indel call errors by up to 18%, 55% and 65% for Illumina, PacBio and hybrid Illumina-PacBio variant calling respectively, compared to a similarly trained DeepVariant pipeline. In these cases, our models are between 7 and 14 times smaller.

Conclusions: We believe that the improved accuracy and problem-specific customization of our models will enable more accurate pipelines and further method development in the field. HELLO is available at https://github.com/anands-repo/hello


2019 ◽  
Vol 63 (7) ◽  
pp. 1031-1038
Author(s):  
Zongjie Ma ◽  
Abdul Sattar ◽  
Jun Zhou ◽  
Qingliang Chen ◽  
Kaile Su

Abstract Dropout has been proven to be an effective technique for regularizing deep neural networks (DNNs) and preventing the co-adaptation of their neurons. It randomly drops units with probability p during the training stage of a DNN to avoid overfitting. The working mechanism of dropout can be interpreted as efficiently combining exponentially many different neural network architectures in an approximate way, leading to a powerful ensemble. In this work, we propose a novel diversification strategy for dropout, which aims at generating more distinct neural network architectures in fewer iterations. Units dropped in the last forward propagation are marked. A unit selected for dropping in the current forward propagation is then retained if it was marked in the last forward propagation; only the units from the most recent forward propagation carry marks. We call this new regularization scheme Tabu dropout. Its significance lies in that it has no extra parameters compared with the standard dropout strategy and is computationally efficient as well. Experiments conducted on four public datasets show that Tabu dropout improves the performance of standard dropout, yielding better generalization capability.
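The marking scheme described above can be illustrated in a few lines (a hypothetical NumPy sketch, not the authors' implementation): a mask is sampled as in standard inverted dropout, except that any unit dropped in the previous forward pass is retained this time.

```python
import numpy as np

rng = np.random.default_rng(0)

def tabu_dropout_mask(n_units, p, prev_dropped):
    """Sample an inverted-dropout mask, retaining any unit that was
    dropped (marked) in the previous forward pass."""
    candidates = rng.random(n_units) < p          # units selected for dropping
    dropped = candidates & ~prev_dropped          # tabu rule: marked units are kept
    mask = (~dropped).astype(float) / (1.0 - p)   # scale survivors (inverted dropout)
    return mask, dropped                          # `dropped` becomes the next pass's marks

prev = np.zeros(8, dtype=bool)
mask1, marks1 = tabu_dropout_mask(8, 0.5, prev)
mask2, marks2 = tabu_dropout_mask(8, 0.5, marks1)
# no unit is dropped in two consecutive forward passes
```

Because the marks survive only one pass, the scheme adds no parameters and essentially no cost beyond standard dropout, while forcing consecutive thinned sub-networks to differ.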


Author(s):  
Dr. Abul Bashar

Deep learning, a subcategory of machine learning, follows the human instinct of learning by example to produce accurate results. It trains a computer framework to classify tasks directly from documents available in the form of text, images, or sound. Most often, deep learning uses neural networks to perform this classification, and such models are referred to as deep neural networks. One of the deep neural networks used across the broadest range of applications is the convolutional neural network, which provides an automated way of feature extraction by learning features directly from images or text, unlike classical machine learning, where features are extracted manually. This gives deep learning neural networks a state-of-the-art accuracy that often surpasses even human performance. This paper therefore presents a survey of the deep learning neural network architectures used in various applications to achieve accurate classification with automated feature extraction.


2021 ◽  
Vol 37 (2) ◽  
pp. 123-143
Author(s):  
Tuan Minh Luu ◽  
Huong Thanh Le ◽  
Tan Minh Hoang

Deep neural networks have been applied successfully to extractive text summarization tasks when large training datasets are available. However, when the training dataset is not large enough, these models reveal certain limitations that affect the quality of the system's summary. In this paper, we propose an extractive summarization system based on a Convolutional Neural Network and a Fully Connected network for sentence selection. The pretrained multilingual BERT model is used to generate embedding vectors from the input text. These vectors are combined with TF-IDF values to produce the input of the text summarization system. Redundant sentences are eliminated from the output summary by the Maximal Marginal Relevance method. Our system is evaluated on both English and Vietnamese, using the CNN and Baomoi datasets respectively. Experimental results show that our system achieves better results compared to existing works using the same datasets, confirming that our approach can be effectively applied to summarize both English and Vietnamese texts.
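The Maximal Marginal Relevance step mentioned above can be sketched as follows (a minimal, self-contained illustration with made-up similarity scores, not the authors' code): each greedy pick trades relevance to the document against similarity to the sentences already selected.

```python
import numpy as np

def mmr_select(relevance, sim, k, lam=0.5):
    """Greedy Maximal Marginal Relevance sentence selection.

    relevance[i]: similarity of sentence i to the whole document
    sim[i][j]:    similarity between sentences i and j
    lam:          trade-off between relevance and redundancy
    """
    selected, candidates = [], list(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

relevance = np.array([0.90, 0.88, 0.50])          # sentence 1 is a near-duplicate of 0
sim = np.array([[1.00, 0.98, 0.10],
                [0.98, 1.00, 0.10],
                [0.10, 0.10, 1.00]])
summary = mmr_select(relevance, sim, k=2)
# the redundant sentence 1 is skipped in favour of the less similar sentence 2
```

In the system described above, the similarity scores would come from the BERT/TF-IDF sentence representations rather than being given directly.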


Author(s):  
Vikas Verma ◽  
Alex Lamb ◽  
Juho Kannala ◽  
Yoshua Bengio ◽  
David Lopez-Paz

We introduce Interpolation Consistency Training (ICT), a simple and computationally efficient algorithm for training Deep Neural Networks in the semi-supervised learning paradigm. ICT encourages the prediction at an interpolation of unlabeled points to be consistent with the interpolation of the predictions at those points. In classification problems, ICT moves the decision boundary to low-density regions of the data distribution. Our experiments show that ICT achieves state-of-the-art performance when applied to standard neural network architectures on the CIFAR-10 and SVHN benchmark datasets.
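The ICT objective can be written in a few lines (a hypothetical NumPy sketch; the paper additionally computes targets with a mean-teacher network, which is omitted here): the prediction at a mixup of two unlabeled inputs is pushed toward the same mixup of their individual predictions.

```python
import numpy as np

rng = np.random.default_rng(1)

def ict_consistency_loss(predict, u1, u2, lam):
    """Interpolation consistency: predict(lam*u1 + (1-lam)*u2) should
    match lam*predict(u1) + (1-lam)*predict(u2)."""
    mixed = lam * u1 + (1.0 - lam) * u2
    target = lam * predict(u1) + (1.0 - lam) * predict(u2)
    return float(np.mean((predict(mixed) - target) ** 2))

u1, u2 = rng.normal(size=4), rng.normal(size=4)
W = rng.normal(size=(4, 3))

linear = lambda x: x @ W                    # a linear model is already consistent
relu_net = lambda x: np.maximum(x @ W, 0)   # a ReLU model generally is not

loss_lin = ict_consistency_loss(linear, u1, u2, lam=0.3)
loss_relu = ict_consistency_loss(relu_net, u1, u2, lam=0.3)
```

A purely linear predictor incurs (numerically) zero penalty, which is why minimizing this loss flattens the decision function in the low-density regions between unlabeled points.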


Author(s):  
Shiva Prasad Kasiviswanathan ◽  
Nina Narodytska ◽  
Hongxia Jin

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: given a target network architecture, can we design a 'smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever-increasing storage and memory requirements of these networks pose a problem in resource-constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.
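To make the idea concrete, here is a much-simplified sketch of randomized sketching along a single dimension of a fully connected layer, using a count-sketch-style random hash (an illustrative assumption, not the paper's full tensor framework): the layer stores k × d_out parameters instead of d_in × d_out.

```python
import numpy as np

rng = np.random.default_rng(0)

def count_sketch(d, k):
    """Sparse random sketching matrix S (k x d): each input coordinate
    is hashed to one of k buckets with a random sign."""
    S = np.zeros((k, d))
    S[rng.integers(0, k, size=d), np.arange(d)] = rng.choice([-1.0, 1.0], size=d)
    return S

d_in, d_out, k = 64, 32, 16
W = rng.normal(size=(d_in, d_out))   # original dense layer: d_in * d_out parameters
S = count_sketch(d_in, k)

W_sketch = S @ W                     # compressed parameters: only k * d_out values
x = rng.normal(size=d_in)
y_exact = x @ W
y_approx = (S @ x) @ W_sketch        # E[S.T @ S] = I, so this approximates x @ W
```

In the paper's setting the sketched parameters would be trained directly (rather than derived from a pretrained W as here), and the technique is applied along several tensor dimensions of both convolutional and fully connected layers.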


Author(s):  
Swathi Jamjala Narayanan ◽  
Boominathan Perumal ◽  
Jayant G. Rohra

Nature-inspired algorithms have been productively applied to train neural network architectures. Other mechanisms exist to optimize the parameters of neural networks, such as gradient descent, second-order methods and Levenberg-Marquardt methods. Compared to gradient-based methods, nature-inspired algorithms are found to be less sensitive to the initial weights and less likely to become trapped in local optima. Despite these benefits, some nature-inspired algorithms also suffer from stagnation when applied to neural networks. Another challenge in applying nature-inspired techniques to neural networks is handling the large-dimensional, correlated weight space. Hence, there arises a need for scalable nature-inspired algorithms for high-dimensional neural network optimization. In this chapter, the characteristics of nature-inspired techniques for optimizing neural network architectures are studied, along with their applicability, advantages and limitations/challenges.
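As a small illustration of the gradient-free training such algorithms enable (a toy particle swarm optimization sketch with assumed hyperparameters, not drawn from the chapter), PSO can search the weight space of a tiny network directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR data: not linearly separable, so the search must find genuinely
# nonlinear weights for a 2-2-1 tanh network (9 parameters in total).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0.0, 1.0, 1.0, 0.0])

def loss(w):
    W1, b1, W2, b2 = w[:4].reshape(2, 2), w[4:6], w[6:8], w[8]
    return float(np.mean((np.tanh(X @ W1 + b1) @ W2 + b2 - y) ** 2))

n, dim = 30, 9
pos = rng.normal(scale=2.0, size=(n, dim))        # particle positions = weight vectors
vel = np.zeros_like(pos)
pbest, pbest_f = pos.copy(), np.array([loss(p) for p in pos])
g = pbest_f.argmin()
gbest, gbest_f = pbest[g].copy(), pbest_f[g]
init_f = gbest_f

for _ in range(300):
    r1, r2 = rng.random((2, n, dim))
    # standard PSO update: inertia + pull toward personal and global bests
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = pos + vel
    f = np.array([loss(p) for p in pos])
    better = f < pbest_f
    pbest[better], pbest_f[better] = pos[better], f[better]
    if f.min() < gbest_f:
        gbest, gbest_f = pos[f.argmin()].copy(), f.min()
```

No gradients are computed anywhere, and the swarm typically drives the XOR error well below the 0.25 mean-squared error that any linear model is limited to; the high-dimensional, correlated weight spaces of realistic networks are exactly where this naive formulation starts to struggle.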


1991 ◽  
Vol 01 (04) ◽  
pp. 317-326 ◽  
Author(s):  
Hans Henrik Thodberg

A technique for constructing neural network architectures with better ability to generalize is presented under the name Ockham's Razor: several networks are trained and then pruned by removing connections one by one and retraining. The networks which achieve fewest connections generalize best. The method is tested on a classification of bit strings (the contiguity problem): the optimal architecture emerges, resulting in perfect generalization. The internal representation of the network changes substantially during the retraining, and this distinguishes the method from previous pruning studies.
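The prune-and-retrain loop can be illustrated on a deliberately simple linear stand-in (a hypothetical sketch with synthetic data; the paper prunes trained networks, not least-squares fits): the smallest-magnitude surviving connection is removed, and the survivors are retrained, one connection at a time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression with a sparse ground truth: three of the six
# "connections" are genuinely useless and should be pruned away.
X = rng.normal(size=(100, 6))
true_w = np.array([1.5, -2.0, 0.0, 0.0, 0.0, 0.7])
y = X @ true_w + 0.01 * rng.normal(size=100)

def retrain(mask):
    """Refit, by least squares, only the connections still active in `mask`."""
    w = np.zeros(6)
    w[mask] = np.linalg.lstsq(X[:, mask], y, rcond=None)[0]
    return w

mask = np.ones(6, dtype=bool)
w = retrain(mask)
for _ in range(3):                # remove one connection, then retrain the rest
    active = np.flatnonzero(mask)
    mask[active[np.argmin(np.abs(w[active]))]] = False
    w = retrain(mask)
# the three useless connections (indices 2, 3, 4) are pruned; fit quality survives
```

Here the retrained weights barely move, but the paper's point is precisely that in a nonlinear network the internal representation can reorganize substantially during retraining, which is what distinguishes the method from one-shot pruning.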

