Scaling up Differentially Private Deep Learning with Fast Per-Example Gradient Clipping

2021 ◽  
Vol 2021 (1) ◽  
pp. 128-144
Author(s):  
Jaewoo Lee ◽  
Daniel Kifer

Abstract: Recent work on Rényi Differential Privacy has shown the feasibility of applying differential privacy to deep learning tasks. Despite their promise, however, differentially private deep networks often lag far behind their non-private counterparts in accuracy, showing the need for more research in model architectures, optimizers, etc. One of the barriers to this expanded research is the training time, which is often orders of magnitude larger than for training non-private networks. The reason for this slowdown is a crucial privacy-related step called “per-example gradient clipping” whose naive implementation undoes the benefits of batch training with GPUs. By analyzing the back-propagation equations we derive new methods for per-example gradient clipping that are compatible with auto-differentiation (e.g., in PyTorch and TensorFlow) and provide better GPU utilization. Our implementation in PyTorch showed significant training speed-ups (by factors of 54x to 94x for training various models with batch sizes of 128). These techniques work for a variety of architectural choices including convolutional layers, recurrent networks, attention, residual blocks, etc.
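As a rough illustration of why naive per-example clipping is slow, the following PyTorch sketch computes one clipped gradient per example in a Python loop, losing GPU batch parallelism; this is the baseline such methods accelerate, and model, loss_fn, and clip_norm are placeholder names, not the authors' code.

```python
import torch

def clipped_grad_sum(model, loss_fn, xs, ys, clip_norm=1.0):
    """Sum of per-example gradients, each clipped to L2 norm clip_norm."""
    grad_sum = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xs, ys):                  # one backward pass per example
        model.zero_grad()
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        loss.backward()
        # Per-example L2 norm taken over all parameters jointly
        norm = torch.sqrt(sum(p.grad.pow(2).sum() for p in model.parameters()))
        scale = min(1.0, (clip_norm / (norm + 1e-6)).item())
        for g, p in zip(grad_sum, model.parameters()):
            g.add_(p.grad, alpha=scale)       # accumulate the clipped gradient
    return grad_sum
```

In DP-SGD, calibrated noise is then added to this clipped sum; the paper's contribution is obtaining the per-example norms from the back-propagation equations in a single batched pass instead of this loop.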

Author(s):  
Jesse A Livezey ◽  
Joshua I Glaser

Abstract: Decoding behavior, perception or cognitive state directly from neural signals is critical for brain–computer interface research and an important tool for systems neuroscience. In the last decade, deep learning has become the state-of-the-art method in many machine learning tasks ranging from speech recognition to image segmentation. The success of deep networks in other domains has led to a new wave of applications in neuroscience. In this article, we review deep learning approaches to neural decoding. We describe the architectures used for extracting useful features from neural recording modalities ranging from spikes to functional magnetic resonance imaging. Furthermore, we explore how deep learning has been leveraged to predict common outputs including movement, speech and vision, with a focus on how pretrained deep networks can be incorporated as priors for complex decoding targets like acoustic speech or images. Deep learning has been shown to be a useful tool for improving the accuracy and flexibility of neural decoding across a wide range of tasks, and we point out areas for future scientific development.
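As a toy illustration of the decoding setting surveyed here (not an example from the review itself), a small PyTorch network might map binned spike counts to 2D movement velocity; the shapes and names below are hypothetical.

```python
import torch
import torch.nn as nn

n_neurons, n_bins = 96, 10                    # hypothetical recording setup
decoder = nn.Sequential(
    nn.Flatten(),                             # (batch, neurons, bins) -> (batch, neurons*bins)
    nn.Linear(n_neurons * n_bins, 256),
    nn.ReLU(),
    nn.Linear(256, 2),                        # decoded output: (vx, vy) velocity
)

spikes = torch.randn(32, n_neurons, n_bins)   # stand-in for real neural data
velocity = decoder(spikes)                    # (32, 2) decoded movement
```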


2020 ◽  
Author(s):  
Harshvardhan Sikka

One of the popular directions in Deep Learning (DL) research has been to build larger and more complex deep networks that can perform well on several different learning tasks, commonly known as multitask learning. This work is usually done within specific domains, e.g., multitask models that perform captioning, translation, and text classification tasks. Some work has been done in building multimodal/crossmodal networks that use deep networks with a combination of different neural network primitives (convolutional layers, recurrent layers, Mixture-of-Experts layers, etc.). This paper explores various topics and ideas that may prove relevant to large, sparse, multitask networks (LSMNs) and examines the potential for a general approach to building and managing these networks. A framework to automatically build, update, and interpret modular LSMNs is presented in the context of current tooling and theory.
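A minimal sketch of the hard-parameter-sharing pattern behind such multitask models, assuming a shared trunk with one head per task (layer sizes and task names are illustrative, not from the paper):

```python
import torch.nn as nn

class MultitaskNet(nn.Module):
    def __init__(self, in_dim=512, hidden=256, n_classes=10, vocab=30000):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.heads = nn.ModuleDict({
            "classification": nn.Linear(hidden, n_classes),
            "captioning": nn.Linear(hidden, vocab),  # stand-in for a full decoder
        })

    def forward(self, x, task):
        # Route every task through the shared trunk, then its own head
        return self.heads[task](self.trunk(x))
```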


2020 ◽  
Vol 34 (04) ◽  
pp. 5085-5092 ◽  
Author(s):  
Wan-Duo Kurt Ma ◽  
J. P. Lewis ◽  
W. Bastiaan Kleijn

We introduce the HSIC (Hilbert-Schmidt independence criterion) bottleneck for training deep neural networks. The HSIC bottleneck is an alternative to the conventional cross-entropy loss and backpropagation that has a number of distinct advantages. It mitigates exploding and vanishing gradients, resulting in the ability to learn very deep networks without skip connections. There is no requirement for symmetric feedback or update locking. We find that the HSIC bottleneck provides performance on MNIST/FashionMNIST/CIFAR10 classification comparable to backpropagation with a cross-entropy target, even when the system is not encouraged to make the output resemble the classification labels. Appending a single layer trained with SGD (without backpropagation) to reformat the information further improves performance.
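For reference, a hedged sketch of the (biased) empirical HSIC estimator with Gaussian kernels, the dependence measure the bottleneck objective is built on; the bandwidths are assumptions, not the paper's settings.

```python
import torch

def gaussian_kernel(x, sigma):
    d = torch.cdist(x, x).pow(2)                 # pairwise squared distances
    return torch.exp(-d / (2 * sigma ** 2))

def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between batches x and y: tr(KHLH) / (m-1)^2."""
    m = x.shape[0]
    K = gaussian_kernel(x.flatten(1), sigma_x)
    L = gaussian_kernel(y.flatten(1), sigma_y)
    H = torch.eye(m) - torch.ones(m, m) / m      # centering matrix
    return torch.trace(K @ H @ L @ H) / (m - 1) ** 2
```

Roughly, each hidden representation Z is trained to reduce HSIC(Z, X) while increasing HSIC(Z, Y), which requires no error signal propagated between layers.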


2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks of statistical and computational intelligence modelling are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competitive tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are implemented on a large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the performed experiments we find that a population size of 20 is likely optimal, giving the lowest training time among all NNs trained by the evolutionary algorithms; the prediction accuracy is somewhat lower, but still acceptable to managers.
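As a schematic of the evolutionary training compared here, a micro-GA weight search for a simple linear forecasting neuron might look as follows; the population size of 20 follows the reported optimum, while the rest (lag features in X, mutation scale, generation count) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def forecast_mse(w, X, y):
    return np.mean((X @ w - y) ** 2)       # one-step-ahead squared error

def micro_ga(X, y, pop_size=20, gens=200, sigma=0.1):
    pop = rng.normal(size=(pop_size, X.shape[1]))        # random initial weights
    for _ in range(gens):
        fit = np.array([forecast_mse(w, X, y) for w in pop])
        elite = pop[np.argsort(fit)[: pop_size // 2]]    # keep the better half
        children = elite + rng.normal(scale=sigma, size=elite.shape)  # mutate
        pop = np.vstack([elite, children])
    return min(pop, key=lambda w: forecast_mse(w, X, y))
```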


Author(s):  
Shikha Bhardwaj ◽  
Gitanjali Pandove ◽  
Pawan Kumar Dahiya

Background: In order to retrieve a particular image from a vast repository of images, an efficient system is required, and such a system is well known as a Content-based image retrieval (CBIR) system. Color is an important attribute of an image, and the proposed system consists of a hybrid color descriptor used for color feature extraction. Deep learning has gained prominence in the current era, so the performance of this fusion-based color descriptor is also analyzed in the presence of deep learning classifiers. Method: This paper describes a comparative experimental analysis of various color descriptors, from which the best two are chosen to form an efficient color-based hybrid system denoted combined color moment-color autocorrelogram (Co-CMCAC). Then, to increase the retrieval accuracy of the hybrid system, a Cascade forward back-propagation neural network (CFBPNN) is used. The classification accuracy obtained using CFBPNN is also compared to that of the Patternnet neural network. Results: The results for the hybrid color descriptor show that the proposed system achieves results of 95.4%, 88.2%, 84.4% and 96.05% on the Corel-1K, Corel-5K, Corel-10K and Oxford Flower benchmark datasets respectively, surpassing many state-of-the-art related techniques. Conclusion: This paper presents an experimental and analytical comparison of different color feature descriptors, namely Color moment (CM), Color auto-correlogram (CAC), Color histogram (CH), Color coherence vector (CCV) and Dominant color descriptor (DCD). The proposed hybrid color descriptor (Co-CMCAC) is used to extract color features, with a Cascade forward back-propagation neural network (CFBPNN) as the classifier, on four benchmark datasets: Corel-1K, Corel-5K, Corel-10K and Oxford Flower.
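As one concrete piece of the hybrid descriptor, a sketch of the color-moment (CM) features, i.e., per-channel mean, standard deviation, and skewness; the autocorrelogram component and the CFBPNN classifier are omitted here.

```python
import numpy as np

def color_moments(image):
    """image: (H, W, 3) RGB array -> 9-dim color moment feature vector."""
    feats = []
    for c in range(3):
        ch = image[..., c].astype(np.float64).ravel()
        mean = ch.mean()
        std = ch.std()
        skew = np.cbrt(((ch - mean) ** 3).mean())  # cube root of third central moment
        feats.extend([mean, std, skew])
    return np.array(feats)
```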


2021 ◽  
Vol 13 (12) ◽  
pp. 2326
Author(s):  
Xiaoyong Li ◽  
Xueru Bai ◽  
Feng Zhou

A deep-learning architecture, dubbed the 2D-ADMM-Net (2D-ADN), is proposed in this article. It provides effective high-resolution 2D inverse synthetic aperture radar (ISAR) imaging under scenarios of low SNR and incomplete data by combining model-based sparse reconstruction and data-driven deep learning. Firstly, the mapping from ISAR images to their corresponding echoes in the wavenumber domain is derived. Then, a 2D alternating direction method of multipliers (ADMM) is unrolled and generalized to a deep network, where all adjustable parameters in the reconstruction layers, nonlinear transform layers, and multiplier update layers are learned by end-to-end training through back-propagation. Since the optimal parameters of each layer are learned separately, 2D-ADN exhibits more representational flexibility and preferable reconstruction performance compared with model-driven methods. At the same time, it facilitates ISAR imaging with limited training samples better than data-driven methods, owing to its simple structure and small number of adjustable parameters. Additionally, benefiting from the good performance of 2D-ADN, a random phase error estimation method is proposed, through which well-focused imaging can be acquired. Experiments demonstrate that, although trained on only a few simulated images, 2D-ADN adapts well to measured data and obtains favorable imaging results with a clear background in a short time.
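A schematic of one unrolled ADMM iteration as a trainable layer, in the spirit of 2D-ADN but reduced to a 1D toy (the paper works in the 2D wavenumber domain); the learnable penalty rho and shrinkage threshold theta are illustrative stand-ins for the paper's adjustable parameters.

```python
import torch
import torch.nn as nn

class ADMMLayer(nn.Module):
    def __init__(self):
        super().__init__()
        self.rho = nn.Parameter(torch.tensor(1.0))    # penalty weight (learned)
        self.theta = nn.Parameter(torch.tensor(0.1))  # shrinkage threshold (learned)

    def soft(self, v):
        # Learned soft-thresholding, the nonlinear transform step
        return torch.sign(v) * torch.clamp(v.abs() - self.theta, min=0.0)

    def forward(self, x, z, u, A, y):
        # Reconstruction step: regularized least squares
        AtA = A.T @ A + self.rho * torch.eye(A.shape[1])
        x = torch.linalg.solve(AtA, A.T @ y + self.rho * (z - u))
        z = self.soft(x + u)                          # nonlinear transform step
        u = u + x - z                                 # multiplier update step
        return x, z, u
```

Stacking several such layers and training the per-layer parameters end-to-end by back-propagation yields an unrolled network analogous to the one described above.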


Algorithms ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 39
Author(s):  
Carlos Lassance ◽  
Vincent Gripon ◽  
Antonio Ortega

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
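A hedged sketch of the graph construction and of problem (i), teacher-geometry mimicking: build a cosine-similarity graph over a batch's intermediate representations and penalize the distance between student and teacher graphs (the normalization and distance choices below are assumptions, not necessarily the paper's).

```python
import torch
import torch.nn.functional as F

def latent_graph(feats):
    """feats: (batch, ...) intermediate representations -> (batch, batch) graph."""
    z = F.normalize(feats.flatten(1), dim=1)
    return z @ z.T                          # cosine-similarity Latent Geometry Graph

def geometry_matching_loss(student_feats, teacher_feats):
    # Distillation by geometry: make the student's graph match the teacher's
    return F.mse_loss(latent_graph(student_feats), latent_graph(teacher_feats))
```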

