scholarly journals Federated Learning with Sparsification-Amplified Privacy and Adaptive Optimization

Author(s):  
Rui Hu ◽  
Yanmin Gong ◽  
Yuanxiong Guo

Federated learning (FL) enables distributed agents to collaboratively learn a centralized model without sharing their raw data with each other. However, data locality does not provide sufficient privacy protection, and it is desirable to facilitate FL with rigorous differential privacy (DP) guarantee. Existing DP mechanisms would introduce random noise with magnitude proportional to the model size, which can be quite large in deep neural networks. In this paper, we propose a new FL framework with sparsification-amplified privacy. Our approach integrates random sparsification with gradient perturbation on each agent to amplify privacy guarantee. Since sparsification would increase the number of communication rounds required to achieve a certain target accuracy, which is unfavorable for DP guarantee, we further introduce acceleration techniques to help reduce the privacy cost. We rigorously analyze the convergence of our approach and utilize Renyi DP to tightly account the end-to-end DP guarantee. Extensive experiments on benchmark datasets validate that our approach outperforms previous differentially-private FL approaches in both privacy guarantee and communication efficiency.

Author(s):  
Chen Qi ◽  
Shibo Shen ◽  
Rongpeng Li ◽  
Zhifeng Zhao ◽  
Qing Liu ◽  
...  

AbstractNowadays, deep neural networks (DNNs) have been rapidly deployed to realize a number of functionalities like sensing, imaging, classification, recognition, etc. However, the computational-intensive requirement of DNNs makes it difficult to be applicable for resource-limited Internet of Things (IoT) devices. In this paper, we propose a novel pruning-based paradigm that aims to reduce the computational cost of DNNs, by uncovering a more compact structure and learning the effective weights therein, on the basis of not compromising the expressive capability of DNNs. In particular, our algorithm can achieve efficient end-to-end training that transfers a redundant neural network to a compact one with a specifically targeted compression rate directly. We comprehensively evaluate our approach on various representative benchmark datasets and compared with typical advanced convolutional neural network (CNN) architectures. The experimental results verify the superior performance and robust effectiveness of our scheme. For example, when pruning VGG on CIFAR-10, our proposed scheme is able to significantly reduce its FLOPs (floating-point operations) and number of parameters with a proportion of 76.2% and 94.1%, respectively, while still maintaining a satisfactory accuracy. To sum up, our scheme could facilitate the integration of DNNs into the common machine-learning-based IoT framework and establish distributed training of neural networks in both cloud and edge.


2021 ◽  
Vol 2050 (1) ◽  
pp. 012006
Author(s):  
Xili Dai ◽  
Chunmei Ma ◽  
Jingwei Sun ◽  
Tao Zhang ◽  
Haigang Gong ◽  
...  

Abstract Training deep neural networks from only a few examples has been an interesting topic that motivated few shot learning. In this paper, we study the fine-grained image classification problem in a challenging few-shot learning setting, and propose the Self-Amplificated Network (SAN), a method based on meta-learning to tackle this problem. The SAN model consists of three parts, which are the Encoder, Amplification and Similarity Modules. The Encoder Module encodes a fine-grained image input into a feature vector. The Amplification Module is used to amplify subtle differences between fine-grained images based on the self attention mechanism which is composed of multi-head attention. The Similarity Module measures how similar the query image and the support set are in order to determine the classification result. In-depth experiments on three benchmark datasets have showcased that our network achieves superior performance over the competing baselines.


2021 ◽  
Author(s):  
Jude TCHAYE-KONDI ◽  
Yanlong Zhai ◽  
Liehuang Zhu

<div>We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our particular case, the edge devices are the only data source for the model to train on the central server. Current privacy-preserving and reducing network latency solutions rely on a pre-trained feature extractor deployed on the devices to help extract only important features from the sensitive dataset. However, finding a pre-trained model or pubic dataset to build a feature extractor for certain tasks may turn out to be very challenging. With the large amount of data generated by edge devices, the edge environment does not really lack data, but its improper access may lead to privacy concerns. In this paper, we present DeepGuess , a new privacy-preserving, and latency aware deeplearning framework. DeepGuess uses a new learning mechanism enabled by the AutoEncoder(AE) architecture called Inductive Learning, which makes it possible to train a central neural network using the data produced by end-devices while preserving their privacy. With inductive learning, sensitive data remains on devices and is not explicitly involved in any backpropagation process. The AE’s Encoder is deployed on devices to extracts and transfers important features to the server. To enhance privacy, we propose a new local deferentially private algorithm that allows the Edge devices to apply random noise to features extracted from their sensitive data before transferred to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and ability to converge on a series of experiments.</div>


Author(s):  
Cunxiao Du ◽  
Zhaozheng Chen ◽  
Fuli Feng ◽  
Lei Zhu ◽  
Tian Gan ◽  
...  

Text classification is one of the fundamental tasks in natural language processing. Recently, deep neural networks have achieved promising performance in the text classification task compared to shallow models. Despite of the significance of deep models, they ignore the fine-grained (matching signals between words and classes) classification clues since their classifications mainly rely on the text-level representations. To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task. In particular, we design a novel framework, EXplicit interAction Model (dubbed as EXAM), equipped with the interaction mechanism. We justified the proposed approach on several benchmark datasets including both multilabel and multi-class text classification tasks. Extensive experimental results demonstrate the superiority of the proposed method. As a byproduct, we have released the codes and parameter settings to facilitate other researches.


Author(s):  
Hongyu Guo ◽  
Yongyi Mao ◽  
Richong Zhang

MixUp (Zhang et al. 2017) is a recently proposed dataaugmentation scheme, which linearly interpolates a random pair of training examples and correspondingly the one-hot representations of their labels. Training deep neural networks with such additional data is shown capable of significantly improving the predictive accuracy of the current art. The power of MixUp, however, is primarily established empirically and its working and effectiveness have not been explained in any depth. In this paper, we develop an understanding for MixUp as a form of “out-of-manifold regularization”, which imposes certain “local linearity” constraints on the model’s input space beyond the data manifold. This analysis enables us to identify a limitation of MixUp, which we call “manifold intrusion”. In a nutshell, manifold intrusion in MixUp is a form of under-fitting resulting from conflicts between the synthetic labels of the mixed-up examples and the labels of original training data. Such a phenomenon usually happens when the parameters controlling the generation of mixing policies are not sufficiently fine-tuned on the training data. To address this issue, we propose a novel adaptive version of MixUp, where the mixing policies are automatically learned from the data using an additional network and objective function designed to avoid manifold intrusion. The proposed regularizer, AdaMixUp, is empirically evaluated on several benchmark datasets. Extensive experiments demonstrate that AdaMixUp improves upon MixUp when applied to the current art of deep classification models.


Author(s):  
Yunpeng Chen ◽  
Xiaojie Jin ◽  
Bingyi Kang ◽  
Jiashi Feng ◽  
Shuicheng Yan

The residual unit and its variations are wildly used in building very deep neural networks for alleviating optimization difficulty. In this work, we revisit the standard residual function as well as its several successful variants and propose a unified framework based on tensor Block Term Decomposition (BTD) to explain these apparently different residual functions from the tensor decomposition view. With the BTD framework, we further propose a novel basic network architecture, named the Collective Residual Unit (CRU). CRU further enhances parameter efficiency of deep residual neural networks by sharing core factors derived from collective tensor factorization over the involved residual units. It enables efficient knowledge sharing across multiple residual units, reduces the number of model parameters, lowers the risk of over-fitting, and provides better generalization ability. Extensive experimental results show that our proposed CRU network brings outstanding parameter efficiency -- it achieves comparable classification performance with ResNet-200 while using a model size as small as ResNet-50 on the ImageNet-1k and Places365-Standard benchmark datasets.


2017 ◽  
Vol 7 (3) ◽  
pp. 17-51 ◽  
Author(s):  
Cynthia Dwork ◽  
Frank McSherry ◽  
Kobbi Nissim ◽  
Adam Smith

We continue a line of research initiated in Dinur and Nissim (2003); Dwork and Nissim (2004); and Blum et al. (2005) on privacy-preserving statistical databases. Consider a trusted server that holds a database of sensitive information. Given a query function $f$ mapping databases to reals, the so-called {\em true answer} is the result of applying $f$ to the database. To protect privacy, the true answer is perturbed by the addition of random noise generated according to a carefully chosen distribution, and this response, the true answer plus noise, is returned to the user. Previous work focused on the case of noisy sums, in which $f = \sum_i g(x_i)$, where $x_i$ denotes the $i$th row of the database and $g$ maps database rows to $[0,1]$. We extend the study to general functions $f$, proving that privacy can be preserved by calibrating the standard deviation of the noise according to the {\em sensitivity} of the function $f$. Roughly speaking, this is the amount that any single argument to $f$ can change its output. The new analysis shows that for several particular applications substantially less noise is needed than was previously understood to be the case. The first step is a very clean definition of privacy---now known as differential privacy---and measure of its loss. We also provide a set of tools for designing and combining differentially private algorithms, permitting the construction of complex differentially private analytical tools from simple differentially private primitives. Finally, we obtain separation results showing the increased value of interactive statistical release mechanisms over non-interactive ones.


Author(s):  
Xuejing Lei ◽  
Ganning Zhao ◽  
Kaitai Zhang ◽  
C.-C. Jay Kuo

An explainable, efficient, and lightweight method for texture generation, called TGHop (an acronym of Texture Generation PixelHop), is proposed in this work. Although synthesis of visually pleasant texture can be achieved by deep neural networks, the associated models are large in size, difficult to explain in theory, and computationally expensive in training. In contrast, TGHop is small in its model size, mathematically transparent, efficient in training and inference, and able to generate high-quality texture. Given an exemplary texture, TGHop first crops many sample patches out of it to form a collection of sample patches called the source. Then, it analyzes pixel statistics of samples from the source and obtains a sequence of fine-to-coarse subspaces for these patches by using the PixelHop++ framework. To generate texture patches with TGHop, we begin with the coarsest subspace, which is called the core, and attempt to generate samples in each subspace by following the distribution of real samples. Finally, texture patches are stitched to form texture images of a large size. It is demonstrated by experimental results that TGHop can generate texture images of superior quality with a small model size and at a fast speed.


2020 ◽  
Author(s):  
Ishika Singh ◽  
Haoyi Zhou ◽  
Kunlin Yang ◽  
Meng Ding ◽  
Bill Lin ◽  
...  

Neural architecture search, which aims to automatically search for architectures (e.g., convolution, max pooling) of neural networks that maximize validation performance, has achieved remarkable progress recently. In many application scenarios, several parties would like to collaboratively search for a shared neural architecture by leveraging data from all parties. However, due to privacy concerns, no party wants its data to be seen by other parties. To address this problem, we propose federated neural architecture search (FNAS), where different parties collectively search for a differentiable architecture by exchanging gradients of architecture variables without exposing their data to other parties. To further preserve privacy, we study differentially-private FNAS (DP-FNAS), which adds random noise to the gradients of architecture variables. We provide theoretical guarantees of DP-FNAS in achieving differential privacy. Experiments show that DP-FNAS can search highly-performant neural architectures while protecting the privacy of individual parties. The code is available at https://github.com/UCSD-AI4H/DP-FNAS


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Fahad Alharbi ◽  
Khalil El Hindi ◽  
Saad Al Ahmadi ◽  
Hussien Alsalamn

Noise in training data increases the tendency of many machine learning methods to overfit the training data, which undermines the performance. Outliers occur in big data as a result of various factors, including human errors. In this work, we present a novel discriminator model for the identification of outliers in the training data. We propose a systematic approach for creating training datasets to train the discriminator based on a small number of genuine instances (trusted data). The noise discriminator is a convolutional neural network (CNN). We evaluate the discriminator’s performance using several benchmark datasets and with different noise ratios. We inserted random noise in each dataset and trained discriminators to clean them. Different discriminators were trained using different numbers of genuine instances with and without data augmentation. We compare the performance of the proposed noise-discriminator method with seven other methods proposed in the literature using several benchmark datasets. Our empirical results indicate that the proposed method is very competitive to the other methods. It actually outperforms them for pair noise.


Sign in / Sign up

Export Citation Format

Share Document