scholarly journals Local Data Debiasing for Fairness Based on Generative Adversarial Training

Algorithms ◽  
2021 ◽  
Vol 14 (3) ◽  
pp. 87
Author(s):  
Ulrich Aïvodji ◽  
François Bidet ◽  
Sébastien Gambs ◽  
Rosin Claude Ngueveu ◽  
Alain Tapp

The widespread use of automated decision processes in many areas of our society raises serious ethical issues with respect to the fairness of the process and the possible resulting discrimination. To solve this issue, we propose a novel adversarial training approach called GANSan for learning a sanitizer whose objective is to prevent the possibility of any discrimination (i.e., direct and indirect) based on a sensitive attribute by removing the attribute itself as well as the existing correlations with the remaining attributes. Our method GANSan is partially inspired by the powerful framework of generative adversarial networks (in particular Cycle-GANs), which offers a flexible way to learn a distribution empirically or to translate between two different distributions. In contrast to prior work, one of the strengths of our approach is that the sanitization is performed in the same space as the original data by only modifying the other attributes as little as possible, thus preserving the interpretability of the sanitized data. Consequently, once the sanitizer is trained, it can be applied to new data locally by an individual on their profile before releasing it. Finally, experiments on real datasets demonstrate the effectiveness of the approach as well as the achievable trade-off between fairness and utility.

2020 ◽  
Vol 34 (07) ◽  
pp. 10909-10916
Author(s):  
Ligong Han ◽  
Ruijiang Gao ◽  
Mun Kim ◽  
Xin Tao ◽  
Bo Liu ◽  
...  

Conditional generative adversarial networks have shown exceptional generation performance over the past few years. However, they require large numbers of annotations. To address this problem, we propose a novel generative adversarial network utilizing weak supervision in the form of pairwise comparisons (PC-GAN) for image attribute editing. In the light of Bayesian uncertainty estimation and noise-tolerant adversarial training, PC-GAN can estimate attribute rating efficiently and demonstrate robust performance in noise resistance. Through extensive experiments, we show both qualitatively and quantitatively that PC-GAN performs comparably with fully-supervised methods and outperforms unsupervised baselines. Code and Supplementary can be found on the project website*.


Author(s):  
Chaowei Xiao ◽  
Bo Li ◽  
Jun-yan Zhu ◽  
Warren He ◽  
Mingyan Liu ◽  
...  

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial exam- ples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply Adv- GAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rate under state-of-the-art defenses compared to other attacks. Our attack has placed the first with 92.76% accuracy on a public MNIST black-box attack challenge.


Author(s):  
Yao Ni ◽  
Dandan Song ◽  
Xi Zhang ◽  
Hao Wu ◽  
Lejian Liao

Generative adversarial networks (GANs) have shown impressive results, however, the generator and the discriminator are optimized in finite parameter space which means their performance still need to be improved. In this paper, we propose a novel approach of adversarial training between one generator and an exponential number of critics which are sampled from the original discriminative neural network via dropout. As discrepancy between outputs of different sub-networks of a same sample can measure the consistency of these critics, we encourage the critics to be consistent to real samples and inconsistent to generated samples during training, while the generator is trained to generate consistent samples for different critics. Experimental results demonstrate that our method can obtain state-of-the-art Inception scores of 9.17 and 10.02 on supervised CIFAR-10 and unsupervised STL-10 image generation tasks, respectively, as well as achieve competitive semi-supervised classification results on several benchmarks. Importantly, we demonstrate that our method can maintain stability in training and alleviate mode collapse.


Author(s):  
Ly Vu ◽  
Quang Uy Nguyen

Machine learning-based intrusion detection hasbecome more popular in the research community thanks to itscapability in discovering unknown attacks. To develop a gooddetection model for an intrusion detection system (IDS) usingmachine learning, a great number of attack and normal datasamples are required in the learning process. While normaldata can be relatively easy to collect, attack data is muchrarer and harder to gather. Subsequently, IDS datasets areoften dominated by normal data and machine learning modelstrained on those imbalanced datasets are ineffective in detect-ing attacks. In this paper, we propose a novel solution to thisproblem by using generative adversarial networks to generatesynthesized attack data for IDS. The synthesized attacks aremerged with the original data to form the augmented dataset.Three popular machine learning techniques are trained on theaugmented dataset. The experiments conducted on the threecommon IDS datasets and one our own dataset show thatmachine learning algorithms achieve better performance whentrained on the augmented dataset of the generative adversarialnetworks compared to those trained on the original datasetand other sampling techniques. The visualization techniquewas also used to analyze the properties of the synthesizeddata of the generative adversarial networks and the others.


Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 410 ◽  
Author(s):  
Likun Cai ◽  
Yanjie Chen ◽  
Ning Cai ◽  
Wei Cheng ◽  
Hao Wang

Generative Adversarial Nets (GANs) are one of the most popular architectures for image generation, which has achieved significant progress in generating high-resolution, diverse image samples. The normal GANs are supposed to minimize the Kullback–Leibler divergence between distributions of natural and generated images. In this paper, we propose the Alpha-divergence Generative Adversarial Net (Alpha-GAN) which adopts the alpha divergence as the minimization objective function of generators. The alpha divergence can be regarded as a generalization of the Kullback–Leibler divergence, Pearson χ 2 divergence, Hellinger divergence, etc. Our Alpha-GAN employs the power function as the form of adversarial loss for the discriminator with two-order indexes. These hyper-parameters make our model more flexible to trade off between the generated and target distributions. We further give a theoretical analysis of how to select these hyper-parameters to balance the training stability and the quality of generated images. Extensive experiments of Alpha-GAN are performed on SVHN and CelebA datasets, and evaluation results show the stability of Alpha-GAN. The generated samples are also competitive compared with the state-of-the-art approaches.


2019 ◽  
Vol 879 ◽  
pp. 217-254 ◽  
Author(s):  
Sangseung Lee ◽  
Donghyun You

Unsteady flow fields over a circular cylinder are used for training and then prediction using four different deep learning networks: generative adversarial networks with and without consideration of conservation laws; and convolutional neural networks with and without consideration of conservation laws. Flow fields at future occasions are predicted based on information on flow fields at previous occasions. Predictions of deep learning networks are made for flow fields at Reynolds numbers that were not used during training. Physical loss functions are proposed to explicitly provide information on conservation of mass and momentum to deep learning networks. An adversarial training is applied to extract features of flow dynamics in an unsupervised manner. Effects of the proposed physical loss functions and adversarial training on predicted results are analysed. Captured and missed flow physics from predictions are also analysed. Predicted flow fields using deep learning networks are in good agreement with flow fields computed by numerical simulations.


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 389
Author(s):  
Esteban Piacentino ◽  
Alvaro Guarner ◽  
Cecilio Angulo

In personalized healthcare, an ecosystem for the manipulation of reliable and safe private data should be orchestrated. This paper describes an approach for the generation of synthetic electrocardiograms (ECGs) based on Generative Adversarial Networks (GANs) with the objective of anonymizing users’ information for privacy issues. This is intended to create valuable data that can be used both in educational and research areas, while avoiding the risk of a sensitive data leakage. As GANs are mainly exploited on images and video frames, we are proposing general raw data processing after transformation into an image, so it can be managed through a GAN, then decoded back to the original data domain. The feasibility of our transformation and processing hypothesis is primarily demonstrated. Next, from the proposed procedure, main drawbacks for each step in the procedure are addressed for the particular case of ECGs. Hence, a novel research pathway on health data anonymization using GANs is opened and further straightforward developments are expected.


2020 ◽  
Author(s):  
Belén Vega-Márquez ◽  
Cristina Rubio-Escudero ◽  
Isabel Nepomuceno-Chamorro

Abstract The generation of synthetic data is becoming a fundamental task in the daily life of any organization due to the new protection data laws that are emerging. Because of the rise in the use of Artificial Intelligence, one of the most recent proposals to address this problem is the use of Generative Adversarial Networks (GANs). These types of networks have demonstrated a great capacity to create synthetic data with very good performance. The goal of synthetic data generation is to create data that will perform similarly to the original dataset for many analysis tasks, such as classification. The problem of GANs is that in a classification problem, GANs do not take class labels into account when generating new data, it is treated as any other attribute. This research work has focused on the creation of new synthetic data from datasets with different characteristics with a Conditional Generative Adversarial Network (CGAN). CGANs are an extension of GANs where the class label is taken into account when the new data is generated. The performance of our results has been measured in two different ways: firstly, by comparing the results obtained with classification algorithms, both in the original datasets and in the data generated; secondly, by checking that the correlation between the original data and those generated is minimal.


2019 ◽  
Vol 38 (9) ◽  
pp. 698-705
Author(s):  
Ping Lu ◽  
Yuan Xiao ◽  
Yanyan Zhang ◽  
Nikolaos Mitsakos

A deep-learning-based compressive-sensing technique for reconstruction of missing seismic traces is introduced. The agility of the proposed approach lies in its ability to perfectly resolve the optimization limitation of conventional algorithms that solve inversion problems. It demonstrates how deep generative adversarial networks, equipped with an appropriate loss function that essentially leverages the distribution of the entire survey, can serve as an alternative approach for tackling compressive-sensing problems with high precision and in a computationally efficient manner. The method can be applied on both prestack and poststack seismic data, allowing for superior imaging quality with well-preconditioned and well-sampled field data, during the processing stage. To validate the robustness of the proposed approach on field data, the extent to which amplitudes and phase variations in original data are faithfully preserved is established, while subsurface consistency is also achieved. Several applications to acquisition and processing, such as decreasing bin size, increasing offset and azimuth sampling, or increasing the fold, can directly and immediately benefit from adopting the proposed technique. Furthermore, interpolation based on generative adversarial networks has been found to produce better-sampled data sets, with stronger regularization and attenuated aliasing phenomenon, while providing greater fidelity on steep-dip events and amplitude-variation-with-offset analysis with migration.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Andrew E. Blanchard ◽  
Christopher Stanley ◽  
Debsindhu Bhowmik

AbstractThe process of drug discovery involves a search over the space of all possible chemical compounds. Generative Adversarial Networks (GANs) provide a valuable tool towards exploring chemical space and optimizing known compounds for a desired functionality. Standard approaches to training GANs, however, can result in mode collapse, in which the generator primarily produces samples closely related to a small subset of the training data. In contrast, the search for novel compounds necessitates exploration beyond the original data. Here, we present an approach to training GANs that promotes incremental exploration and limits the impacts of mode collapse using concepts from Genetic Algorithms. In our approach, valid samples from the generator are used to replace samples from the training data. We consider both random and guided selection along with recombination during replacement. By tracking the number of novel compounds produced during training, we show that updates to the training data drastically outperform the traditional approach, increasing potential applications for GANs in drug discovery.


Sign in / Sign up

Export Citation Format

Share Document