Generation of Synthetic Data with Conditional Generative Adversarial Networks

2020 ◽  
Author(s):  
Belén Vega-Márquez ◽  
Cristina Rubio-Escudero ◽  
Isabel Nepomuceno-Chamorro

Abstract The generation of synthetic data is becoming a fundamental task in the daily life of any organization due to the new data protection laws that are emerging. Because of the rise in the use of Artificial Intelligence, one of the most recent proposals to address this problem is the use of Generative Adversarial Networks (GANs). These types of networks have demonstrated a great capacity to create synthetic data with very good performance. The goal of synthetic data generation is to create data that will perform similarly to the original dataset for many analysis tasks, such as classification. The problem with GANs is that, in a classification problem, they do not take class labels into account when generating new data; the label is treated as any other attribute. This research work has focused on the creation of new synthetic data from datasets with different characteristics with a Conditional Generative Adversarial Network (CGAN). CGANs are an extension of GANs where the class label is taken into account when the new data is generated. The performance of our results has been measured in two different ways: firstly, by comparing the results obtained with classification algorithms, both in the original datasets and in the data generated; secondly, by checking that the correlation between the original data and those generated is minimal.
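As a rough illustration of what "taking the class label into account" can mean in a CGAN, the sketch below builds a generator input by concatenating a noise vector with a one-hot class label. This is a minimal sketch under assumptions, not the authors' architecture: the helper names `one_hot` and `conditional_generator_input`, the noise dimension, and the number of classes are all hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (n, num_classes) one-hot matrix for integer class labels."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def conditional_generator_input(noise, labels, num_classes):
    """Concatenate each noise vector with its one-hot class label.

    Hypothetical helper: a common way to condition a generator on the label,
    not necessarily the scheme used in the paper.
    """
    return np.concatenate([noise, one_hot(labels, num_classes)], axis=1)

rng = np.random.default_rng(0)
noise = rng.normal(size=(4, 8))      # 4 samples, 8-dimensional noise (assumed sizes)
labels = np.array([0, 2, 1, 2])      # requested class for each synthetic sample
g_in = conditional_generator_input(noise, labels, num_classes=3)
print(g_in.shape)
```

An unconditional GAN would feed only `noise` to the generator, so the label could only appear as one more generated attribute.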

Author(s):  
Khaled ELKarazle ◽  
Valliappan Raman ◽  
Patrick Then

Age estimation models can be employed in many applications, including soft biometrics, content access control, targeted advertising, and many more. However, as some facial images are taken in unconstrained conditions, their quality degrades, which results in the loss of several essential ageing features. This study investigates how introducing a new layer of data processing based on a super-resolution generative adversarial network (SRGAN) model can influence the accuracy of age estimation by enhancing the quality of both the training and testing samples. Additionally, we introduce a novel convolutional neural network (CNN) classifier to distinguish between several age classes. We train one of our classifiers on a reconstructed version of the original dataset and compare its performance with an identical classifier trained on the original version of the same dataset. Our findings reveal that the classifier trained on the reconstructed dataset produces better classification accuracy, opening the door for more research into building data-centric machine learning systems.


Author(s):  
Chaudhary Sarimurrab, Ankita Kesari Naman and Sudha Narang

Generative models have gained considerable attention in the field of unsupervised learning via a new and practical framework called Generative Adversarial Networks (GANs), due to their outstanding data generation capability. Many GAN models have been proposed, and several practical applications have emerged in various domains of computer vision and machine learning. Despite the success of GANs, there are still obstacles to stable training. In this work, we aim to generate human faces from unlabelled data with the help of Deep Convolutional Generative Adversarial Networks. The applications for generating faces are vast in the fields of image processing, entertainment, and other such industries. Our resulting model is successfully able to generate human faces from the given unlabelled data and random noise.


2018 ◽  
Vol 26 (3) ◽  
pp. 228-241 ◽  
Author(s):  
Mrinal Kanti Baowaly ◽  
Chia-Ching Lin ◽  
Chao-Lin Liu ◽  
Kuan-Ta Chen

Abstract Objective: The aim of this study was to generate synthetic electronic health records (EHRs). The generated EHR data will be more realistic than those generated using the existing medical Generative Adversarial Network (medGAN) method. Materials and Methods: We modified medGAN to obtain two synthetic data generation models, designated as medical Wasserstein GAN with gradient penalty (medWGAN) and medical boundary-seeking GAN (medBGAN), and compared the results obtained using the three models. We used 2 databases: MIMIC-III and the National Health Insurance Research Database (NHIRD), Taiwan. First, we trained the models and generated synthetic EHRs by using these 3 models. We then analyzed and compared the models' performance by using a few statistical methods (Kolmogorov–Smirnov test, dimension-wise probability for binary data, and dimension-wise average count for count data) and 2 machine learning tasks (association rule mining and prediction). Results: We conducted a comprehensive analysis and found our models were adequately efficient for generating synthetic EHR data. The proposed models outperformed medGAN in all cases, and among the 3 models, boundary-seeking GAN (medBGAN) performed the best. Discussion: To generate realistic synthetic EHR data, the proposed models will be effective in the medical industry and related research from the viewpoint of providing better services. Moreover, they will eliminate barriers including limited access to EHR data and thus accelerate research on medical informatics. Conclusion: The proposed models can adequately learn the data distribution of real EHRs and efficiently generate realistic synthetic EHRs. The results show the superiority of our models over the existing model.
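The dimension-wise probability check mentioned above can be sketched as follows: for binary data, compute the fraction of records in which each feature is 1, once for the real data and once for the synthetic data, and compare the two vectors. This is a minimal sketch under assumptions; the toy matrices and the function name `dimension_wise_probability` are hypothetical and are not taken from the study.

```python
import numpy as np

def dimension_wise_probability(binary_matrix):
    """Per-feature probability of a 1 in a (records x features) binary matrix."""
    return np.asarray(binary_matrix, dtype=float).mean(axis=0)

# Toy data (hypothetical): 4 records, 3 binary features each.
real = np.array([[1, 0, 1],
                 [1, 1, 0],
                 [0, 0, 1],
                 [1, 0, 1]])
synthetic = np.array([[1, 0, 1],
                      [1, 0, 0],
                      [0, 1, 1],
                      [0, 0, 1]])

p_real = dimension_wise_probability(real)
p_syn = dimension_wise_probability(synthetic)

# Largest per-feature disagreement; smaller values mean the synthetic
# marginals track the real ones more closely.
max_gap = np.abs(p_real - p_syn).max()
print(p_real, p_syn, max_gap)
```

For count data, the same function applied to a matrix of counts yields the dimension-wise average count.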


2020 ◽  
Author(s):  
Tarik Alafif

Generative Adversarial Networks (GANs) have achieved breakthroughs and great success in many research areas in computer vision. Different GANs generate different outputs. In this research work, we apply different GANs to generate handwritten Arabic characters. A basic GAN, Vanilla GAN, Deep Convolutional GAN (DCGAN), Bidirectional GAN (BiGAN), and Wasserstein GAN (WGAN) are used. Then, the generated images are evaluated by native Arabic speakers and with the Fréchet Inception Distance (FID). Qualitative and quantitative results are provided for the image generation and evaluation. In the experimental evaluation, WGAN achieves better results in FID with a value of 96.007. On the other hand, DCGAN achieves better results in the native Arabic human evaluation with a value of 35%.
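For readers unfamiliar with FID, the score is the Fréchet distance between two Gaussians fitted to feature vectors of real and generated images: the squared distance between the means plus Tr(C1 + C2 - 2 (C1 C2)^(1/2)) for the covariances C1 and C2. The sketch below computes that distance for arbitrary feature arrays. It is an illustration under assumptions: real FID uses Inception network activations as features, whereas the random arrays and the function name `frechet_distance` here are hypothetical stand-ins.

```python
import numpy as np

def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to the rows of x and y."""
    mu_x, mu_y = x.mean(axis=0), y.mean(axis=0)
    cov_x = np.cov(x, rowvar=False)
    cov_y = np.cov(y, rowvar=False)
    # Tr((C1 C2)^(1/2)) equals the sum of square roots of the eigenvalues
    # of C1 C2, which are real and non-negative for covariance matrices.
    eigvals = np.linalg.eigvals(cov_x @ cov_y)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = ((mu_x - mu_y) ** 2).sum()
    return float(mean_term + np.trace(cov_x) + np.trace(cov_y) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
features_real = rng.normal(size=(200, 4))   # stand-in for Inception features
features_shifted = features_real + 3.0      # same covariance, mean shifted by 3

d_same = frechet_distance(features_real, features_real)
d_shift = frechet_distance(features_real, features_shifted)
print(d_same, d_shift)
```

Lower values indicate that the generated feature distribution is closer to the real one, which is why the lowest FID is reported as the best result.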


Actuators ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 86
Author(s):  
Jie Li ◽  
Boyu Zhao ◽  
Kai Wu ◽  
Zhicheng Dong ◽  
Xuerui Zhang ◽  
...  

Gear reliability assessment of vehicle transmissions has been a challenging issue in determining vehicle safety in the transmission industry, due to a significant amount of classification errors with high-coupling gear parameters and insufficient high-density data. In terms of the preprocessing of gear reliability assessment, this paper presents a representation generation approach based on generative adversarial networks (GAN) to advance the performance of reliability evaluation as a classification problem. First, with no need for complex modeling and massive calculations, a conditional generative adversarial net (CGAN) based model is established to generate gear representations by discovering the inherent mapping between features with gear parameters and gear reliability. Instead of producing intact samples like other GAN techniques, the CGAN based model is designed to learn features of gear data. In this model, to raise the diversity of produced features, a mini-batch strategy of randomly sampling from the combination of raw and generated representations is used in the discriminator, instead of using all of the data features. Second, in order to overcome the lack of labels on the representations created by the CGAN, a Wasserstein labeling (WL) scheme is proposed to tag the created representations from our model for classification. Lastly, original and produced representations are fused to train classifiers. Experiments on real-world gear data from the industry indicate that the proposed approach outperforms other techniques on operational metrics.
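The mini-batch strategy described above, drawing a random subset from the pooled raw and generated representations rather than passing every feature vector to the discriminator, might look roughly like the sketch below. The array sizes, the batch size, and the function name `sample_mixed_minibatch` are assumptions made for illustration; the paper's actual sampling details are not given in the abstract.

```python
import numpy as np

def sample_mixed_minibatch(raw, generated, batch_size, rng):
    """Randomly sample one mini-batch from the pooled raw and generated rows.

    Returns the sampled rows and a boolean flag per row that is True when
    the row came from the generated set.
    """
    pool = np.concatenate([raw, generated], axis=0)
    is_generated = np.concatenate([
        np.zeros(len(raw), dtype=bool),
        np.ones(len(generated), dtype=bool),
    ])
    idx = rng.choice(len(pool), size=batch_size, replace=False)
    return pool[idx], is_generated[idx]

rng = np.random.default_rng(0)
raw = rng.normal(size=(50, 6))         # hypothetical raw gear representations
generated = rng.normal(size=(30, 6))   # hypothetical generator outputs
batch, flags = sample_mixed_minibatch(raw, generated, batch_size=16, rng=rng)
print(batch.shape, int(flags.sum()))
```

Because each call draws a fresh random subset, the discriminator sees a different mix of raw and generated representations at every step.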


Symmetry ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 734 ◽  
Author(s):  
Yan Ma ◽  
Kang Liu ◽  
Zhibin Guan ◽  
Xinkai Xu ◽  
Xu Qian ◽  
...  

Augmented Reality (AR) is crucial for immersive Human–Computer Interaction (HCI) and the vision of Artificial Intelligence (AI). Labeled data drives object recognition in AR. However, manually annotating data is expensive and labor-intensive, and the resulting data distribution is asymmetric. Scarce labeled data limits the application of AR. To solve the problem of insufficient and asymmetric training data in AR object recognition, an automated vision data synthesis method, i.e., background augmentation generative adversarial networks (BAGANs), is proposed in this paper based on 3D modeling and the Generative Adversarial Network (GAN) algorithm. Our approach has been validated to have better performance than other methods through image recognition tasks with respect to the natural image database ObjectNet3D. This study can shorten the algorithm development time of AR and expand its application scope, which is of great significance for immersive interactive systems.


Author(s):  
Derek Reiman ◽  
Yang Dai

Abstract The microbiome of the human body has been shown to have profound effects on physiological regulation and disease pathogenesis. However, association analysis based on statistical modeling of microbiome data has continued to be a challenge due to inherent noise, the complexity of the data, and the high cost of collecting a large number of samples. To address this challenge, we employed a deep learning framework to construct a data-driven simulation of microbiome data using a conditional generative adversarial network. Conditional generative adversarial networks train two models against each other while leveraging side information to learn from a given dataset and compute larger simulated datasets that are representative of the original dataset. In our study, we used a cohort of patients with inflammatory bowel disease to show that not only can the generative adversarial network generate samples representative of the original data based on multiple diversity metrics, but also that training machine learning models on the synthetic samples can improve disease prediction through data augmentation. In addition, we show that the synthetic samples generated from this cohort can boost disease prediction in a different external cohort.


2017 ◽  
Author(s):  
Benjamin Sanchez-Lengeling ◽  
Carlos Outeiral ◽  
Gabriel L. Guimaraes ◽  
Alan Aspuru-Guzik

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches with a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.


2021 ◽  
Vol 11 (15) ◽  
pp. 7034
Author(s):  
Hee-Deok Yang

Artificial intelligence technologies and vision systems are used in various devices, such as automotive navigation systems, object-tracking systems, and intelligent closed-circuit televisions. In particular, outdoor vision systems have been applied across numerous fields of analysis. Despite their widespread use, current systems work well only under good weather conditions. They cannot account for inclement conditions, such as rain, fog, mist, and snow. Images captured under inclement conditions degrade the performance of vision systems. Vision systems need to detect, recognize, and remove noise caused by rain, snow, and mist to boost the performance of the algorithms employed in image processing. Several studies have targeted the removal of noise resulting from inclement conditions. We focused on eliminating the effects of raindrops on images captured with outdoor vision systems in which the camera was exposed to rain. An attentive generative adversarial network (ATTGAN) was used to remove raindrops from the images. This network was composed of two parts: an attentive-recurrent network and a contextual autoencoder. The ATTGAN generated an attention map to detect rain droplets. A de-rained image was generated by increasing the number of attentive-recurrent network layers. We increased the number of visual attentive-recurrent network layers in order to prevent gradient sparsity so that the entire generation was more stable against the network without preventing the network from converging. The experimental results confirmed that the extended ATTGAN could effectively remove various types of raindrops from images.


2020 ◽  
pp. 1-13
Author(s):  
Yundong Li ◽  
Yi Liu ◽  
Han Dong ◽  
Wei Hu ◽  
Chen Lin

The intrusion detection of railway clearance is crucial for avoiding railway accidents caused by the invasion of abnormal objects, such as pedestrians, falling rocks, and animals. However, detecting intrusions using deep learning methods from infrared images captured at night remains a challenging task because of the lack of sufficient training samples. To address this issue, a transfer strategy that migrates daytime RGB images to the nighttime style of infrared images is proposed in this study. The proposed method consists of two stages. In the first stage, a data generation model is trained on the basis of generative adversarial networks using RGB images and a small number of infrared images, and then, synthetic samples are generated using a well-trained model. In the second stage, a single shot multibox detector (SSD) model is trained using synthetic data and utilized to detect abnormal objects from infrared images at nighttime. To validate the effectiveness of the proposed method, two groups of experiments, namely, railway and non-railway scenes, are conducted. Experimental results demonstrate the effectiveness of the proposed method, and an improvement of 17.8% is achieved for object detection at nighttime.

