scholarly journals Information-Based Boundary Equilibrium Generative Adversarial Networks with Interpretable Representation Learning

2018 ◽  
Vol 2018 ◽  
pp. 1-14 ◽  
Author(s):  
Junghoon Hah ◽  
Woojin Lee ◽  
Jaewook Lee ◽  
Saerom Park

This paper describes a new image generation algorithm based on generative adversarial network. With an information-theoretic extension to the autoencoder-based discriminator, this new algorithm is able to learn interpretable representations from the input images. Our model not only adversarially minimizes the Wasserstein distance-based losses of the discriminator and generator but also maximizes the mutual information between small subset of the latent variables and the observation. We also train our model with proportional control theory to keep the equilibrium between the discriminator and the generator balanced, and as a result, our generative adversarial network can mitigate the convergence problem. Through the experiments on real images, we validate our proposed method, which can manipulate the generated images as desired by controlling the latent codes of input variables. In addition, the visual qualities of produced images are effectively maintained, and the model can stably converge to the equilibrium. However, our model has a difficulty in learning disentangling factors because our model does not regularize the independence between the interpretable factors. Therefore, in the future, we will develop a generative model that can learn disentangling factors.

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Makoto Naruse ◽  
Takashi Matsubara ◽  
Nicolas Chauvet ◽  
Kazutaka Kanno ◽  
Tianyu Yang ◽  
...  

Abstract Generative adversarial networks (GANs) are becoming increasingly important in the artificial construction of natural images and related functionalities, wherein two types of networks called generators and discriminators evolve through adversarial mechanisms. Using deep convolutional neural networks and related techniques, high-resolution and highly realistic scenes, human faces, etc. have been generated. GANs generally require large amounts of genuine training data sets, as well as vast amounts of pseudorandom numbers. In this study, we utilized chaotic time series generated experimentally by semiconductor lasers for the latent variables of a GAN, whereby the inherent nature of chaos could be reflected or transformed into the generated output data. We show that the similarity in proximity, which describes the robustness of the generated images with respect to minute changes in the input latent variables, is enhanced, while the versatility overall is not severely degraded. Furthermore, we demonstrate that the surrogate chaos time series eliminates the signature of the generated images that is originally observed corresponding to the negative autocorrelation inherent in the chaos sequence. We also address the effects of utilizing chaotic time series to retrieve images from the trained generator.


Electronics ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 688
Author(s):  
Sung-Wook Park ◽  
Jun-Ho Huh ◽  
Jong-Chan Kim

In the field of deep learning, the generative model did not attract much attention until GANs (generative adversarial networks) appeared. In 2014, Google’s Ian Goodfellow proposed a generative model called GANs. GANs use different structures and objective functions from the existing generative model. For example, GANs use two neural networks: a generator that creates a realistic image, and a discriminator that distinguishes whether the input is real or synthetic. If there are no problems in the training process, GANs can generate images that are difficult even for experts to distinguish in terms of authenticity. Currently, GANs are the most researched subject in the field of computer vision, which deals with the technology of image style translation, synthesis, and generation, and various models have been unveiled. The issues raised are also improving one by one. In image synthesis, BEGAN (Boundary Equilibrium Generative Adversarial Network), which outperforms the previously announced GANs, learns the latent space of the image, while balancing the generator and discriminator. Nonetheless, BEGAN also has a mode collapse wherein the generator generates only a few images or a single one. Although BEGAN-CS (Boundary Equilibrium Generative Adversarial Network with Constrained Space), which was improved in terms of loss function, was introduced, it did not solve the mode collapse. The discriminator structure of BEGAN-CS is AE (AutoEncoder), which cannot create a particularly useful or structured latent space. Compression performance is not good either. In this paper, this characteristic of AE is considered to be related to the occurrence of mode collapse. Thus, we used VAE (Variational AutoEncoder), which added statistical techniques to AE. As a result of the experiment, the proposed model did not cause mode collapse but converged to a better state than BEGAN-CS.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kazuma Kokomoto ◽  
Rena Okawa ◽  
Kazuhiko Nakano ◽  
Kazunori Nozaki

AbstractDentists need experience with clinical cases to practice specialized skills. However, the need to protect patient's private information limits their ability to utilize intraoral images obtained from clinical cases. In this study, since generating realistic images could make it possible to utilize intraoral images, progressive growing of generative adversarial networks are used to generate intraoral images. A total of 35,254 intraoral images were used as training data with resolutions of 128 × 128, 256 × 256, 512 × 512, and 1024 × 1024. The results of the training datasets with and without data augmentation were compared. The Sliced Wasserstein Distance was calculated to evaluate the generated images. Next, 50 real images and 50 generated images for each resolution were randomly selected and shuffled. 12 pediatric dentists were asked to observe these images and assess whether they were real or generated. The d prime of the 1024 × 1024 images was significantly higher than that of the other resolutions. In conclusion, generated intraoral images with resolutions of 512 × 512 or lower were so realistic that the dentists could not distinguish whether they were real or generated. This implies that the generated images can be used in dental education or data augmentation for deep learning, without privacy restrictions.


Author(s):  
Sirajul Salekin ◽  
Milad Mostavi ◽  
Yu-Chiao Chiu ◽  
Yidong Chen ◽  
Jianqiu (Michelle) Zhang ◽  
...  

ABSTRACTEpitranscriptome is an exciting area that studies different types of modifications in transcripts and the prediction of such modification sites from the transcript sequence is of significant interest. However, the scarcity of positive sites for most modifications imposes critical challenges for training robust algorithms. To circumvent this problem, we propose MR-GAN, a generative adversarial network (GAN) based model, which is trained in an unsupervised fashion on the entire pre-mRNA sequences to learn a low dimensional embedding of transcriptomic sequences. MR-GAN was then applied to extract embeddings of the sequences in a training dataset we created for eight epitranscriptome modifications, including m6A, m1A, m1G, m2G, m5C, m5U, 2′-O-Me, Pseudouridine (Ψ) and Dihydrouridine (D), of which the positive samples are very limited. Prediction models were trained based on the embeddings extracted by MR-GAN. We compared the prediction performance with the one-hot encoding of the training sequences and SRAMP, a state-of-the-art m6A site prediction algorithm and demonstrated that the learned embeddings outperform one-hot encoding by a significant margin for up to 15% improvement. Using MR-GAN, we also investigated the sequence motifs for each modification type and uncovered known motifs as well as new motifs not possible with sequences directly. The results demonstrated that transcriptome features extracted using unsupervised learning could lead to high precision for predicting multiple types of epitranscriptome modifications, even when the data size is small and extremely imbalanced.


2021 ◽  
Author(s):  
SHANTANU GHOSH ◽  
Zheng Feng ◽  
Jiang Bian ◽  
Kevin Butler ◽  
Mattia Prosperi

Abstract Determining causal effects of interventions onto outcomes from observational (non-randomized) data, e.g., treatment repurposing using electronic health records, is challenging due to underlying bias. Causal deep learning has improved over traditional techniques for estimating individualized treatment effects. We present the Doubly Robust Variational Information-theoretic Deep Adversarial Learning (DR-VIDAL), a novel generative framework that combines two joint models of treatment and outcome, ensuring an unbiased estimation even when one of the two is misspecified. DR-VIDAL uses a variational autoencoder (VAE) to factorize confounders into latent variables according to causal assumptions; then, an information-theoretic generative adversarial network (Info-GAN) is used to generate counterfactuals; finally, a doubly robust block incorporates propensity matching/weighting into predictions. On synthetic and real-world datasets, DR-VIDAL achieves better performance than other non-generative and generative methods. In conclusion, DR-VIDAL uniquely fuses causal assumptions, VAE, Info-GAN, and doubly robustness into a comprehensive, performant framework. Code is available at: https://bitbucket.org/goingdeep2406/dr-vidal/src/master/


Electronics ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 220
Author(s):  
Chunxue Wu ◽  
Haiyan Du ◽  
Qunhui Wu ◽  
Sheng Zhang

In the automatic sorting process of express delivery, a three-segment code is used to represent a specific area assigned by a specific delivery person. In the process of obtaining the courier order information, the camera is affected by factors such as light, noise, and subject shake, which will cause the information on the courier order to be blurred, and some information will be lost. Therefore, this paper proposes an image text deblurring method based on a generative adversarial network. The model of the algorithm consists of two generative adversarial networks, combined with Wasserstein distance, using a combination of adversarial loss and perceptual loss on unpaired datasets to train the network model to restore the captured blurred images into clear and natural image. Compared with the traditional method, the advantage of this method is that the loss function between the input and output images can be calculated indirectly through the positive and negative generative adversarial networks. The Wasserstein distance can achieve a more stable training process and a more realistic generation effect. The constraints of adversarial loss and perceptual loss make the model capable of training on unpaired datasets. The experimental results on the GOPRO test dataset and the self-built unpaired dataset showed that the two indicators, peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), increased by 13.3% and 3%, respectively. The human perception test results demonstrated that the algorithm proposed in this paper was better than the traditional blur algorithm as the deblurring effect was better.


2017 ◽  
Author(s):  
Benjamin Sanchez-Lengeling ◽  
Carlos Outeiral ◽  
Gabriel L. Guimaraes ◽  
Alan Aspuru-Guzik

Molecular discovery seeks to generate chemical species tailored to very specific needs. In this paper, we present ORGANIC, a framework based on Objective-Reinforced Generative Adversarial Networks (ORGAN), capable of producing a distribution over molecular space that matches with a certain set of desirable metrics. This methodology combines two successful techniques from the machine learning community: a Generative Adversarial Network (GAN), to create non-repetitive sensible molecular species, and Reinforcement Learning (RL), to bias this generative distribution towards certain attributes. We explore several applications, from optimization of random physicochemical properties to candidates for drug discovery and organic photovoltaic material design.


2021 ◽  
Vol 11 (15) ◽  
pp. 7034
Author(s):  
Hee-Deok Yang

Artificial intelligence technologies and vision systems are used in various devices, such as automotive navigation systems, object-tracking systems, and intelligent closed-circuit televisions. In particular, outdoor vision systems have been applied across numerous fields of analysis. Despite their widespread use, current systems work well under good weather conditions. They cannot account for inclement conditions, such as rain, fog, mist, and snow. Images captured under inclement conditions degrade the performance of vision systems. Vision systems need to detect, recognize, and remove noise because of rain, snow, and mist to boost the performance of the algorithms employed in image processing. Several studies have targeted the removal of noise resulting from inclement conditions. We focused on eliminating the effects of raindrops on images captured with outdoor vision systems in which the camera was exposed to rain. An attentive generative adversarial network (ATTGAN) was used to remove raindrops from the images. This network was composed of two parts: an attentive-recurrent network and a contextual autoencoder. The ATTGAN generated an attention map to detect rain droplets. A de-rained image was generated by increasing the number of attentive-recurrent network layers. We increased the number of visual attentive-recurrent network layers in order to prevent gradient sparsity so that the entire generation was more stable against the network without preventing the network from converging. The experimental results confirmed that the extended ATTGAN could effectively remove various types of raindrops from images.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yingxi Yang ◽  
Hui Wang ◽  
Wen Li ◽  
Xiaobo Wang ◽  
Shizhao Wei ◽  
...  

Abstract Background Protein post-translational modification (PTM) is a key issue to investigate the mechanism of protein’s function. With the rapid development of proteomics technology, a large amount of protein sequence data has been generated, which highlights the importance of the in-depth study and analysis of PTMs in proteins. Method We proposed a new multi-classification machine learning pipeline MultiLyGAN to identity seven types of lysine modified sites. Using eight different sequential and five structural construction methods, 1497 valid features were remained after the filtering by Pearson correlation coefficient. To solve the data imbalance problem, Conditional Generative Adversarial Network (CGAN) and Conditional Wasserstein Generative Adversarial Network (CWGAN), two influential deep generative methods were leveraged and compared to generate new samples for the types with fewer samples. Finally, random forest algorithm was utilized to predict seven categories. Results In the tenfold cross-validation, accuracy (Acc) and Matthews correlation coefficient (MCC) were 0.8589 and 0.8376, respectively. In the independent test, Acc and MCC were 0.8549 and 0.8330, respectively. The results indicated that CWGAN better solved the existing data imbalance and stabilized the training error. Alternatively, an accumulated feature importance analysis reported that CKSAAP, PWM and structural features were the three most important feature-encoding schemes. MultiLyGAN can be found at https://github.com/Lab-Xu/MultiLyGAN. Conclusions The CWGAN greatly improved the predictive performance in all experiments. Features derived from CKSAAP, PWM and structure schemes are the most informative and had the greatest contribution to the prediction of PTM.


Electronics ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 1349
Author(s):  
Stefan Lattner ◽  
Javier Nistal

Lossy audio codecs compress (and decompress) digital audio streams by removing information that tends to be inaudible in human perception. Under high compression rates, such codecs may introduce a variety of impairments in the audio signal. Many works have tackled the problem of audio enhancement and compression artifact removal using deep-learning techniques. However, only a few works tackle the restoration of heavily compressed audio signals in the musical domain. In such a scenario, there is no unique solution for the restoration of the original signal. Therefore, in this study, we test a stochastic generator of a Generative Adversarial Network (GAN) architecture for this task. Such a stochastic generator, conditioned on highly compressed musical audio signals, could one day generate outputs indistinguishable from high-quality releases. Therefore, the present study may yield insights into more efficient musical data storage and transmission. We train stochastic and deterministic generators on MP3-compressed audio signals with 16, 32, and 64 kbit/s. We perform an extensive evaluation of the different experiments utilizing objective metrics and listening tests. We find that the models can improve the quality of the audio signals over the MP3 versions for 16 and 32 kbit/s and that the stochastic generators are capable of generating outputs that are closer to the original signals than those of the deterministic generators.


Sign in / Sign up

Export Citation Format

Share Document